
Re: [Linux-cluster] fence_apc unknown screen encountered



Hi,

Can you test our new fence agent for APC? You can find it in the git repository (also available on the web at http://git.fedorahosted.org/git/cluster.git) in the RHEL5 branch (it is also in master, stable2, ...). Just download the pexpect package, fence/agents/apc/fence_apc.py and fence/agents/lib/fencing.py.py (rename it to fencing.py), then try running it from the command line (./fence_apc.py -h). Please let me know the results :)

marx,

Matt Harrington wrote:
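I can pinpoint the problem with verbose logging. For some reason, fence_apc sends the outlet selection twice. For an outlet number > 2 this is harmless, because the repeated number is not a valid option on the outlet screen and the screen is simply redrawn. For an outlet number <= 2, the repeated number selects one of that screen's own options (1- Control Outlet or 2- Configure Outlet) and the script horks. Here is the output of a working call to illustrate the duplicate selection; "13" is entered twice: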

^M------- Outlet Control/Configuration ------------------------------------------

    1- Outlet 1                 ON
    2- build                    ON
    3- www103                   ON
    4- www102                   ON
    5- Outlet 5                 ON
    6- Outlet 6                 ON
    7- Outlet 7                 ON
    8- fs102                    ON
    9- build                    ON
   10- app102                   ON
   11- Outlet 11                ON
   12- db103                    ON
   13- fs103                    ON
   14- Outlet 14                ON
   15- Outlet 15                ON
   16- Outlet 16                ON
   17- Master Control/Configuration

    <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log
> 13

^M------- fs103 -----------------------------------------------------------------

       Name         : fs103
       Outlet       : 13
       State        : ON

    1- Control Outlet       2- Configure Outlet
    ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log
> 13

^M------- fs103 -----------------------------------------------------------------

       Name         : fs103
       Outlet       : 13
       State        : ON

    1- Control Outlet       2- Configure Outlet
    ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log
> 1
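
The duplicate selection can be illustrated with a small model of the menus in the log above. This is only a sketch under simplifying assumptions: the screen names and the `simulate` function are hypothetical and are not taken from fence_apc. It models just the outlet list and the per-outlet screen with its two options.

```python
# Hypothetical, simplified model of the APC menu screens shown in the log.
# It is not the fence_apc code; it only illustrates the duplicate-selection issue.

def simulate(outlet, keystrokes):
    """Feed keystrokes to the modelled menus, starting at the outlet list.

    Returns (screen, leftover): the screen reached and any keystrokes that
    were still pending once a submenu of the outlet screen had been entered.
    """
    screen = "outlet_list"
    for i, key in enumerate(keystrokes):
        if screen == "outlet_list":
            # Entering the outlet number opens that outlet's screen.
            if key == str(outlet):
                screen = "outlet_menu"
        elif screen == "outlet_menu":
            # Only "1" and "2" are options here; anything else redraws the screen.
            if key == "1":
                screen = "control_outlet"
            elif key == "2":
                screen = "configure_outlet"
        else:
            # Already inside a submenu: the remaining keys land on the wrong screen.
            return screen, list(keystrokes[i:])
    return screen, []


# Outlet 13: the repeated "13" is ignored and "1" reaches Control Outlet.
print(simulate(13, ["13", "13", "1"]))
# Outlet 2: the repeated "2" opens Configure Outlet before "1" is sent.
print(simulate(2, ["2", "2", "1"]))
```

In this model the outlet 2 case ends on the Configure Outlet screen with one keystroke left over, which matches the screen printed in the "unknown screen encountered" traceback.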


Matt Harrington wrote:
I am encountering an unknown screen exception from fence_apc when trying to fence a system in a 3-node cluster (centos5.2 cman-2.0.84-2.el5). Interestingly, I can fence the other two nodes in my cluster. I believe the difference is that the problem node has two power supplies, which means that fence_apc is called with off/on instead of restart. This also requires connecting to two different PDUs. It could also be that there is something wrong with the config, which was taken from an older system and updated with luci. I am unable to discern any differences between the menus of the two PDUs.



[root fs102 ~]# /sbin/fence_node fs103
agent "fence_apc" reports: Traceback (most recent call last):
 File "/sbin/fence_apc", line 829, in ?
   main()
 File "/sbin/fence_apc", line 303, in main
   do_power_off(sock)
 File "/sbin/fence_apc", line 813, in do_power_off
   x = do_power_switch(sock, "off")
 File "/sbi
agent "fence_apc" reports: n/fence_apc", line 611, in do_power_switch
   result_code, response = power_off(txt + ndbuf)
 File "/sbin/fence_apc", line 817, in power_off
   x = power_switch(buffer, False, "2", "3");
 File "/sbin/fence_apc", line 810, in power_switch
   raise "un
agent "fence_apc" reports: known screen encountered in \n" + str(lines) + "\n"
unknown screen encountered in
['', '> 2', '', '', '------- Configure Outlet ------------------------------------------------------', '', ' # State Ph Name Pwr On Dly Pwr Off D agent "fence_apc" reports: ly Reboot Dur.', ' ----------------------------------------------------------------------------', ' 2 ON 1 fs103 0 sec 0 sec 5 sec', '', ' 1- Outlet Name : fs103', ' 2- Power On Delay(sec) : 0', agent "fence_apc" reports: ' 3- Power Off Delay(sec): 0', ' 4- Reboot Duration(sec): 5', ' 5- Accept Changes : ', '', ' ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log']


[root fs102 ~]# /sbin/fence_apc -a 10.10.1.200 -l pdu -p pdu -n 13 -o status
Status check successful. Port 13 is OFF
[root fs102 ~]# /sbin/fence_apc -a 10.10.1.201 -l pdu -p pdu -n 2 -o status
Status check successful. Port 2 is ON
[root fs102 ~]# /sbin/fence_apc -a 10.10.1.201 -l pdu -p pdu -n 2 -o off
Traceback (most recent call last):
 File "/sbin/fence_apc", line 829, in ?
   main()
 File "/sbin/fence_apc", line 303, in main
   do_power_off(sock)
 File "/sbin/fence_apc", line 813, in do_power_off
   x = do_power_switch(sock, "off")
 File "/sbin/fence_apc", line 611, in do_power_switch
   result_code, response = power_off(txt + ndbuf)
 File "/sbin/fence_apc", line 817, in power_off
   x = power_switch(buffer, False, "2", "3");
 File "/sbin/fence_apc", line 810, in power_switch
   raise "unknown screen encountered in \n" + str(lines) + "\n"
unknown screen encountered in
['2', '', '', '------- Configure Outlet ------------------------------------------------------', '', ' # State Ph Name Pwr On Dly Pwr Off Dly Reboot Dur.', ' ----------------------------------------------------------------------------', ' 2 ON 1 fs103 0 sec 0 sec 5 sec', '', ' 1- Outlet Name : fs103', ' 2- Power On Delay(sec) : 0', ' 3- Power Off Delay(sec): 0', ' 4- Reboot Duration(sec): 5', ' 5- Accept Changes : ', '', ' ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log']




<cluster config_version="143" name="gfs_cluster">
<fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
   <clusternodes>
       <clusternode name="fs101" nodeid="1" votes="1">
           <fence>
               <method name="1">
                   <device name="pdu102.eons.dev" port="12"/>
               </method>
           </fence>
       </clusternode>
       <clusternode name="fs102" nodeid="2" votes="1">
           <fence>
               <method name="1">
                   <device name="pdu101.eons.dev" port="8"/>
               </method>
           </fence>
       </clusternode>
       <clusternode name="fs103" nodeid="3" votes="1">
           <fence>
               <method name="1">
                   <device name="pdu101.eons.dev" option="off" port="13"/>
                   <device name="pdu102.eons.dev" option="off" port="2"/>
                   <device name="pdu101.eons.dev" option="on" port="13"/>
                   <device name="pdu102.eons.dev" option="on" port="2"/>
               </method>
           </fence>
       </clusternode>
   </clusternodes>
       <fencedevices>
           <fencedevice agent="fence_apc" ipaddr="10.10.1.200" login="pdu" name="pdu101.eons.dev" passwd="pdu"/>
           <fencedevice agent="fence_apc" ipaddr="10.10.1.201" login="pdu" name="pdu102.eons.dev" passwd="pdu"/>
       </fencedevices>
...
</cluster>
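
The fs103 method in the config above switches both outlets off before switching either back on. As a rough sketch of that ordering, the helper below builds the equivalent manual command lines, using only the fence_apc options that appear in this thread (-a, -l, -p, -n, -o). The function name `dual_psu_fence_commands` is hypothetical, and this is not meant to describe how the fence daemon itself invokes the agent.

```python
# Sketch: build manual fence_apc command lines for a node with two power supplies.
# The helper name is hypothetical; the options mirror the commands in this thread.

def dual_psu_fence_commands(devices, login, password):
    """devices is a list of (ipaddr, port) pairs, one per power supply.

    Returns argv lists ordered so that every outlet is switched off
    before any outlet is switched back on.
    """
    commands = []
    for action in ("off", "on"):
        for ipaddr, port in devices:
            commands.append([
                "/sbin/fence_apc",
                "-a", ipaddr,
                "-l", login,
                "-p", password,
                "-n", str(port),
                "-o", action,
            ])
    return commands


# The two outlets that feed fs103 in the config above.
for argv in dual_psu_fence_commands(
        [("10.10.1.200", 13), ("10.10.1.201", 2)], "pdu", "pdu"):
    print(" ".join(argv))
```

Running each printed command by hand, in order, is one way to see which of the four steps hits the unknown screen.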




[root fs102 ~]# cat /etc/redhat-release
CentOS release 5.2 (Final)
[root fs102 ~]# rpm -qf /sbin/fence_apc
cman-2.0.84-2.el5
[root fs102 ~]# rpm -q luci
luci-0.12.0-7.el5.centos.3


pdu101:
American Power Conversion Network Management Card AOS v3.5.9 (c) Copyright 2008 All Rights Reserved Rack PDU APP v3.5.8

pdu102:
American Power Conversion Network Management Card AOS v3.5.9 (c) Copyright 2008 All Rights Reserved Rack PDU APP v3.5.8

--
Linux-cluster mailing list
Linux-cluster redhat com
https://www.redhat.com/mailman/listinfo/linux-cluster



--
Marek Grac
Red Hat Czech s.r.o.

