[Linux-cluster] fence_apc unknown screen encountered

Marek 'marx' Grac mgrac at redhat.com
Wed Aug 27 13:22:57 UTC 2008


Hi,

Can you test our new fence agent for APC? You can find it in the git 
repository (also available on the web at 
http://git.fedorahosted.org/git/cluster.git) in the RHEL5 branch (also in 
master, stable2, ...). Just install the pexpect package, download 
fence/agents/apc/fence_apc.py and fence/agents/lib/fencing.py.py (rename 
it to fencing.py), and try running it from the command line 
(./fence_apc.py -h). Please let me know the results :)

marx,

Matt Harrington wrote:
> I can pinpoint the problem with verbose logging.  For some reason, 
> fence_apc repeats the outlet selection menu option.  On an outlet 
> number > 2, this is harmless, but in the case where the outlet <= 2, 
> the script horks.  Here is the output on a working call to illustrate 
> the duplicate selection; "13" is entered twice:
>
> ^M------- Outlet Control/Configuration ------------------------------------------
>
>     1- Outlet 1                 ON
>     2- build                    ON
>     3- www103                   ON
>     4- www102                   ON
>     5- Outlet 5                 ON
>     6- Outlet 6                 ON
>     7- Outlet 7                 ON
>     8- fs102                    ON
>     9- build                    ON
>    10- app102                   ON
>    11- Outlet 11                ON
>    12- db103                    ON
>    13- fs103                    ON
>    14- Outlet 14                ON
>    15- Outlet 15                ON
>    16- Outlet 16                ON
>    17- Master Control/Configuration
>
>     <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log
> > 13
>
> ^M------- fs103 -----------------------------------------------------------------
>
>        Name         : fs103
>        Outlet       : 13
>        State        : ON
>
>     1- Control Outlet       2- Configure Outlet
>     ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log
> > 13
>
> ^M------- fs103 -----------------------------------------------------------------
>
>        Name         : fs103
>        Outlet       : 13
>        State        : ON
>
>     1- Control Outlet       2- Configure Outlet
>     ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log
> > 1
>
>
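
The duplicate selection described above can be reproduced with a toy model of the menus. This is only a sketch based on the screens in the log, not the agent's real code: the function name navigate and the screen labels are made up for illustration, and it assumes that an entry the outlet menu does not recognise simply redraws that menu (which is what the log shows for "13").

```python
def navigate(keystrokes, outlet_count=16):
    """Return the (hypothetical) screen label reached after typing each entry."""
    screen = "outlet list"
    for key in keystrokes:
        if screen == "outlet list":
            # A valid outlet number opens that outlet's menu.
            if key.isdigit() and 1 <= int(key) <= outlet_count:
                screen = "outlet menu"
        elif screen == "outlet menu":
            if key == "1":
                screen = "control outlet"
            elif key == "2":
                screen = "configure outlet"
            # Any other entry just redraws the outlet menu.
        else:
            break  # deeper screens are not modelled here
    return screen


# The agent sends the outlet number twice before it sends "1".
for outlet in ("13", "2", "1"):
    print(outlet, "->", navigate([outlet, outlet]))
```

With outlet 13 the repeated entry leaves the session on the outlet menu, so the following "1" still reaches Control Outlet. With outlet 2 the repeated entry opens Configure Outlet, which matches the screen shown in the traceback below.
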
> Matt Harrington wrote:
>> I am encountering an unknown screen exception from fence_apc when 
>> trying to fence a system in a 3-node cluster (CentOS 5.2, 
>> cman-2.0.84-2.el5).  Interestingly, I can fence the other two nodes 
>> in my cluster.  I believe the difference is that the problem node has 
>> two power supplies, which means that fence_apc is called with off/on 
>> instead of restart.  This also requires connecting to two different 
>> PDUs.  It could also be that there is something wrong with the 
>> config, which was taken from an older system and updated with luci.  
>> I am unable to discern any differences between the menus of the two 
>> PDUs.
>>
>>
>>
>> [root@fs102 ~]# /sbin/fence_node fs103
>> agent "fence_apc" reports: Traceback (most recent call last):
>>  File "/sbin/fence_apc", line 829, in ?
>>    main()
>>  File "/sbin/fence_apc", line 303, in main
>>    do_power_off(sock)
>>  File "/sbin/fence_apc", line 813, in do_power_off
>>    x = do_power_switch(sock, "off")
>>  File "/sbi
>> agent "fence_apc" reports: n/fence_apc", line 611, in do_power_switch
>>    result_code, response = power_off(txt + ndbuf)
>>  File "/sbin/fence_apc", line 817, in power_off
>>    x = power_switch(buffer, False, "2", "3");
>>  File "/sbin/fence_apc", line 810, in power_switch
>>    raise "un
>> agent "fence_apc" reports: known screen encountered in \n" + 
>> str(lines) + "\n"
>> unknown screen encountered in
>> ['', '> 2', '', '', '------- Configure Outlet 
>> ------------------------------------------------------', '', '    #  
>> State  Ph  Name                     Pwr On Dly  Pwr Off D
>> agent "fence_apc" reports: ly  Reboot Dur.', '   
>> ----------------------------------------------------------------------------', 
>> '    2  ON     1   fs103                    0 sec       0 sec        
>> 5 sec', '', '     1- Outlet Name         : fs103', '     2- Power On 
>> Delay(sec) : 0',
>> agent "fence_apc" reports:  '     3- Power Off Delay(sec): 0', '     
>> 4- Reboot Duration(sec): 5', '     5- Accept Changes      : ', '', 
>> '     ?- Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log']
>>
>>
>> [root@fs102 ~]# /sbin/fence_apc -a 10.10.1.200 -l pdu -p pdu -n 13 -o status
>> Status check successful. Port 13 is OFF
>> [root@fs102 ~]# /sbin/fence_apc -a 10.10.1.201 -l pdu -p pdu -n 2 -o status
>> Status check successful. Port 2 is ON
>> [root@fs102 ~]# /sbin/fence_apc -a 10.10.1.201 -l pdu -p pdu -n 2 -o off
>> Traceback (most recent call last):
>>  File "/sbin/fence_apc", line 829, in ?
>>    main()
>>  File "/sbin/fence_apc", line 303, in main
>>    do_power_off(sock)
>>  File "/sbin/fence_apc", line 813, in do_power_off
>>    x = do_power_switch(sock, "off")
>>  File "/sbin/fence_apc", line 611, in do_power_switch
>>    result_code, response = power_off(txt + ndbuf)
>>  File "/sbin/fence_apc", line 817, in power_off
>>    x = power_switch(buffer, False, "2", "3");
>>  File "/sbin/fence_apc", line 810, in power_switch
>>    raise "unknown screen encountered in \n" + str(lines) + "\n"
>> unknown screen encountered in
>> ['2', '', '', '------- Configure Outlet 
>> ------------------------------------------------------', '', '    #  
>> State  Ph  Name                     Pwr On Dly  Pwr Off Dly  Reboot 
>> Dur.', '   
>> ----------------------------------------------------------------------------', 
>> '    2  ON     1   fs103                    0 sec       0 sec        
>> 5 sec', '', '     1- Outlet Name         : fs103', '     2- Power On 
>> Delay(sec) : 0', '     3- Power Off Delay(sec): 0', '     4- Reboot 
>> Duration(sec): 5', '     5- Accept Changes      : ', '', '     ?- 
>> Help, <ESC>- Back, <ENTER>- Refresh, <CTRL-L>- Event Log']
>>
>>
>>
>>
>> <cluster config_version="143" name="gfs_cluster">
>>    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
>>    <clusternodes>
>>        <clusternode name="fs101" nodeid="1" votes="1">
>>            <fence>
>>                <method name="1">
>>                    <device name="pdu102.eons.dev" port="12"/>
>>                </method>
>>            </fence>
>>        </clusternode>
>>        <clusternode name="fs102" nodeid="2" votes="1">
>>            <fence>
>>                <method name="1">
>>                    <device name="pdu101.eons.dev" port="8"/>
>>                </method>
>>            </fence>
>>        </clusternode>
>>        <clusternode name="fs103" nodeid="3" votes="1">
>>            <fence>
>>                <method name="1">
>>                    <device name="pdu101.eons.dev" option="off" port="13"/>
>>                    <device name="pdu102.eons.dev" option="off" port="2"/>
>>                    <device name="pdu101.eons.dev" option="on" port="13"/>
>>                    <device name="pdu102.eons.dev" option="on" port="2"/>
>>                </method>
>>            </fence>
>>        </clusternode>
>>    </clusternodes>
>>    <fencedevices>
>>        <fencedevice agent="fence_apc" ipaddr="10.10.1.200" login="pdu" name="pdu101.eons.dev" passwd="pdu"/>
>>        <fencedevice agent="fence_apc" ipaddr="10.10.1.201" login="pdu" name="pdu102.eons.dev" passwd="pdu"/>
>>    </fencedevices>
>> ...
>> </cluster>
>>
>>
>>
>>
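
To see which fence_apc calls the fs103 entry above turns into, the device list can be resolved against the fencedevices section. This is a rough sketch, not how fenced itself does it: the function name fence_steps is made up, the config is cut down to the fs103 node, and it assumes the devices of a method are run in the order they are listed.

```python
import xml.etree.ElementTree as ET

# Reduced copy of the cluster.conf above: only fs103 and the two PDUs.
CONF = """
<cluster config_version="143" name="gfs_cluster">
  <clusternodes>
    <clusternode name="fs103" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="pdu101.eons.dev" option="off" port="13"/>
          <device name="pdu102.eons.dev" option="off" port="2"/>
          <device name="pdu101.eons.dev" option="on" port="13"/>
          <device name="pdu102.eons.dev" option="on" port="2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_apc" ipaddr="10.10.1.200" login="pdu" name="pdu101.eons.dev" passwd="pdu"/>
    <fencedevice agent="fence_apc" ipaddr="10.10.1.201" login="pdu" name="pdu102.eons.dev" passwd="pdu"/>
  </fencedevices>
</cluster>
"""


def fence_steps(conf_xml, node):
    """List (ipaddr, port, option) for each device of `node`, in document order."""
    root = ET.fromstring(conf_xml)
    # Map each fence device name to the address of the PDU behind it.
    addr = {d.get("name"): d.get("ipaddr") for d in root.iter("fencedevice")}
    steps = []
    for cn in root.iter("clusternode"):
        if cn.get("name") != node:
            continue
        for dev in cn.iter("device"):
            steps.append((addr[dev.get("name")], dev.get("port"), dev.get("option")))
    return steps


for ipaddr, port, option in fence_steps(CONF, "fs103"):
    print("fence_apc -a %s -n %s -o %s" % (ipaddr, port, option))
```

The second step is port 2 on 10.10.1.201 with "off", the same call that fails when run by hand above.
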
>> [root@fs102 ~]# cat /etc/redhat-release
>> CentOS release 5.2 (Final)
>> [root@fs102 ~]# rpm -qf /sbin/fence_apc
>> cman-2.0.84-2.el5
>> [root@fs102 ~]# rpm -q luci
>> luci-0.12.0-7.el5.centos.3
>>
>>
>> pdu101:
>> American Power Conversion               Network Management Card AOS      v3.5.9
>> (c) Copyright 2008 All Rights Reserved  Rack PDU APP                     v3.5.8
>>
>> pdu102:
>> American Power Conversion               Network Management Card AOS      v3.5.9
>> (c) Copyright 2008 All Rights Reserved  Rack PDU APP                     v3.5.8
>>
>> -- 
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>


-- 
Marek Grac
Red Hat Czech s.r.o.



