
Re: [Linux-cluster] RHEL5.0 Cluster fencing problems involving bonding



doobs72 _ wrote:

Hi

I'm having fencing problems in my 3-node cluster running RHEL 5.0, which uses bonding.

I have 3 servers, A, B and C, in a cluster, with bonding configured on eth2 and eth3 for my cluster traffic. The config is as below:

DEVICE=eth2
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
MASTER=bond1
SLAVE=yes
USRCTL=no

DEVICE=eth3
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
MASTER=bond1
SLAVE=yes
USRCTL=no

DEVICE=bond1
IPADDR=192.168.x.x
NETMASK=255.255.255.0
NETWORK=192.168.x.0
BROADCAST=192.168.x.255
ONBOOT=YES
BOOTPROTO=none

The /etc/modprobe.conf file is configured as below:

alias eth0 bnx2
alias eth1 bnx2
alias eth2 e1000
alias eth3 e1000
alias eth4 e1000
alias eth5 e1000
alias scsi_hostadapter cciss
alias bond0 bonding
options bond0 miimon=100 mode=active-backup max_bonds=3
alias bond1 bonding
options bond1 miimon=100 mode=active-backup
alias bond2 bonding
options bond2 miimon=100 mode=active-backup
alias scsi_hostadapter1 qla2xxx
alias scsi_hostadapter2 usb-storage

The cluster starts up OK; however, when I try to test the bonded interfaces my troubles begin.

On Node C, if I run "ifdown bond1", node C is fenced and everything works as expected.

However, if I take the interfaces down one at a time on Node C, things go wrong. After "ifdown eth2" the cluster stays up as expected, using eth3 for traffic. After "ifdown eth3", node C is fenced by Node A. However, in the /var/log/messages file on Node C I see a message saying that Node B will be fenced. The outcome is that Nodes C and B are both fenced.

My question is why does node B get fenced as well?


Hello,

First of all, you have a problem with bonding. Switch off the cluster and investigate why the link goes down when you do "ifdown eth3". I suspect that the problem is with the e1000 driver.

I suppose that C is the master of the cluster and that it acts faster than the election of a new master (from A and B). You can identify the master with:

    i=`cman_tool services | grep -A 1 default | tail -1 | sed -e 's/\[\(.\).*/\1/'`; cman_tool nodes | awk '{print $1,$5}' | grep "^$i"

To resolve this issue you need to use more than one communication medium, e.g. a second Ethernet path, or a quorum disk if you have one.
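When investigating the bonding side with the cluster stopped, it can help to watch what the bonding driver itself reports before and after each "ifdown". The following is only a minimal sketch: the function name bond_summary is made up for illustration, and it assumes the status file has the usual /proc/net/bonding layout with "Currently Active Slave:", "Slave Interface:" and "MII Status:" lines.

```shell
# Print the active slave and the link state of each slave of one bond.
# bond_summary is a hypothetical helper name; $1 is the path to a bonding
# status file such as /proc/net/bonding/bond1.
# Assumption: the file uses the usual "Key: value" layout of the bonding driver.
bond_summary() {
    awk -F': ' '
        # Slave currently carrying the traffic in active-backup mode
        /^Currently Active Slave:/ { print "active=" $2 }
        # Remember which slave the next "MII Status" line belongs to
        /^Slave Interface:/ { slave = $2 }
        # Report only per-slave status; the bond-wide line has no slave set
        /^MII Status:/ && slave != "" { print slave "=" $2; slave = "" }
    ' "$1"
}
```

Running "bond_summary /proc/net/bonding/bond1" on node C after "ifdown eth2" should show whether eth3 has really taken over as the active slave before the second interface is taken down.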

Best Regards
Maciej Bogucki


