[Linux-cluster] qdiskd not properly failing nodes??
danwest at comcast.net
Tue Sep 12 16:12:30 UTC 2006
Below is the qdisk configuration for a simple 2-node cluster with a webserver service. It is configured with the 3 heuristics shown below.
<quorumd interval="1" tko="10" votes="3" device="/dev/disk/by-name/36006048000018794040051524d304234"
status_file="/tmp/qdisk_status" log_level="7">
<heuristic program="ping X.X.X.X -c1 -t1" score="1" interval="2"/>
<heuristic program="[ -f /quorum ]" score="1" interval="2"/>
<heuristic program="curl -s http://X.X.X.X | grep CLUSTER >> /dev/null" score="2"
interval="1"/>
</quorumd>
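As a quick sanity check on the numbers, the heuristic scores in the block above can be summed to get the maximum possible score. This is only a minimal sketch: the helper name `heuristic_scores` is made up for illustration, it is not part of qdiskd, and the X.X.X.X addresses are kept as the placeholders from the post.

```python
import xml.etree.ElementTree as ET

# The <quorumd> block from the post; X.X.X.X placeholders are left untouched.
QUORUMD_XML = """
<quorumd interval="1" tko="10" votes="3" device="/dev/disk/by-name/36006048000018794040051524d304234"
         status_file="/tmp/qdisk_status" log_level="7">
  <heuristic program="ping X.X.X.X -c1 -t1" score="1" interval="2"/>
  <heuristic program="[ -f /quorum ]" score="1" interval="2"/>
  <heuristic program="curl -s http://X.X.X.X | grep CLUSTER >> /dev/null" score="2" interval="1"/>
</quorumd>
"""


def heuristic_scores(xml_text):
    """Return the score attribute of each <heuristic> element, in order (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    return [int(h.get("score")) for h in root.findall("heuristic")]


scores = heuristic_scores(QUORUMD_XML)
max_score = sum(scores)          # all heuristics passing
score_after_failure = scores[0]  # only the ping heuristic still passing

print(scores, max_score, score_after_failure)  # [1, 1, 2] 4 1
```

With the last 2 heuristics failing, only the ping heuristic contributes, which matches the 1 / 2 / 4 score reported in the second status dump.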
# cat /tmp/qdisk_status
Node ID: 1
Score (current / min req. / max allowed): 4 / 2 / 4
Current state: Master
Current disk state: None
Visible Set: { 1 2 }
Master Node ID: 1
Quorate Set: { 1 2 }
Causing the last 2 heuristics to fail drops the score to 1, which is below the minimum of 2 (half of the maximum of 4), and in theory this should reboot the node. So far I get confirmation in /var/log/messages but no actual reboot (see below). The service (webserver) also remains on the node whose score dropped below the minimum.
# cat /tmp/qdisk_status
Node ID: 1
Score (current / min req. / max allowed): 1 / 2 / 4
Current state: None
Current disk state: None
Visible Set: { 1 2 }
Master Node ID: 2
Quorate Set: { }
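For anyone reproducing this, the comparison qdiskd reports can be read straight out of the status file. The following is a rough sketch that assumes the status file keeps the exact "Score (current / min req. / max allowed)" line format shown above; `parse_score` and `score_insufficient` are hypothetical helper names, and the check only mirrors what the status file says, not how qdiskd decides to reboot or fence.

```python
import re

# Score lines copied from the two status dumps in the post.
STATUS_BEFORE = "Score (current / min req. / max allowed): 4 / 2 / 4"
STATUS_AFTER = "Score (current / min req. / max allowed): 1 / 2 / 4"

SCORE_RE = re.compile(
    r"Score \(current / min req\. / max allowed\):\s*(\d+)\s*/\s*(\d+)\s*/\s*(\d+)"
)


def parse_score(status_text):
    """Return (current, minimum, maximum) from a qdisk status dump (hypothetical helper)."""
    match = SCORE_RE.search(status_text)
    if match is None:
        raise ValueError("no score line found in status text")
    current, minimum, maximum = (int(g) for g in match.groups())
    return current, minimum, maximum


def score_insufficient(status_text):
    """True when the current score is below the minimum required score."""
    current, minimum, _ = parse_score(status_text)
    return current < minimum


print(score_insufficient(STATUS_BEFORE))  # False
print(score_insufficient(STATUS_AFTER))   # True
```

In practice the text would come from reading /tmp/qdisk_status (the status_file set in the configuration) rather than from hard-coded strings.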
/var/log/messages
Sep 12 11:34:02 SERVER1 qdiskd[7495]: <notice> Score insufficient for master operation (1/2; max=4); downgrading
Sep 12 11:34:04 SERVER1 qdiskd[7495]: <info> Node 2 is the master
Sep 12 11:34:02 SERVER2 qdiskd[9780]: <info> Node 1 shutdown
Sep 12 11:34:02 SERVER2 qdiskd[9780]: <debug> Making bid for master
Sep 12 11:34:03 SERVER2 qdiskd[9780]: <info> Assuming master role
Any idea why the server is not getting rebooted/fenced?
Thanks,
Dan