
Re: [Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot



The problem is that, if you enable cman on boot, the fenced node will try to join the cluster, fail to reach its peer after post_join_delay (default 6 seconds, iirc) and fence its peer. That peer reboots, starts cman, tries to connect, fences its peer...

The easiest way to avoid this in 2-node clusters is to not let cman/rgmanager start automatically. That way, if a node is fenced, it will boot back up and you can log into it remotely (assuming it's not totally dead). When you know things are fixed, manually start cman.

In my case, however, the node which is trying to join is fully operational and has network access. Also, if you look at the configuration that I had in my original email, my post_join_delay is 360 (for testing purposes), so there is no way that a timeout occurs.

I might be wrong here, but judging from corosync's log file, the other node even joins the cluster successfully, before being marked for fencing by dlm_controld:
Sep 11 11:14:09 corosync [CLM   ] CLM CONFIGURATION CHANGE
Sep 11 11:14:09 corosync [CLM   ] New Configuration:
Sep 11 11:14:09 corosync [CLM   ]     r(0) ip(10.xx.xx.1)
Sep 11 11:14:09 corosync [CLM   ]     r(0) ip(10.xx.xx.2)
Sep 11 11:14:09 corosync [CLM   ] Members Left:
Sep 11 11:14:09 corosync [CLM   ] Members Joined:
Sep 11 11:14:09 corosync [CLM   ]     r(0) ip(10.xx.xx.2)
Sep 11 11:14:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2
Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2
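
For anyone who wants to check a longer log for the same pattern, here is a minimal Python sketch that pulls the member lists out of the CLM lines. It is not part of any cluster tooling: the function name parse_clm_changes and the regular expressions are my own, and they assume the log format is exactly as in the excerpt above (other corosync versions may print these lines differently).

```python
import re

# Assumption: CLM lines look exactly like the excerpt above.
CLM_LINE = re.compile(r"corosync \[CLM\s*\] (.*)$")
ADDR = re.compile(r"r\(0\) ip\(([^)]+)\)")

# Maps the section headers in the log to keys in the result dict.
SECTIONS = {
    "New Configuration:": "members",
    "Members Left:": "left",
    "Members Joined:": "joined",
}


def parse_clm_changes(lines):
    """Return one dict per 'CLM CONFIGURATION CHANGE' block.

    Each dict has the keys 'members', 'left' and 'joined', holding the
    addresses listed under the corresponding header.
    """
    changes = []
    current = None
    section = None
    for line in lines:
        match = CLM_LINE.search(line)
        if not match:
            continue  # TOTEM, QUORUM and unrelated lines are ignored
        body = match.group(1).strip()
        if body == "CLM CONFIGURATION CHANGE":
            current = {"members": [], "left": [], "joined": []}
            changes.append(current)
            section = None
        elif current is None:
            continue  # CLM line seen before any change header
        elif body in SECTIONS:
            section = SECTIONS[body]
        else:
            addr = ADDR.search(body)
            if addr and section:
                current[section].append(addr.group(1))
    return changes


SAMPLE_LOG = """\
Sep 11 11:14:09 corosync [CLM   ] CLM CONFIGURATION CHANGE
Sep 11 11:14:09 corosync [CLM   ] New Configuration:
Sep 11 11:14:09 corosync [CLM   ]     r(0) ip(10.xx.xx.1)
Sep 11 11:14:09 corosync [CLM   ]     r(0) ip(10.xx.xx.2)
Sep 11 11:14:09 corosync [CLM   ] Members Left:
Sep 11 11:14:09 corosync [CLM   ] Members Joined:
Sep 11 11:14:09 corosync [CLM   ]     r(0) ip(10.xx.xx.2)
Sep 11 11:14:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2
"""

if __name__ == "__main__":
    for change in parse_clm_changes(SAMPLE_LOG.splitlines()):
        print(change)
```

Run against the excerpt, this reports both addresses as members, nothing under "left", and 10.xx.xx.2 under "joined", which matches my reading that the join itself succeeds at the membership level.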

