[Cluster-devel] fence daemon problems
Dietmar Maurer
dietmar at proxmox.com
Wed Oct 3 09:25:08 UTC 2012
> I observe strange problems with fencing when a cluster loses quorum for a
> short time.
>
> After regaining quorum, fenced reports 'wait state messages', and the whole
> cluster is blocked waiting for fenced.
Just found the following in fenced/cpg.c:
/* This is how we deal with cpg's that are partitioned and
then merge back together. When the merge happens, the
cpg on each side will see nodes from the other side being
added, and neither side will have zero started_count. So,
both sides will ignore start messages from the other side.
This causes the domain on each side to continue waiting
for the missing start messages indefinitely. To unblock
things, all nodes from one side of the former partition
need to fail. */
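As I read that comment, the logic boils down to something like the toy model below. This is only a sketch of my understanding, not the actual fenced code: the struct, the function names, and the "missing_starts" counter are all made up for illustration, and the acceptance rule (a start message is ignored when both sender and receiver have a non-zero started_count) is my interpretation of the comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. These names are invented for illustration and are
   NOT the structures or functions used in fenced/cpg.c. */
struct toy_side {
    int started_count;  /* number of starts this side has already completed */
    int missing_starts; /* start messages still awaited from added nodes */
};

/* Handle a start message from a newly added node.
   Returns true if the message is accepted, false if it is ignored. */
static bool toy_receive_start(struct toy_side *self, int sender_started_count)
{
    assert(self != NULL);

    /* Both sides already carry state: this looks like a merge of two
       former partitions, so the message is ignored (assumed rule). */
    if (self->started_count > 0 && sender_started_count > 0)
        return false;

    /* Otherwise the message counts toward the ones being waited for. */
    if (self->missing_starts > 0)
        self->missing_starts--;
    return true;
}

/* The side stays blocked while any expected start message is missing. */
static bool toy_is_blocked(const struct toy_side *self)
{
    return self->missing_starts > 0;
}
```

In this model, a side with started_count = 2 that waits for one start message stays blocked forever if the only sender also has a non-zero started_count, because that message is dropped. If the sender restarts (started_count = 0), the message is accepted and the side unblocks, which matches the "all nodes from one side need to fail" remark.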
So the observed behavior is expected?