[Linux-cluster] GFS volume locks during cluster node join/leave

Sat Mar 19 16:46:01 UTC 2011

On Fri, Mar 18, 2011 at 4:50 AM, Martijn Storck
<martijn.storck at gmail.com> wrote:

> Is this expected behaviour? Is there anything we can do to reduce these
> delays? We run 10 VMs on our active nodes.. it's a shame to have these all
> lock up because we're rebooting a passive node :)

Yes, that's expected. It happens because you did not gracefully
unmount the filesystem, as Alan mentioned. Another node in the cluster
halts access to the filesystem and replays the dead node's journal to
make sure that the filesystem is in a known state.

Generally, when I take down a cluster node, I manually remove it from
the cluster by stopping all services (stopping rgmanager and/or
unmounting filesystems), stopping clvmd if it's in use, running
fence_tool leave, then cman_tool leave. That "warns" the other cluster
nodes that this one is going away, so they don't panic when it does :)
In theory, this should happen during a normal shutdown, but I've seen
it fail often enough that I do the extra work.
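
For what it's worth, the sequence above could be scripted roughly like
this. It's only a sketch: the "run" helper, the DRY_RUN variable and
the leave_cluster function are names made up for illustration, and the
"service rgmanager stop" / "service clvmd stop" lines assume
RHEL/CentOS-style init scripts. The "umount -a -t gfs,gfs2" line is
also an assumption about how your GFS mounts are set up; adjust it to
your own mount points. It only prints the commands unless you set
DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: take one node out of the cluster cleanly before a reboot.
# Dry-run by default; set DRY_RUN=0 to really execute (as root, on the
# node that is leaving).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run
# it and stop at the first failure so later steps don't run out of order.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

leave_cluster() {
    run service rgmanager stop     # stop clustered services
    run umount -a -t gfs,gfs2      # assumption: unmount all GFS/GFS2 mounts
    run service clvmd stop         # only needed if clvmd is in use
    run fence_tool leave           # leave the fence domain
    run cman_tool leave            # finally leave the cluster itself
}

leave_cluster
```

Run it once in dry-run mode first and check that the list matches what
is actually running on the node; skip the clvmd line if you don't use
clvmd.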

This section of the Cluster wiki gives you a pretty good idea of what
happens when nodes join & leave the cluster:

http://sources.redhat.com/cluster/wiki/FAQ/CMAN#cman_tool_services

-- 
HTH, YMMV, HANW :)

Jason

The path to enlightenment is /usr/bin/enlightenment.