[Linux-cluster] Running GFS without fencing and maybe locking ;-)

Wed Mar 22 20:09:09 UTC 2006

On Wed, Mar 22, 2006 at 04:12:16PM +0100, Arnd wrote:
> Our cluster consists of 4 Webservers and one management server. This
> management server is the only server which needs write access to the GFS
> (for example changing the html-files):
> 
> 	webserver1 - webserver4: mount GFS -o ro (readonly)
> 	mgm-server1: mount GFS -o rw (write access)
> 
> My idea:
> 
> If one of the webservers fails then the cluster will issue a
> fence_script with an exitcode "0". The node is fenced by the cluster and
> since the filesystem wasn't mounted rw it cannot be destroyed.
> 
> The only possible way the filesystem can get corrupted is when the
> management-server fails.
> 
> So is it possible to run the GFS with 4 readonly nodes and only one node
> which needs to be taken care of if it fails? How does locking (lock_dlm)
> work in this case? I suppose that it only needs to take care of any
> writes to the filesystem but here I might be wrong?!

In the next version of GFS we're adding a "spectator" mount option which
is very similar to "ro".  Spectator mounts do not claim a journal when
they mount, cannot be converted to "rw", and don't need to be fenced if
they fail.

If you guarantee that your ro mounts will never be remounted to rw, then
it should be safe to not fence them when they fail (since they'll never
under any circumstance make any writes to the fs).
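
As a rough illustration of the rule above, here is a toy decision function.
It is only a sketch, not GFS code: the function name and the guaranteed_ro
parameter are hypothetical, and the mode strings simply mirror the mount
options discussed here.

```python
# Toy model of the fencing rule described above; this is not GFS code.
# needs_fencing and guaranteed_ro are hypothetical names for illustration.

def needs_fencing(mode: str, guaranteed_ro: bool = False) -> bool:
    """Return True if a failed node with this mount mode must be fenced."""
    if mode == "rw":
        # A writer may still have writes in flight, so it must be fenced.
        return True
    if mode == "spectator":
        # Spectator mounts cannot be converted to rw.
        return False
    if mode == "ro":
        # Skipping the fence is only safe if the mount is guaranteed
        # never to be remounted rw; otherwise stay conservative.
        return not guaranteed_ro
    raise ValueError(f"unknown mount mode: {mode}")
```

Note the conservative default: a plain ro mount is still fenced unless the
admin explicitly promises it will stay ro.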

The problem is not related to fencing, but comes when the node mounting rw
fails.  There are only ro mounts left and none of them can recover the
journal of the failed node because they can't write.  What these ro mounts
do next is the important part, and there appears to be a shortcoming in
the current code that I just noticed.  It looks like the ro mounts will
continue reading the fs normally without the journal of the failed rw node
ever being replayed.  They'll likely come across some inconsistent part of
the fs and panic/withdraw.  It shouldn't be difficult to test this.

It's not clear yet how difficult this problem will be to fix in the
current stable code.
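
To make the recovery gap concrete, here is a small sketch of the situation.
It is a toy model under the assumption that only a node with write access
can replay a failed node's journal; the function names are hypothetical and
nothing here reflects the actual GFS recovery code.

```python
# Toy model of the journal recovery gap described above; not GFS code.
# Assumption: only a surviving node mounted rw can replay a journal.

def can_replay_journal(survivor_modes):
    """True if at least one surviving node is able to write to the fs."""
    return any(mode == "rw" for mode in survivor_modes)

def after_rw_failure(survivor_modes):
    """Describe what happens to the failed rw node's journal."""
    if can_replay_journal(survivor_modes):
        return "journal replayed by a surviving rw node"
    # Only ro mounts are left: nobody can write, so nobody can recover.
    return "journal not replayed; readers may hit inconsistent data"

# The setup from this thread: the single rw node fails, four ro nodes remain.
print(after_rw_failure(["ro", "ro", "ro", "ro"]))
```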

> Can I use lock_nolock (when making the filesystem) if only one node is
> writing to the GFS?

No, readonly nodes still need to do all the necessary locking for reading.

Dave