
Re: [Linux-cluster] GFS on 3-node cluster corrupted after full network outage

To my knowledge, yes and no.

GFS will continue to run but will be unable to do any locking.

This leads to interesting behavior.  Specifically, if a node already holds a lock on something in the GFS, it will continue to modify it.  It will continue to journal its changes too (since it locks a journal upon mounting).  It will not touch anything new, because it cannot acquire locks on new parts of the GFS.  Any calls waiting on such a lock will hang.
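
As a toy illustration of the behavior described above (this is not GFS or DLM code; the `Node` class, the `lock_manager_up` flag, and the resource names are all made up for the sketch, and the exception stands in for a call that would really just hang):

```python
class LockUnavailable(Exception):
    """Stands in for a call that would block forever waiting on a lock."""


class Node:
    """Hypothetical model of one cluster node; not a real GFS interface."""

    def __init__(self, name, held_locks):
        self.name = name
        self.held = set(held_locks)   # locks granted before the outage
        self.journal = []             # the node's own journal, locked at mount time
        self.lock_manager_up = True

    def write(self, resource, data):
        if resource not in self.held:
            if not self.lock_manager_up:
                # A real call would hang here rather than raise.
                raise LockUnavailable(resource)
            self.held.add(resource)
        # Writes under an already-held lock are still journaled.
        self.journal.append((resource, data))


node = Node("node1", held_locks={"inode-17"})
node.lock_manager_up = False            # simulate the full network outage

node.write("inode-17", "update")        # lock already held: proceeds
try:
    node.write("inode-42", "new file")  # needs a new lock: blocked
    blocked = False
except LockUnavailable:
    blocked = True
```

In this sketch the write to the already-locked resource lands in the journal, while the write that needs a new lock never gets that far.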

So, the GFS may still be modified, but not corrupted.  In fact, whenever a quorate subset of nodes eventually forms, whether through fencing or an administrator intervening somehow, it should first fence the other nodes.  After fencing, the remaining quorate nodes (at least the ones mounting the GFS) will scan the journals on the GFS for uncommitted transactions and commit them (they may roll them back if appropriate; I'm not sure about the details here).  So, assuming working fencing, this can't ever corrupt the GFS, even though modifications to the filesystem continue.  Despite the complexity, I believe this is actually very good behavior.
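
The ordering described above (quorum first, then fencing, then journal recovery) can be sketched roughly as follows.  This is only a model of the sequence as I understand it, not the actual cluster code: `has_quorum`, `recover`, and the simple-majority rule are assumptions made for the illustration, and a real cluster may weight votes differently.

```python
def has_quorum(members, total_nodes):
    # Assumption for this sketch: quorum is a simple majority of nodes.
    return len(members) > total_nodes // 2


def recover(all_nodes, survivors, journals):
    """Return the ordered recovery steps, or None if there is no quorum.

    `journals` maps a node name to its list of uncommitted journal entries.
    """
    if not has_quorum(survivors, len(all_nodes)):
        return None  # no quorate subset yet: nothing may be fenced or replayed

    lost = sorted(set(all_nodes) - set(survivors))
    steps = []
    # Fence every node outside the quorate subset first ...
    for node in lost:
        steps.append(("fence", node))
    # ... and only then recover the journals those nodes left behind.
    for node in lost:
        if journals.get(node):
            steps.append(("replay_journal", node))
    return steps


nodes = ["a", "b", "c"]
pending = {"c": [("inode-17", "update")]}

plan = recover(nodes, ["a", "b"], pending)   # two of three nodes: quorate
no_plan = recover(nodes, ["a"], pending)     # one of three nodes: not quorate
```

The point of the ordering is that a journal is only touched after its owner has been fenced, so the owner cannot still be writing to it.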

Jayson Vantuyl
Systems Architect
Engine Yard
