[Linux-cluster] Remove the clusterness from GFS

Jayson Vantuyl jvantuyl at engineyard.com
Thu Jan 11 06:09:53 UTC 2007


> It's not that we discriminate against cluster software :). We just
> have some worries about the potential impact the cluster suite could
> bring to the system. Extra CPU and memory cost is ok, we can consider
> that's part of running GFS. The part that makes us wonder is any
> potential behavioral changes and instability to the system. After
> all, the system is effectively turned into a cluster. I read some of
> the emails on the list about cluster issues aside from GFS.

Behavior aside, once you fully understand how it works, clustering is
neither complex nor particularly troublesome.  Understand that the
instability you read about comes not from the clustering itself but
from the nature of sharing these resources between multiple machines.

I operate over 40 clusters with a total of well over 100 nodes and I  
can assure you that the day I implemented comprehensive fencing (i.e.  
removed fence_manual and wrote a fencing agent for our platform) was  
very likely the best day of my life.  Fencing is what makes a GFS  
cluster reliable.
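
For what it's worth, a fence agent is just a program that fenced runs
with name=value options on stdin, and that must exit 0 only once the
node is guaranteed dead.  A minimal sketch of what ours does follows;
the "platform-power" command is our internal tool and purely
illustrative, and the exact stdin key names vary a bit between
releases:

    #!/bin/sh
    # read the name=value options fenced passes on stdin
    while read line; do
        case "$line" in
            nodename=*) node="${line#nodename=}" ;;
        esac
    done

    # power the node off via the platform API, then verify it,
    # because exiting 0 tells fenced the node is safely dead
    platform-power --off "$node" || exit 1
    platform-power --status "$node" | grep -qx off || exit 1
    exit 0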

> For instance, we support hot removal/insertion of nodes in the
> system, and I'm not clear how fencing will get in the way. We're not
> planning to add any fencing hardware, and will most likely set the
> fencing mechanism to manual. Ideally, we'd like to disable fencing
> except the part that is needed for running GFS.

There are issues.  As long as you never transition to a two-node
cluster (going from one node to two, or from three nodes down to two)
you should be able to achieve this; two-node clusters run in a special
quorum mode that cannot be switched on or off while the cluster is
running.  In my personal opinion, I would avoid running GFS on fewer
than three nodes anyway (again, two-node clusters exhibit behavior
that is easily avoided with a third box, even if it doesn't mount the
GFS).
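
(The special mode is this fragment of cluster.conf, which is the part
that can't be changed on a live cluster:

    <cman two_node="1" expected_votes="1"/>

With two_node="1", either node alone keeps quorum, which is exactly
why fencing carries the whole burden of safety there.)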

In a controlled manner, it is possible to unmount the FS, leave the
cluster, then change the cluster composition from a still-running
node.
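
On a RHEL4-era cluster suite that looks roughly like this (the mount
point and version number are illustrative):

    # on the node that is leaving:
    umount /mnt/gfs
    fence_tool leave
    cman_tool leave remove   # "remove" adjusts quorum for the others

    # on any remaining node: delete the <clusternode> entry, bump
    # config_version in /etc/cluster/cluster.conf, then push it out:
    ccs_tool update /etc/cluster/cluster.conf
    cman_tool version -r <new config_version>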

Adding nodes isn't much trouble either.

In either case I suggest a quorum disk (qdisk).
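
Setting one up is quick.  Something like the following, where the
device, label, and exact quorumd timings are all illustrative:

    # label a small shared partition as the quorum disk (run once):
    mkqdisk -c /dev/sdc1 -l myqdisk

    # and in cluster.conf:
    <quorumd interval="1" tko="10" votes="1" label="myqdisk"/>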

As for uncontrolled crashes, fencing is absolutely necessary to
recover the state of the FS.

Complete fencing is absolutely necessary for running GFS.

Suggesting that you don't need (indeed, don't want) fencing is an
indication that you don't understand how GFS will share your data.
Relying on manual fencing is a sign that you will likely lose a great
deal of data someday.  Red Hat won't even support that configuration,
due to liability concerns.

Fencing only makes sure that a machine that has lost contact with the  
cluster does not trash your data.

Without fencing, a node that is out of control can (and will) trash  
your GFS.  This will result in the downtime required to shut down the  
cluster, fsck the filesystem, and then bring it back up.  It will  
also still likely trash some data.
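
When that happens, recovery looks roughly like this (init script and
device names are illustrative and vary between releases):

    # on every node: unmount and stop the cluster stack
    umount /mnt/gfs
    service gfs stop; service fenced stop; service cman stop

    # on one node only: check the filesystem, answering yes to fixes
    gfs_fsck -y /dev/vg0/gfs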

Make no mistake: when fencing occurs, the system is already behaving
badly.  Fencing fixes the problem, albeit brutally.

With fence_manual, when you have any sort of outage whatsoever, one  
node will be hosed and the entire cluster will halt.  At this point  
you will do one of three things:

1.  You may just restart the entire cluster.

or

2.  You may correctly make sure the dead machine is truly dead.  YOU
WILL NOT BE ABLE TO DO THIS REMOTELY WITHOUT HARDWARE SUITABLE FOR
FENCING.  At that point you will run fence_ack_manual (manually, as
sketched after this list) to free up the cluster.

or

3.  You may, in your haste, run fence_ack_manual to free up the
cluster.  If at any point the other node is not completely dead, your
data may be forfeit.  Worse, the damage may not be visible
immediately, only after the corruption has spread.  At that point you
will probably get the downed node running without realizing what
damage you may have done.
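
For reference, the acknowledgement in #2 is a one-liner, which is
exactly why #3 is so tempting and so dangerous (the node name is
illustrative):

    # ONLY after physically verifying the node is down or powered off:
    fence_ack_manual -n node2.example.com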

In the meantime, everyone mounting your GFS will be hung.  A single  
hardware failure can freeze your cluster.  Totally.

Note that to take the only path that saves your data (#2), you will
in all cases need remote power switches or the like to reset a
toasted node.  So you will NOT save yourself any money and yet you
WILL create trouble.  Also, have you considered fencing at your
network switch (for networked storage) or at your storage device
itself?  It is not always necessary to purchase remote power switches
to fence your data.
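
For example, if your nodes reach storage through a Brocade FC switch,
fence_brocade can simply disable the dead node's switch port.  A
cluster.conf sketch, with all names, addresses, and port numbers
illustrative:

    <fencedevices>
      <fencedevice agent="fence_brocade" name="fcswitch"
                   ipaddr="10.0.0.5" login="admin" passwd="secret"/>
    </fencedevices>

    <clusternode name="node2" votes="1">
      <fence>
        <method name="1">
          <device name="fcswitch" port="4"/>
        </method>
      </fence>
    </clusternode>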

If you cannot abide fencing, you should probably farm this out to
someone who can.

Fencing is the way to avoid the bad behavior you have read about.  It  
is not the cause of trouble--it's the solution.  GFS absolutely must  
have it in its entirety or no dice.

If you would like a more official, professional explanation as to why  
this is absolutely, unequivocally necessary, contact me by e-mail.   
I'll call you.  I could fly out.  I can even give you a report with a  
letterhead and everything.

However, removing fencing from GFS is not a possibility.  Doing
fencing right isn't even really hard.

-- 
Jayson Vantuyl
Systems Architect
Engine Yard
jvantuyl at engineyard.com

