Re: [Linux-cluster] I give up

On Wed, 2007-11-28 at 17:27 -0500, James Parsons wrote:
> Scott Becker wrote:
> > Case two. I remove one node from the cluster to maintain it. Now I 
> > have a two node cluster. Same issues as above. Luci wants to set 
> > two_node = 1 in this case instead of just dealing with expected votes 
> > = 1. 
> I know why Luci is doing this -- she sees the cluster reduced from three 
> nodes to two nodes and configures it (as the large majority of our 
> typical users consider) appropriately. When you are finished maintaining 
> the node and you tell Luci to add it back in to the cluster, she will 
> remove those configuration attributes.
> The sticking point seems to be your particular desired cluster behavior 
> and the fact that it lies outside of what was expected for cluster suite.
> If this is not appropriate behavior for you, then don't use Luci. You 
> are free to use a text editor on the cluster.conf file and propagate it 
> manually via the command line on one of the nodes, as you are free to 
> edit the source code and add ssh support to your favorite fence agent. 
> You are free to go off the map, and the members of this list (including 
> many of the Red Hat engineers who write cluster code and watch this 
> list) will assist you in your expedition as much as possible. We will 
> all try our best to help you get where you want to go (and I think you 
> would have to agree that you have had a very respectable response rate 
> for your queries this last month - many have tried to offer you 
> assistance), but if we can't think of a way to stretch the software to 
> your needs, then we just can't.

Everyone here has been very helpful to me during my trials and
tribulations. I'm pretty sure I would have bailed without it. But there
is one thing I think would greatly help a lot of people - myself for
sure - and that is accurate and complete documentation of the totality
of the cluster conf file - with lots of example config files. The one
schema doc for 4.x and 5.x that I have found is incomplete. All documentation
I have seen regarding setup and admin warns the user repeatedly NOT to
edit the configuration file yourself! But the fact is, using the GUI
apps, both conga and piranha, is *excruciatingly* slow and painful.
That, and it does not help you understand how it actually works.

Frankly, and not to hurt anybody's feelings or diminish the effort that
has gone into the GUI projects, manual editing of the file is the
only reasonable way to set this up. Complete information on all of the
tags and their precise meanings and organization would be way more
valuable than a GUI that does not really work all that well. I'm betting
it would take less developer time too.

Heck, if I can get all the data, I will write the doc and put it on the
GFS wiki. That wiki is a great start. Lon, James or anyone - feel free
to send me all conf file info you have and example configs and I will
try to distill it all into something helpful.

> I do want to disagree strongly, however, with your blanket suggestion 
> that this software is not complete, and is not a cluster solution. It is 
> a solution for many, many users...not all of whom are RH customers. It 
> is just not a solution for you, my friend.
> Thanks for your many constructive comments. I hope you keep trying the 
> software - we are here to help as best we can. I haven't given up on you 
> *quite* yet! :)
> -J

As it stands, I've plodded through, and have a 6-node virtual cluster
that spans 2 ESX servers and can stay up to the last man (using a
quorum disk), serving desktops for developers via vnc and xdmcp, and ssh
sessions and nfs mounts - all load balanced through HA directors. Pretty
damn sweet if I do say so myself ;) 
I had to write the fence script for it, and replace nanny because it
continuously segfaulted, but once a GFS Linux cluster is finally
configured and running, it kicks serious butt, and it is indeed worth
it. The thing is, if everyone gave up, where would OpenSource be today?
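
For anyone wondering how "up to the last man" can work with a quorum
disk, here is a rough sketch of the vote arithmetic. It is not taken from
my actual config: it assumes one vote per node, a quorum disk carrying
(nodes - 1) votes, and the usual majority rule of expected_votes / 2 + 1.
The function names are made up for illustration.

```python
def quorum_threshold(expected_votes):
    """Votes needed for quorum, assuming a simple majority rule."""
    return expected_votes // 2 + 1


def is_quorate(live_nodes, total_nodes, qdisk_votes=0, qdisk_alive=True,
               votes_per_node=1):
    """Return True if the surviving members still hold quorum.

    Assumes every node carries votes_per_node votes and the quorum
    disk (if any) contributes qdisk_votes while it is reachable.
    """
    expected = total_nodes * votes_per_node + qdisk_votes
    present = live_nodes * votes_per_node
    if qdisk_alive:
        present += qdisk_votes
    return present >= quorum_threshold(expected)


if __name__ == "__main__":
    nodes = 6
    # Hypothetical sizing: quorum disk gets (nodes - 1) votes.
    for alive in range(nodes, 0, -1):
        with_qdisk = is_quorate(alive, nodes, qdisk_votes=nodes - 1)
        without = is_quorate(alive, nodes)
        print(f"{alive} node(s) up: qdisk={with_qdisk} no-qdisk={without}")
```

Under those assumptions a single surviving node plus the quorum disk
still reaches the threshold, whereas without the disk the same 6-node
cluster loses quorum once only three nodes are left.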


