Re: [Linux-cluster] Graceful Degradation

--- gordan bobich net wrote:

> Hi,
> I've got most of my cluster pretty much sorted out, apart from kicking
> nodes from the cluster when they fail.
> Is there a way to make the node-kicking automated? I have 4 nodes. They
> are sharing 2 GFS file systems, a root FS and a data FS. If I pull the
> network cable from one of them, or just power it off, the rest of the
> cluster nodes just stop. The only way to get them to start responding
> again is to bring the missing node back, even if there are still enough
> nodes to maintain quorum (3 nodes out of 4).
> Can anyone suggest a way around this? How can I make the 3 remaining nodes
> just kick the missing node out of the cluster and DLM group (possibly
> after some timeout, e.g. 10 seconds) and resume operation until the node
> rejoins?
> This may or may not be related to the fact that I'm running a shared GFS
> root, but any pointers would be welcome.
I think this is question #1 in the FAQ and on this
list :-)

The short answer, and the first place to look, is:
1- fencing not configured, or configured as manual
2- fencing problems, the devices not working as they
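
To illustrate why the surviving nodes can hang even though they still have quorum, here is a rough sketch in Python. It is only a toy model, not cluster code: the function names are made up for this example, and it assumes one vote per node and the usual majority rule (half the expected votes, plus one). The point it shows is that quorum alone is not enough; GFS stays blocked until the failed node has actually been fenced.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for quorum, assuming a simple majority rule."""
    return expected_votes // 2 + 1


def cluster_can_resume(expected_votes: int, live_votes: int,
                       failed_node_fenced: bool) -> bool:
    """Toy model: the cluster resumes only if it is quorate AND the
    failed node has been fenced successfully (hypothetical helper)."""
    quorate = live_votes >= quorum_threshold(expected_votes)
    return quorate and failed_node_fenced


if __name__ == "__main__":
    # 4 nodes with one vote each, one node lost.
    print(quorum_threshold(4))
    print(cluster_can_resume(4, 3, failed_node_fenced=False))
    print(cluster_can_resume(4, 3, failed_node_fenced=True))
```

With 4 nodes and one lost, the threshold is 3, so the remaining 3 nodes are quorate, but in this model they only resume once the fence is reported as done. That matches the symptom described: with fencing missing, manual, or failing, nothing moves until the node comes back or the fence is acknowledged.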


RedHat Certified ( RHCE )
Cisco Certified ( CCNA & CCDA )
