
Re: [Linux-cluster] how to recover from process_recovery_barrier status=-104



On Thu, Jul 21, 2005 at 11:51:21PM -0400, Dan B. Phung wrote:
> My cluster went down pretty hard, in that I had to hard reboot several
> machines, and now the fence daemon won't come up.  I run:
> 
> $ ccsd && cman_tool join -w 
> $ fence_tool join -w -j 15 -D
> blade02:~ # fence_tool join -w -D -j 15
> fence_tool: wait for quorum 1
> fence_tool: get our node name
> fence_tool: connect to ccs
> fence_tool: start fenced
> fenced: 1122003465 our name from cman "blade02"

This is inconsistent with the data below, which shows that blade01 is a
cluster member, not blade02.  Maybe you collected the other data before
blade02 joined the cluster...
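
A quick way to see which nodes cman currently counts as members is to
filter the `cman_tool nodes` table on its Sts column.  This is only a
sketch: the helper name cluster_members is made up for illustration, and
it assumes the five-column layout shown in this thread, with "M" marking
a member.

```shell
# Read `cman_tool nodes` output on stdin and print the name of every
# node whose Sts column is "M" (member).
# Assumes the layout quoted in this thread: Node Votes Exp Sts Name.
cluster_members() {
    # Only rows that start with a numeric node ID are data rows;
    # this skips the header line.
    awk '$1 ~ /^[0-9]+$/ && $4 == "M" { print $5 }'
}
```

Usage would be something like `cman_tool nodes | cluster_members`, run on
any node that is itself in the cluster.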

> blade13:~ # cman_tool nodes
> Node  Votes Exp Sts  Name
>    1    1    1   M   blade01
>    2    1    1   X   blade02
>    3    1    1   X   blade03
>    4    1    1   X   blade04
>    6    1    1   X   blade06
>    7    1    1   X   blade07
>    8    1    1   X   blade08
>    9    1    1   X   blade09
>   10    1    1   X   blade10
>   11    1    1   X   blade11
>   12    1    1   X   blade12
>   13    1    1   M   blade13
>   14    1    1   X   blade14
> 
> blade13:~ # cman_tool status
> Protocol version: 5.0.1
> Config version: 1
> Cluster name: blade_cluster
> Cluster ID: 38068
> Cluster Member: Yes
> Membership state: Cluster-Member
> Nodes: 2
> Expected_votes: 1
> Total_votes: 2
> Quorum: 2   
> Active subsystems: 6
> Node name: blade13
> 
> blade13:~ # cman_tool services
> Service          Name                              GID LID State     Code
> Fence Domain:    "default"                           1   2 recover 2 -
> [13]

It looks like blade13 is trying to fence some node.  blade13 won't let
anyone else join the fence domain until it has completed the fencing; this
is probably why fenced on blade02 isn't getting anywhere.
/var/log/messages on blade13 should show whether, and where, a fencing
operation is incomplete.
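
To pull the relevant entries out of the log, something like the
following may help.  It is a rough sketch, not a definitive recipe: the
function name recent_fenced_lines is hypothetical, and it assumes fenced
writes to syslog under the tag "fenced" (with or without a PID in
brackets) and that syslog lands in /var/log/messages on your setup.

```shell
# Print the most recent fenced entries from a syslog file.
#   $1  log file to search (default: /var/log/messages)
#   $2  number of trailing matches to show (default: 20)
# Assumes fenced logs with the syslog tag "fenced" or "fenced[PID]".
recent_fenced_lines() {
    logfile="${1:-/var/log/messages}"
    count="${2:-20}"
    grep -E 'fenced(\[[0-9]+\])?:' "$logfile" | tail -n "$count"
}
```

Run `recent_fenced_lines` as root on blade13 and look at the last
entries that mention a node name; those should indicate which node the
fence domain is still waiting on.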

Dave

