
RE: [Linux-cluster] DLM locks with 1 node on 2 node cluster



Dave, here is the output:

*** 2 nodes are alive
[root@bof227 ~]# cman_tool status; cman_tool services
Protocol version: 5.0.1
Config version: 84
Cluster name: MZ_CLUSTER
Cluster ID: 18388
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 1
Active subsystems: 5
Node name: bof227
Node ID: 2
Node addresses: 10.14.32.227

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 run       -
[2 1]

DLM Lock Space:  "clvmd"                             2   3 run       -
[2 1]

DLM Lock Space:  "default"                          11  10 run       -
[1 2]

User:            "usrm::manager"                     3   4 run       -
[2 1]
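
As an aside, the "Quorum: 1" above (despite Total_votes: 2) is what the two_node special case produces. A minimal sketch of the quorum arithmetic, assuming CMAN's usual simple-majority rule (cman_quorum is a made-up helper name, not actual cluster code):

```python
def cman_quorum(expected_votes: int, two_node: bool = False) -> int:
    """Sketch of CMAN-style quorum (an assumption, not the real source):
    quorum is a simple majority of expected votes, except that two_node
    mode pins quorum at 1 so either surviving node of a two-node
    cluster stays quorate on its own."""
    if two_node:
        return 1
    return expected_votes // 2 + 1

# Matches the status output above: Expected_votes: 1, Quorum: 1
print(cman_quorum(1, two_node=True))  # 1
# Without two_node, a 2-vote cluster would need both votes present:
print(cman_quorum(2))  # 2
```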

*** 1 node is alive (right after membership change detection)
[root@bof227 ~]# cman_tool status; cman_tool services
Protocol version: 5.0.1
Config version: 84
Cluster name: MZ_CLUSTER
Cluster ID: 18388
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 1
Expected_votes: 1
Total_votes: 1
Quorum: 1
Active subsystems: 5
Node name: bof227
Node ID: 2
Node addresses: 10.14.32.227

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 recover 2 -
[2]

DLM Lock Space:  "clvmd"                             2   3 recover 0 -
[2]

DLM Lock Space:  "default"                          11  10 recover 0 -
[2]

User:            "usrm::manager"                     3   4 recover 0 -
[2]

*** 1 node is alive (3 minutes later)
[root@bof227 ~]# cman_tool status; cman_tool services
Protocol version: 5.0.1
Config version: 84
Cluster name: MZ_CLUSTER
Cluster ID: 18388
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 1
Expected_votes: 1
Total_votes: 1
Quorum: 1
Active subsystems: 5
Node name: bof227
Node ID: 2
Node addresses: 10.14.32.227

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 recover 2 -
[2]

DLM Lock Space:  "clvmd"                             2   3 recover 0 -
[2]

DLM Lock Space:  "default"                          11  10 recover 0 -
[2]

User:            "usrm::manager"                     3   4 recover 0 -
[2]

*** Output of the DLM /proc/cluster*
[root@bof227 cluster]# cat dlm_stats
DLM stats (HZ=1000)

Lock operations:         13
Unlock operations:        9
Convert operations:       0
Completion ASTs:         26
Blocking ASTs:            0

Lockqueue        num  waittime   ave
WAIT_RSB           2         0     0
WAIT_GRANT         7         2     0
WAIT_UNLOCK        3         0     0
Total             12         2     0

[root@bof227 cluster]# cat dlm_debug
default rebuild resource directory
default rebuilt 0 resources
default purge requests
default purged 0 requests
default mark waiting requests
default marked 0 requests
default purge locks of departed nodes
default purged 0 locks
default update remastered resources
default updated 0 resources
default rebuild locks
default rebuilt 0 locks
default recover event 42 done
default move flags 0,0,1 ids 41,42,42
default process held requests
default processed 0 requests
default resend marked requests
default resent 0 requests
default recover event 42 finished
default move flags 1,0,0 ids 42,42,42
default move flags 0,1,0 ids 0,44,0
default move use event 44
default recover event 44 (first)
default add nodes
default total nodes 2
default rebuild resource directory
default rebuilt 1 resources
default recover event 44 done
default move flags 0,0,1 ids 0,44,44
default process held requests
default processed 0 requests
default recover event 44 finished
clvmd move flags 1,0,0 ids 37,37,37
default move flags 1,0,0 ids 44,44,44

[root@bof227 cluster]# cat dlm_dir
[root@bof227 cluster]# cat dlm_locks
[root@bof227 cluster]# cat sm_debug
0100000b sevent state 3
0100000b sevent state 5
0100000b sevent state 7
0100000b sevent state 9
00000001 remove node 1 count 1
0100000b remove node 1 count 1
01000002 remove node 1 count 1
03000003 remove node 1 count 1
00000001 recover state 0
00000001 recover state 1

-----Original Message-----
From: David Teigland [mailto:teigland@redhat.com]
Sent: Monday, August 28, 2006 1:56 PM
To: Zelikov, Mikhail
Cc: linux-cluster@redhat.com
Subject: Re: [Linux-cluster] DLM locks with 1 node on 2 node cluster

On Mon, Aug 28, 2006 at 01:32:43PM -0400, Zelikov_Mikhail@emc.com wrote:
> I checked that I have the following in the cluster.conf: <cman 
> expected_votes="1" two_nodes="1">

In that case, perhaps the remaining node is trying to fence the failed node
and not getting anywhere?  What do 'cman_tool status' and 'cman_tool
services' say after the one node has failed?

Dave
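
For reference, the settings quoted above would normally be declared in cluster.conf roughly like this. This is a minimal sketch only: the attribute is usually spelled two_node (the quote says two_nodes), and the node entries, votes, and fence device below are placeholder assumptions, not taken from this thread:

```xml
<?xml version="1.0"?>
<cluster name="MZ_CLUSTER" config_version="84">
  <!-- two_node="1" lets either node stay quorate alone;
       expected_votes="1" matches the cman_tool output above -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="bof227" votes="1">
      <fence>
        <method name="single">
          <!-- placeholder device; without a working fence agent,
               recovery can block waiting for fencing to complete -->
          <device name="manual" nodename="bof227"/>
        </method>
      </fence>
    </clusternode>
    <!-- second node omitted for brevity -->
  </clusternodes>
  <fencedevices>
    <fencedevice name="manual" agent="fence_manual"/>
  </fencedevices>
</cluster>
```

If fencing really is stuck (the fence domain sitting in "recover" above points that way), the cluster will not finish DLM recovery until the failed node is fenced or the fence operation is acknowledged.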

