[Linux-cluster] Re: [Cluster-devel] Bug on dlm

Jordi Prats jprats at cesca.es
Fri Sep 28 13:37:31 UTC 2007


Hi,
Could this bug be causing the following?


[root@inf17 ~]# clustat
Member Status: Inquorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  inf17                                 1 Online, Local
  inf18                                 2 Offline
  inf19                                 3 Offline


[root@inf18 ~]# clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  inf17                                 1 Online
  inf18                                 2 Online, Local
  inf19                                 3 Offline


[root@inf17 ~]# group_tool
type             level name       id       state
fence            0     default    00010001 JOIN_START_WAIT
[1]
dlm              1     rgmanager  00020001 JOIN_ALL_STOPPED
[1]

[root@inf18 ~]# group_tool
type             level name       id       state
fence            0     default    00000000 JOIN_STOP_WAIT
[1 2]
dlm              1     rgmanager  00010002 JOIN_START_WAIT
[2]

[root@inf17 ~]# cman_tool status
Version: 6.0.1
Config Version: 4
Cluster Name: boumort
Cluster Id: 13356
Cluster Member: Yes
Cluster Generation: 3824
Membership state: Cluster-Member
Nodes: 1
Expected votes: 2
Total votes: 1
Quorum: 2 Activity blocked
Active subsystems: 7
Flags:
Ports Bound: 0
Node name: inf17
Node ID: 1
Multicast addresses: 239.192.52.96
Node addresses: 192.168.22.17


[root@inf18 ~]# cman_tool status
Version: 6.0.1
Config Version: 4
Cluster Name: boumort
Cluster Id: 13356
Cluster Member: Yes
Cluster Generation: 3820
Membership state: Cluster-Member
Nodes: 2
Expected votes: 2
Total votes: 2
Quorum: 2
Active subsystems: 7
Flags:
Ports Bound: 0 177
Node name: inf18
Node ID: 2
Multicast addresses: 239.192.52.96
Node addresses: 192.168.22.18
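
The two cman_tool status outputs above differ mainly in their vote counts. As a rough illustration of why inf17 reports "Activity blocked" while inf18 is quorate, here is a minimal Python sketch. It assumes a simple majority rule (quorum = expected votes // 2 + 1), which matches the "Quorum: 2" shown for "Expected votes: 2". The helper names (parse_status, majority_quorum, is_quorate) are hypothetical and not part of any cluster tool.

```python
def parse_status(text):
    """Parse 'Key: value' lines (as printed by cman_tool status) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def majority_quorum(expected_votes):
    # Assumption: quorum is a simple majority of the expected votes.
    return expected_votes // 2 + 1


def is_quorate(fields):
    total = int(fields["Total votes"])
    # The Quorum field may carry a trailing note such as "Activity blocked",
    # so only the leading number is used.
    quorum = int(fields["Quorum"].split()[0])
    return total >= quorum


# Excerpts copied from the two outputs above.
INF17 = """Nodes: 1
Expected votes: 2
Total votes: 1
Quorum: 2 Activity blocked
"""

INF18 = """Nodes: 2
Expected votes: 2
Total votes: 2
Quorum: 2
"""

inf17 = parse_status(INF17)
inf18 = parse_status(INF18)

print(majority_quorum(int(inf17["Expected votes"])))  # 2
print(is_quorate(inf17), is_quorate(inf18))           # False True
```

In other words, inf17 only sees its own vote, while inf18 counts both inf17 and itself, so the two nodes disagree about membership.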


Patrick Caulfield wrote:
> Jordi Prats wrote:
>> Hi,
>> I found this while starting my server. It's an F7 with the latest
>> version available.
>>
>> Hope this helps :)
>>
>> Jordi
>>
>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: recover 1
>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: add member 2
>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: total members 1 error 0
>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: dlm_recover_directory
>> Jul 26 23:52:51 inf18 kernel: dlm: rgmanager: dlm_recover_directory 0 entries
>> Jul 26 23:52:51 inf18 kernel:
>> Jul 26 23:52:51 inf18 kernel: =====================================
>> Jul 26 23:52:51 inf18 kernel: [ BUG: bad unlock balance detected! ]
>> Jul 26 23:52:51 inf18 kernel: -------------------------------------
>> Jul 26 23:52:51 inf18 kernel: dlm_recoverd/2963 is trying to release lock (&ls->ls_in_recovery) at:
>> Jul 26 23:52:51 inf18 kernel: [<ee67b874>] dlm_recoverd+0x265/0x433 [dlm]
>> Jul 26 23:52:51 inf18 kernel: but there are no more locks to release!
>> Jul 26 23:52:51 inf18 kernel:
> 
> Yeah, we know about it. It's not actually a bug, just the lockdep checking code
> being a little over-enthusiastic. Unfortunately there aren't any annotations
> available to make it quiet either.
> 
> The trick is to live with it, or to use kernels that have a little less
> debugging compiled in, which you would want to do for production anyway :)
> 
> 
> Patrick
> 
> 
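
For anyone who wants to check whether a kernel was built with the lock debugging Patrick mentions, here is a minimal Python sketch that scans kernel config text for lockdep-related options. The option names are real Kconfig symbols, but the selection of which ones to look for is an assumption, as is the helper name enabled_lock_debug_options. The sample text is made up for illustration and is not taken from the F7 kernel.

```python
import re

# Assumption: these are the lock-debugging options of interest.
LOCK_DEBUG_OPTIONS = (
    "CONFIG_PROVE_LOCKING",
    "CONFIG_LOCKDEP",
    "CONFIG_DEBUG_LOCK_ALLOC",
    "CONFIG_LOCK_STAT",
)


def enabled_lock_debug_options(config_text):
    """Return the lock-debugging options set to y or m in the config text."""
    enabled = []
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(y|m)$", line.strip())
        if match and match.group(1) in LOCK_DEBUG_OPTIONS:
            enabled.append(match.group(1))
    return enabled


# Illustrative sample only; a real config would be read from a file.
SAMPLE = """CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
# CONFIG_LOCK_STAT is not set
CONFIG_DLM=m
"""

print(enabled_lock_debug_options(SAMPLE))
# ['CONFIG_PROVE_LOCKING', 'CONFIG_LOCKDEP']
```

On many distributions the config of the running kernel is stored under /boot as config-<release>, though the exact location varies; the contents of that file can be passed to the function in place of SAMPLE. An empty result suggests the "bad unlock balance" check is not compiled in.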



