[Linux-cluster] kernel panic - help!

German Staltari gstaltari at arnet.net.ar
Wed Jun 21 19:50:07 UTC 2006


David Teigland wrote:
> On Wed, Jun 21, 2006 at 03:41:58PM -0300, German Staltari wrote:
>   
>> Jun 21 14:59:17 qmail-be-04 kernel: CMAN: removing node qmail-be-02 from 
>> the cluster : Missed too many heartbeats
>> Jun 21 14:59:23 qmail-be-04 kernel: CMAN: removing node qmail-be-01 from 
>> the cluster : No response to messages
>> Jun 21 14:59:29 qmail-be-04 kernel: CMAN: removing node qmail-be-06 from 
>> the cluster : No response to messages
>> Jun 21 14:59:39 qmail-be-04 kernel: CMAN: removing node qmail-be-03 from 
>> the cluster : No response to messages
>> Jun 21 14:59:46 qmail-be-04 kernel: CMAN: removing node qmail-be-05 from 
>> the cluster : No response to messages
>> Jun 21 14:59:52 qmail-be-04 kernel: CMAN: quorum lost, blocking activity
>> Jun 21 14:59:52 qmail-be-04 kernel: CMAN: node qmail-be-04 has been 
>> removed from the cluster : No response to messages
>> Jun 21 14:59:52 qmail-be-04 kernel: CMAN: killed by NODEDOWN message
>> Jun 21 14:59:52 qmail-be-04 kernel: CMAN: we are leaving the cluster. No 
>> response to messages
>>     
>
> This is what led to the gfs panic, the cluster shut down when it lost
> contact with all the other nodes.
>
> Dave
>
>   
OK, but this node lost contact with the cluster because all the other 
nodes got the same panic at the same time.
We had another panic a few minutes ago, the 3rd panic today, with the 
same log output:

Jun 21 16:13:55 qmail-be-01 kernel: lock_dlm:  Assertion failed on line 
357 of file /soft/kernel/cluster-1.02.00/gfs-kernel/src/dlm/lock.c
Jun 21 16:13:55 qmail-be-01 kernel: lock_dlm:  assertion:  "!error"
Jun 21 16:13:55 qmail-be-01 kernel: lock_dlm:  time = 951351
Jun 21 16:13:55 qmail-be-01 kernel: mstore008-004: error=-22 
num=2,75c6db lkf=10000 flags=84
Jun 21 16:13:55 qmail-be-01 kernel:
Jun 21 16:13:55 qmail-be-01 kernel: ------------[ cut here ]------------
Jun 21 16:13:55 qmail-be-01 kernel: kernel BUG at 
/soft/kernel/cluster-1.02.00/gfs-kernel/src/dlm/lock.c:357!
Jun 21 16:13:55 qmail-be-01 kernel: invalid opcode: 0000 [#1]
Jun 21 16:13:55 qmail-be-01 kernel: SMP
Jun 21 16:13:55 qmail-be-01 kernel: Modules linked in: nfsd exportfs 
lockd nfs_acl sunrpc gfs lock_dlm lock_harness dlm cman dm_round_robin 
dm_multipath ipv6 ohci_hcd i2c_piix4 i2c_core e1000 sg ext3 jbd dm_mod 
qla2300 qla2xxx scsi_transport_fc mptspi mptscsih mptbase sd_mod scsi_mod
Jun 21 16:13:55 qmail-be-01 kernel: CPU:    6
Jun 21 16:13:55 qmail-be-01 kernel: EIP:    0060:[<f90254d8>]    
Tainted: GF     VLI
Jun 21 16:13:55 qmail-be-01 kernel: EFLAGS: 00010296   (2.6.16.11-gds #1)
Jun 21 16:13:55 qmail-be-01 kernel: EIP is at do_dlm_unlock+0xd1/0xe5 
[lock_dlm]
Jun 21 16:13:55 qmail-be-01 kernel: eax: 00000004   ebx: 00000084   ecx: 
ffffebd8   edx: 00000000
Jun 21 16:13:55 qmail-be-01 kernel: esi: 00010000   edi: ffffffea   ebp: 
ca4265c0   esp: d741eef4
Jun 21 16:13:56 qmail-be-01 kernel: ds: 007b   es: 007b   ss: 0068
Jun 21 16:13:56 qmail-be-01 kernel: Process gfs_glockd (pid: 1061, 
threadinfo=d741e000 task=d6b40550)
Jun 21 16:13:56 qmail-be-01 kernel: Stack: <0>f902b673 f53267e0 ffffffea 
00000002 0075c6db 00000000 00010000 00000084
Jun 21 16:13:56 qmail-be-01 kernel:        00000002 f9732000 00000003 
ca4265c0 cec8a4ac f902552e f905e6b5 cec8a4dc
Jun 21 16:13:56 qmail-be-01 kernel:        cec8a4c8 cec8a4dc f9055f02 
00000296 000000d0 f9732000 f9089ee0 c539f9c0
Jun 21 16:13:56 qmail-be-01 kernel: Call Trace:
Jun 21 16:13:56 qmail-be-01 kernel:  [<f902552e>] 
lm_dlm_unlock+0x14/0x1c [lock_dlm]
Jun 21 16:13:56 qmail-be-01 kernel:  [<f905e6b5>] 
gfs_lm_unlock+0x2c/0x47 [gfs]
Jun 21 16:13:56 qmail-be-01 kernel:  [<f9055f02>] 
gfs_glock_drop_th+0x84/0x182 [gfs]
Jun 21 16:13:56 qmail-be-01 kernel:  [<f9054817>] run_queue+0x348/0x374 
[gfs]
Jun 21 16:13:56 qmail-be-01 kernel:  [<f90541a4>] 
handle_callback+0xe6/0x120 [gfs]
Jun 21 16:13:56 qmail-be-01 kernel:  [<f905485e>] 
unlock_on_glock+0x1b/0x24 [gfs]
Jun 21 16:13:56 qmail-be-01 kernel:  [<f905441b>] 
gfs_reclaim_glock+0xbc/0x170 [gfs]
Jun 21 16:13:56 qmail-be-01 kernel:  [<c031db3e>] _spin_lock_irqsave+0x9/0xd
Jun 21 16:13:56 qmail-be-01 kernel:  [<f9047bca>] gfs_glockd+0xda/0xff [gfs]
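
For reading the trace: the error=-22 in the lock_dlm line is a negative errno, and the ffffffea that shows up in edi and on the stack is the same number as a signed 32-bit value. Below is a minimal sketch for decoding both, assuming standard Linux errno numbering. The helper names decode_lock_dlm_error and as_signed32 are made up for this example and are not part of cman, dlm or gfs. It only names the error code; it does not say why the unlock request was rejected.

```python
import errno
import re

# Line copied from the log above, re-joined after mail wrapping.
line = ("Jun 21 16:13:55 qmail-be-01 kernel: mstore008-004: error=-22 "
        "num=2,75c6db lkf=10000 flags=84")


def decode_lock_dlm_error(log_line):
    """Return (value, errno name) for the error= field, or None if absent.

    Hypothetical helper: assumes the field holds a negative errno.
    """
    m = re.search(r"error=(-?\d+)", log_line)
    if m is None:
        return None
    value = int(m.group(1))
    return value, errno.errorcode.get(-value, "unknown")


def as_signed32(hex_word):
    """Interpret a 32-bit word from the register dump as a signed integer."""
    n = int(hex_word, 16)
    return n - (1 << 32) if n & 0x80000000 else n


print(decode_lock_dlm_error(line))  # (-22, 'EINVAL')
print(as_signed32("ffffffea"))      # -22
```

So the assertion "!error" on line 357 of lock.c fired with error set to -EINVAL.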



