[Linux-cluster] segfault during cman_tool services
Dan B. Phung
phung at cs.columbia.edu
Thu May 19 17:48:20 UTC 2005
I've been having some problems with my fs where a node
will mysteriously be removed from the cluster, even though
the node is still up. Here's what I see in syslog:
CMAN: node blade11 has been removed from the cluster : No response to messages
CMAN: killed by NODEDOWN message
CMAN: we are leaving the cluster. No response to messages
dlm: proj_lv: restbl_rsb_update failed -105
dlm: home_lv: rebuild_rsbs_send failed -105
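For reference, the -105 in the two dlm lines looks like a negated kernel errno value. Below is a minimal sketch for decoding such codes, assuming the standard Linux errno numbering. The table and the helper name decode_errno are only illustrative and cover just a handful of codes:

```python
# Hypothetical helper: map negative kernel return codes to errno names.
# The table is hard-coded with Linux values (an assumption about the
# platform that produced the log), so the result does not depend on the
# machine running this script.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    12: ("ENOMEM", "Cannot allocate memory"),
    22: ("EINVAL", "Invalid argument"),
    105: ("ENOBUFS", "No buffer space available"),
}


def decode_errno(code):
    """Return a readable string for a (usually negative) kernel error code."""
    name, text = LINUX_ERRNO.get(abs(code), ("UNKNOWN", "not in table"))
    return f"{code}: {name} ({text})"


if __name__ == "__main__":
    # The value reported by restbl_rsb_update and rebuild_rsbs_send above.
    print(decode_errno(-105))
```

Under that assumption, -105 would be ENOBUFS, which would fit a node that can no longer send messages after leaving the cluster.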
So from blade11, I try to see what's going on, and when I do:
> cman_tool services
I get the fun pasted at the end of the message. A while back I noticed
there were some code updates/patches, but I don't know where to find the
"Changes". Would a cvs update on the sources help? Let me know if
you need more info on the system I'm running.
regards,
dan
--
lock_dlm: Assertion failed on line 353 of file /usr/src/cluster-2.6.8.1/gfs-kernel/src/dlm/lock.c
lock_dlm: assertion: "!error"
lock_dlm: time = 80864198
proj_lv: error=-22 num=2,1a lkf=10000 flags=84
------------[ cut here ]------------
kernel BUG at /usr/src/cluster-2.6.8.1/gfs-kernel/src/dlm/lock.c:353!
invalid operand: 0000 [#1]
Modules linked in: ipv6 evdev pcspkr psmouse sworks_agp agpgart ohci_hcd
usbcore tg3 firmware_class lock_dlm dlm cman gfs lock_harness dm_mod
qla2300 qla2xxx scsi_transport_fc sg sr_mod sd_mod scsi_mod ide_cd cdrom
genrtc ext3 jbd mbcache ide_generic via82cxxx trm290 triflex slc90e66
sis5513 siimage serverworks sc1200 rz1000 piix pdc202xx_old pdc202xx_new
opti621 ns87415 hpt366 ide_disk hpt34x generic cy82c693 cs5530 cs5520
cmd64x atiixp amd74xx alim15x3 aec62xx ide_core unix
CPU: 0
EIP: 0060:[<f89e6b46>] Tainted: GF
EFLAGS: 00010286 (2.6.8.1)
EIP is at do_dlm_unlock+0x106/0x120 [lock_dlm]
eax: 00000001 ebx: ffffffea ecx: c02b4870 edx: 000053ec
esi: f432aa00 edi: f8b301c0 ebp: f43ce000 esp: f43cfedc
ds: 007b es: 007b ss: 0068
Process gfs_glockd (pid: 2174, threadinfo=f43ce000 task=f43b4dd0)
Stack: f89ed876 f431cde0 ffffffea 00000002 0000001a 00000000 00010000 00000084
       f8ba8000 f8ba8000 f89e6eef f432aa00 f8b01718 f432aa00 00000003 f431dbd0
       f8af51f9 f8ba8000 f432aa00 00000003 00000000 f8bb83f4 f8b301c0 00000000
Call Trace:
[<f89e6eef>] lm_dlm_unlock+0x1f/0x30 [lock_dlm]
[<f8b01718>] gfs_lm_unlock+0x38/0x60 [gfs]
[<f8af51f9>] gfs_glock_drop_th+0x69/0x1a0 [gfs]
[<f8af4588>] rq_demote+0x98/0xb0 [gfs]
[<f8af468c>] run_queue+0xac/0xe0 [gfs]
[<f8af6ed4>] demote_ok+0x74/0x80 [gfs]
[<f8af700d>] gfs_reclaim_glock+0x7d/0x130 [gfs]
[<f8ae80ca>] gfs_glockd+0x10a/0x120 [gfs]
[<c0115950>] default_wake_function+0x0/0x20
[<c0105d72>] ret_from_fork+0x6/0x14
[<c0115950>] default_wake_function+0x0/0x20
[<f8ae7fc0>] gfs_glockd+0x0/0x120 [gfs]
[<c01042ad>] kernel_thread_helper+0x5/0x18
Code: 0f 0b 61 01 e0 c8 9e f8 c7 04 24 20 c9 9e f8 e8 16 18 73 c7
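
A note on reading the dump above: entries such as do_dlm_unlock+0x106/0x120 follow the usual symbol+offset/size form, and ebx (ffffffea) read as a signed 32-bit value matches the error=-22 printed by lock_dlm. Here is a rough sketch of both conversions. The helper names to_signed32 and parse_frame are hypothetical, and the regular expression is only assumed to cover the line shapes seen in this trace:

```python
import re


def to_signed32(hex_word):
    """Interpret an 8-digit hex register value as a signed 32-bit integer."""
    value = int(hex_word, 16)
    return value - (1 << 32) if value & 0x80000000 else value


# Matches "symbol+0xOFFSET/0xSIZE" with an optional trailing "[module]".
FRAME_RE = re.compile(
    r"(?P<symbol>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_frame(line):
    """Split one oops line into symbol, offset, size and module (or None)."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "symbol": m["symbol"],
        "offset": int(m["offset"], 16),
        "size": int(m["size"], 16),
        "module": m["module"],
    }


if __name__ == "__main__":
    print(to_signed32("ffffffea"))
    print(parse_frame("EIP is at do_dlm_unlock+0x106/0x120 [lock_dlm]"))
```

The offset tells how far into the function the faulting instruction sits, which helps when matching the trace against a disassembly of the lock_dlm module.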