
[Linux-cluster] GFS2 panic on current release



Hi All,

A few weeks ago I discovered that I'd had an obsolete gfs2 kernel module loaded. I removed it, which brought gfs2 up to the revision included in the current kernel. I was hoping that all was well, but yesterday morning one of the nodes panicked as follows:

original: gfs2_rename+0x19d/0x63b [gfs2]
pid : 12810
lock type: 3 req lock state : 1
new: gfs2_rlist_alloc+0x5c/0x6a [gfs2]
pid: 12810
lock type: 3 req lock state : 1
 G:  s:EX n:3/33d0327 f:y t:EX d:EX/0 l:0 a:5 r:4
  H: s:EX f:H e:0 p:12810 [imap] gfs2_rename+0x19d/0x63b [gfs2]
  R: n:54330151 f:05 b:274/274 i:1121
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/gfs2/glock.c:1074
invalid opcode: 0000 [1] SMP
last sysfs file: /devices/pci0000:00/0000:00:0a.0/0000:02:02.0/irq
CPU 1
Modules linked in: nfs fscache nfs_acl lock_dlm gfs2 dlm configfs lockd sunrpc ipv6 xfrm_nalgo crypto_api ipt_LOG xt_state ip_conntrack nfnetlink xt_tcpudp iptable_filter ip_tables x_tables 8021q dm_multipath scsi_dh video backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport i2c_amd756 k8temp ide_cd i2c_core hwmon sg amd_rng cdrom k8_edac pcspkr tg3 floppy edac_mc e1000 dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc shpchp mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 12810, comm: imap Not tainted 2.6.18-164.6.1.el5 #1
RIP: 0010:[<ffffffff8862a6df>] [<ffffffff8862a6df>] :gfs2:gfs2_glock_nq+0x231/0x273
RSP: 0018:ffff8101ba8d9868  EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff8101ba8d9cb0 RCX: 0000000000000461
RDX: ffff8101ffe27a98 RSI: ffffffff80309c28 RDI: ffffffff80309c20
RBP: ffff8101860b1340 R08: ffffffff80309c28 R09: 000000000000003f
R10: ffff8101ba8d9368 R11: 0000000000000000 R12: ffff8100e87ea590
R13: ffff8100e87ea590 R14: ffff8100ed24e000 R15: 0000000000000000
FS:  00002b18a78ac530(0000) GS:ffff810103901940(0000) knlGS:00000000acbfbb90
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b70cf5cf000 CR3: 00000001b4d4a000 CR4: 00000000000006e0
Process imap (pid: 12810, threadinfo ffff8101ba8d8000, task ffff8101ffe277e0)
Stack:  ffff8101860b1340 0000000000000001 ffff8100b3e1b000 ffff8100b3e1a0e8
 0000000000000000 ffffffff8862a74e 0000000000000038 ffff810184e88368
 0000000000000001 ffffffff800caa0b 0000000000000005 ffff810184e88368
Call Trace:
 [<ffffffff8862a74e>] :gfs2:gfs2_glock_nq_m+0x2d/0xf4
 [<ffffffff800caa0b>] __kzalloc+0x9/0x21
 [<ffffffff88622831>] :gfs2:do_strip+0x175/0x349
 [<ffffffff886217e2>] :gfs2:recursive_scan+0xf2/0x175
 [<ffffffff886218fe>] :gfs2:trunc_dealloc+0x99/0xe7
 [<ffffffff886226bc>] :gfs2:do_strip+0x0/0x349
 [<ffffffff80090000>] sched_exit+0xb4/0xb5
 [<ffffffff88638dda>] :gfs2:gfs2_delete_inode+0xdd/0x191
 [<ffffffff88638d43>] :gfs2:gfs2_delete_inode+0x46/0x191
 [<ffffffff88628e77>] :gfs2:gfs2_glock_schedule_for_reclaim+0x5d/0x9a
 [<ffffffff88638cfd>] :gfs2:gfs2_delete_inode+0x0/0x191
 [<ffffffff8002f48f>] generic_delete_inode+0xc6/0x143
 [<ffffffff8863d9a4>] :gfs2:gfs2_inplace_reserve_i+0x63b/0x691
 [<ffffffff886248c4>] :gfs2:gfs2_dirent_find_space+0x0/0x41
 [<ffffffff88623983>] :gfs2:gfs2_dirent_search+0x147/0x16e
 [<ffffffff886377c5>] :gfs2:gfs2_rename+0x3be/0x63b
 [<ffffffff88637506>] :gfs2:gfs2_rename+0xff/0x63b
 [<ffffffff8863754c>] :gfs2:gfs2_rename+0x145/0x63b
 [<ffffffff88637571>] :gfs2:gfs2_rename+0x16a/0x63b
 [<ffffffff886375a4>] :gfs2:gfs2_rename+0x19d/0x63b
 [<ffffffff88629e29>] :gfs2:gfs2_holder_uninit+0xd/0x1f
 [<ffffffff886385bf>] :gfs2:gfs2_permission+0xaf/0xd4
 [<ffffffff88633124>] :gfs2:gfs2_drevalidate+0x158/0x214
 [<ffffffff8000d902>] permission+0x81/0xc8
 [<ffffffff8002a7d9>] vfs_rename+0x2f4/0x471
 [<ffffffff80036c20>] sys_renameat+0x180/0x1eb
 [<ffffffff800b66f5>] audit_syscall_entry+0x180/0x1b3
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 0f 0b 68 f8 27 64 88 c2 32 04 be 01 00 00 00 4c 89 ef e8 df
RIP  [<ffffffff8862a6df>] :gfs2:gfs2_glock_nq+0x231/0x273
 RSP <ffff8101ba8d9868>
<0>Kernel panic - not syncing: Fatal exception
 Killed by signal 15.

It seems possible that running the old code caused some filesystem damage, so I'm going to fsck this weekend, but I wanted to post this in case it reveals an obvious problem to anyone. The "invalid opcode: 0000" makes me think we ended up executing code that was actually data, but beyond that I'm clueless.
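
One way to test the "code that was actually data" guess is to decode the bytes on the "Code:" line. The sketch below assumes the x86_64 BUG() layout used by kernels of this era: a ud2 instruction (0f 0b), then a push of the file-name pointer (68 + 4 bytes), then a ret whose 16-bit immediate is the source line number (c2 + 2 bytes). The function name decode_bug_bytes is made up for illustration and is not part of any kernel tool.

```python
def decode_bug_bytes(code_line):
    """Decode an oops 'Code:' line, assuming the old x86_64 BUG() layout.

    Returns None if the bytes do not start with ud2 (0f 0b); otherwise a
    dict with the embedded line number and file-name pointer (or None for
    each if the trailing push/ret pattern is absent).
    """
    hex_part = code_line.split(":", 1)[1]
    # Some kernels mark the faulting byte as <0f>; drop the brackets.
    hex_part = hex_part.replace("<", "").replace(">", "")
    raw = bytes.fromhex(hex_part)

    if raw[:2] != b"\x0f\x0b":
        return None  # not ud2, so not a BUG() site under this assumption

    result = {"ud2": True, "line": None, "file_ptr": None}
    # Assumed layout: 68 <imm32 file pointer> c2 <imm16 line number>
    if len(raw) >= 10 and raw[2] == 0x68 and raw[7] == 0xC2:
        result["file_ptr"] = int.from_bytes(raw[3:7], "little")
        result["line"] = int.from_bytes(raw[8:10], "little")
    return result


info = decode_bug_bytes(
    "Code: 0f 0b 68 f8 27 64 88 c2 32 04 be 01 00 00 00 4c 89 ef e8 df"
)
print(info["line"])  # 1074
```

For the trace above this yields line 1074, which matches "Kernel BUG at fs/gfs2/glock.c:1074". If the layout assumption holds, that points to a deliberate assertion firing in gfs2_glock_nq rather than the CPU running stray data.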

Thanks,
Allen

--
Allen Belletti
allen isye gatech edu                             404-894-6221 Phone
Industrial and Systems Engineering                404-385-2988 Fax
Georgia Institute of Technology

