
[Linux-cluster] GFS2 crash



Hi all,
We just had a crash on our 3-node Red Hat Enterprise Linux 5.4 cluster that looks a lot like https://bugzilla.redhat.com/show_bug.cgi?id=520720. We're running kernel 2.6.18-164.11.1.el5. Here is the traceback:

[2010-03-03 19:18:27]Unable to handle kernel NULL pointer dereference at 0000000000000078 RIP:
[2010-03-03 19:18:27] [<ffffffff88572766>] :gfs2:revoke_lo_add+0x1a/0x32
[2010-03-03 19:18:27]PGD 0
[2010-03-03 19:18:27]Oops: 0002 [1] SMP
[2010-03-03 19:18:27]last sysfs file: /devices/pci0000:00/0000:00:06.0/0000:0b:00.0/0000:0c:09.0/0000:0d:00.0/host0/rport-0:0-4/target0:0:3/0:0:3:2/state
[2010-03-03 19:18:27]CPU 13
[2010-03-03 19:18:27]Modules linked in: ipt_MASQUERADE iptable_nat ip_nat bridge autofs4 hidp l2cap bluetooth lock_dlm gfs2(U) dlm configfs lockd sunrpc ip_conntrack_netbios_ns xt_state ip_conntrack nfnetlink xt_tcpudp ipt_REJECT iptable_filter ip_tables arpt_mangle arptable_filter arp_tables x_tables dm_round_robin dm_multipath scsi_dh video hwmon backlight sbs i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sg serio_raw pcspkr bnx2 ide_cd hpilo cdrom dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc ata_piix libata shpchp cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
[2010-03-03 19:18:28]Pid: 792, comm: kswapd0 Tainted: G 2.6.18-164.11.1.el5 #1
[2010-03-03 19:18:28]RIP: 0010:[<ffffffff88572766>] [<ffffffff88572766>] :gfs2:revoke_lo_add+0x1a/0x32
[2010-03-03 19:18:28]RSP: 0018:ffff81082ef61ae8  EFLAGS: 00010282
[2010-03-03 19:18:28]RAX: 0000000000000000 RBX: ffff81072a4b3610 RCX: ffff8103a31d78a0
[2010-03-03 19:18:28]RDX: ffff8107768b63f0 RSI: ffff8108172e17c0 RDI: ffff8108172e1000
[2010-03-03 19:18:28]RBP: ffff8107768b63d0 R08: ffff81082fc7ef06 R09: ffff81082ef61b20
[2010-03-03 19:18:28]R10: ffff810119d7cc18 R11: ffffffff8857274c R12: ffff8108172e1000
[2010-03-03 19:18:29]R13: 0000000000000000 R14: ffff81072a4b3610 R15: ffff8108172e1000
[2010-03-03 19:18:29]FS: 0000000000000000(0000) GS:ffff81082fc7edc0(0000) knlGS:0000000000000000
[2010-03-03 19:18:29]CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[2010-03-03 19:18:29]CR2: 0000000000000078 CR3: 00000006cdbe7000 CR4: 00000000000006e0
[2010-03-03 19:18:29]Process kswapd0 (pid: 792, threadinfo ffff81082ef60000, task ffff81082f5407e0)
[2010-03-03 19:18:29]Stack: ffffffff88573bfb 000000002ef61e10 ffff81072a4b3610 ffff81010962e028
[2010-03-03 19:18:29] 0000000000000000 0000000000000000 ffffffff88574ee0 000000000000000e
[2010-03-03 19:18:29] ffff81010962e028 0000000000413ac8 ffff81082ef61cf0 ffff8108172e1000
[2010-03-03 19:18:29]Call Trace:
[2010-03-03 19:18:29] [<ffffffff88573bfb>] :gfs2:gfs2_remove_from_journal+0x11a/0x12c
[2010-03-03 19:18:29] [<ffffffff88574ee0>] :gfs2:gfs2_invalidatepage+0xea/0x151
[2010-03-03 19:18:29] [<ffffffff88574c45>] :gfs2:gfs2_writepage_common+0x95/0xb1
[2010-03-03 19:18:29] [<ffffffff88575129>] :gfs2:gfs2_jdata_writepage+0x2a/0xa0
[2010-03-03 19:18:29] [<ffffffff800ca21c>] shrink_inactive_list+0x3fd/0x8d8
[2010-03-03 19:18:29] [<ffffffff8004819b>] __pagevec_release+0x19/0x22
[2010-03-03 19:18:29] [<ffffffff800c9cfe>] shrink_active_list+0x4b4/0x4c4
[2010-03-03 19:18:30] [<ffffffff80013007>] shrink_zone+0xf7/0x15d
[2010-03-03 19:18:30] [<ffffffff80057e41>] kswapd+0x323/0x46c
[2010-03-03 19:18:30] [<ffffffff800a00b7>] autoremove_wake_function+0x0/0x2e
[2010-03-03 19:18:30] [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
[2010-03-03 19:18:30] [<ffffffff80057b1e>] kswapd+0x0/0x46c
[2010-03-03 19:18:30] [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
[2010-03-03 19:18:30] [<ffffffff80032950>] kthread+0xfe/0x132
[2010-03-03 19:18:30] [<ffffffff8009cd34>] request_module+0x0/0x14d
[2010-03-03 19:18:30] [<ffffffff8005dfb1>] child_rip+0xa/0x11
[2010-03-03 19:18:30] [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
[2010-03-03 19:18:30] [<ffffffff80032852>] kthread+0x0/0x132
[2010-03-03 19:18:30] [<ffffffff8005dfa7>] child_rip+0x0/0x11
[2010-03-03 19:18:30]
[2010-03-03 19:18:30]
[2010-03-03 19:18:30]Code: ff 40 78 c7 40 50 01 00 00 00 ff 87 94 07 00 00 48 89 d7 e9
[2010-03-03 19:18:30]RIP [<ffffffff88572766>] :gfs2:revoke_lo_add+0x1a/0x32
[2010-03-03 19:18:30] RSP <ffff81082ef61ae8>
[2010-03-03 19:18:30]CR2: 0000000000000078
[2010-03-03 19:18:30] <0>Kernel panic - not syncing: Fatal exception

Since we're already running the latest 5.4 kernel, it's not clear what might be going on here. There is a note in the bug about making sure the gfs2-kmod from 5.2 isn't still around. Which version of gfs2-kmod is the old one, or should I just remove all instances of gfs2-kmod?

-- scooter


