
[Linux-cluster] GFS2 crashes after upgrade RHEL 6.4



Hi all,
We've been seeing GFS2 crashes since we upgraded to RHEL 6.4. The traceback is:

[2013-03-13 08:48:24]BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
[2013-03-13 08:48:24]IP: [<ffffffffa04d66ef>] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2]
[2013-03-13 08:48:24]PGD 0
[2013-03-13 08:48:24]Oops: 0002 [#1] SMP
[2013-03-13 08:48:24]last sysfs file: /sys/devices/pci0000:00/0000:00:06.0/0000:0b:00.0/0000:0c:09.0/0000:0d:00.1/host3/rport-3:0-4/target3:0:3/3:0:3:14/state
[2013-03-13 08:48:24]CPU 0
[2013-03-13 08:48:24]Modules linked in: autofs4 gfs2 dlm configfs sunrpc p4_clockmod freq_table speedstep_lib arpt_mangle arptable_filter arp_tables ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_recent ipt_LOG iptable_filter ip_tables nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 uinput hpwdt hpilo microcode iTCO_wdt iTCO_vendor_support i7300_edac edac_core bnx2 sg shpchp ext4 mbcache jbd2 dm_round_robin sd_mod crc_t10dif sr_mod cdrom qla2xxx scsi_transport_fc scsi_tgt pata_acpi ata_generic ata_piix hpsa cciss radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_multipath dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf]
[2013-03-13 08:48:24]
[2013-03-13 08:48:24]Pid: 9888, comm: smbd Not tainted 2.6.32-358.0.1.el6.x86_64 #1 HP ProLiant DL580 G5
[2013-03-13 08:48:24]RIP: 0010:[<ffffffffa04d66ef>] [<ffffffffa04d66ef>] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2]
[2013-03-13 08:48:24]RSP: 0018:ffff880dce0f9c98  EFLAGS: 00010287
[2013-03-13 08:48:24]RAX: ffff880ff78999a8 RBX: ffff880dae61d7c0 RCX: 00000000006c0762
[2013-03-13 08:48:24]RDX: 00000000006c0762 RSI: 00000000006c075b RDI: ffff88100b2b6440
[2013-03-13 08:48:24]RBP: ffff880dce0f9d58 R08: 1050000000000000 R09: f213f3d57bbf820a
[2013-03-13 08:48:24]R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000001000
[2013-03-13 08:48:24]R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
[2013-03-13 08:48:24]FS: 00007f3ac254c7c0(0000) GS:ffff880061a00000(0000) knlGS:0000000000000000
[2013-03-13 08:48:24]CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2013-03-13 08:48:24]CR2: 0000000000000060 CR3: 0000000dce153000 CR4: 00000000000007f0
[2013-03-13 08:48:24]DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[2013-03-13 08:48:24]DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[2013-03-13 08:48:24]Process smbd (pid: 9888, threadinfo ffff880dce0f8000, task ffff88100b0b6ae0)
[2013-03-13 08:48:24]Stack:
[2013-03-13 08:48:24] ffff880dce0f9e08 000000000000000a ffff880dce0f9cc8 ffffffff81096c8f
[2013-03-13 08:48:24]<d> ffff880dce0f9dd8 00000007b078eaf8 ffff880dce0f9cd8 ffff88100b2b6000
[2013-03-13 08:48:24]<d> ffff880dce0f9d28 ffffffffa04be2a8 ffff880ff78999a8 0000000000000000
[2013-03-13 08:48:24]Call Trace:
[2013-03-13 08:48:24] [<ffffffff81096c8f>] ? wake_up_bit+0x2f/0x40
[2013-03-13 08:48:24] [<ffffffffa04be2a8>] ? do_promote+0x208/0x330 [gfs2]
[2013-03-13 08:48:24] [<ffffffffa04b106e>] gfs2_setattr_size+0xce/0x210 [gfs2]
[2013-03-13 08:48:24] [<ffffffffa04cd534>] gfs2_setattr+0x214/0x330 [gfs2]
[2013-03-13 08:48:24] [<ffffffffa04cd366>] ? gfs2_setattr+0x46/0x330 [gfs2]
[2013-03-13 08:48:24] [<ffffffff8119e768>] notify_change+0x168/0x340
[2013-03-13 08:48:24] [<ffffffff8117f1e4>] do_truncate+0x64/0xa0
[2013-03-13 08:48:24] [<ffffffff8117f520>] sys_ftruncate+0x120/0x130
[2013-03-13 08:48:24] [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
[2013-03-13 08:48:24]Code: 0f 84 c1 fc ff ff e9 41 fb ff ff 48 8b 4d a0 48 8b b1 10 03 00 00 48 8b bd 78 ff ff ff ba 01 00 00 00 e8 75 d6 ff ff 48 89 45 90 <49> 89 45 60 c7 45 9c 01 00 00 00 48 8b 45 90 e9 01 fb ff ff 48
[2013-03-13 08:48:24]RIP [<ffffffffa04d66ef>] gfs2_inplace_reserve+0x54f/0x7e0 [gfs2]
[2013-03-13 08:48:24] RSP <ffff880dce0f9c98>
[2013-03-13 08:48:24]CR2: 0000000000000060
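
For anyone reading the trace, here is a minimal sketch of how the "Oops: 0002" field can be decoded. It assumes the standard x86 page-fault error-code bit layout; the function name decode_page_fault_error is just an illustrative helper, not anything from the kernel or from GFS2.

```python
def decode_page_fault_error(code: int) -> dict:
    """Decode the low bits of an x86 page-fault error code.

    Assumes the usual x86 layout: bit 0 = page was present,
    bit 1 = write access, bit 2 = fault came from user mode,
    bit 3 = reserved bit set in a page-table entry,
    bit 4 = instruction fetch.
    """
    return {
        "page_present": bool(code & 0x1),
        "write_access": bool(code & 0x2),
        "user_mode": bool(code & 0x4),
        "reserved_bit": bool(code & 0x8),
        "instruction_fetch": bool(code & 0x10),
    }


if __name__ == "__main__":
    # The oops line prints the error code in hex ("Oops: 0002").
    fault = decode_page_fault_error(int("0002", 16))
    for name, value in fault.items():
        print(f"{name}: {value}")
```

Under that reading, 0002 would be a kernel-mode write to a page that is not mapped, which fits the faulting address in CR2 (0x60) being a small offset from a NULL pointer.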


We've now seen this from both svn and smbd, on a couple of different nodes in our cluster. We brought the cluster down last night and ran fsck.gfs2 on all filesystems, but the problem persists.

Has anyone seen this before? Is there a workaround, or should we drop back to the previous kernel?

-- scooter


