[Linux-cluster] GFS 6.0 crashing x86_64 machine

micah nerren mnerren at paracel.com
Wed Aug 4 01:12:01 UTC 2004


Hi,

On Tue, 2004-08-03 at 07:40, Michael Conrad Tadpol Tilstra wrote:
> On Mon, Aug 02, 2004 at 01:59:55PM -0700, micah nerren wrote:
> [snip]
> > I hope this helps!!
> [snip]
> 
> yeah, looks like a stack overflow.
> here's a patch that I put in for 6.0.  (patch works on 6.0.0-7)
> 

I applied the patch to 6.0.0-7, rebuilt the entire package, and I still
get the crash when I mount. The text of the crash is below.

Any ideas? I double- and triple-checked that the patch was indeed applied
to the code I was building.

Thanks,

Micah

///////////////

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
 printing rip:
ffffffff8024a875
PML4 77caf067 PGD 7a78f067 PMD 0 
Oops: 0002
CPU 0 
Pid: 4056, comm: mount Not tainted
RIP: 0010:[<ffffffff8024a875>]{net_rx_action+213}
RSP: 0018:0000010077d93048  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffffff806077e8 RCX: ffffffff80607988
RDX: ffffffff806077e8 RSI: 0000010077d68800 RDI: ffffffff806077d0
RBP: ffffffff80607668 R08: 00000000824c6a9c R09: 00000000004c824c
R10: 000000000100007f R11: 0000000000000000 R12: ffffffff806077e8
R13: ffffffff806077c0 R14: 000000000000ed06 R15: 0000000000000000
FS:  0000002a955764c0(0000) GS:ffffffff805d9840(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006e0

Call Trace: [<ffffffff8024a84d>]{net_rx_action+173} 
       [<ffffffff8012a72e>]{do_softirq+174}
[<ffffffff80267cf0>]{ip_finish_output2+0} 
       [<ffffffff80267cc0>]{dst_output+0}
[<ffffffff802b5915>]{do_softirq_thunk+53} 
       [<ffffffff802533a7>]{.text.lock.netfilter+165}
[<ffffffff80267cc0>]{dst_output+0} 
       [<ffffffff80265fbb>]{ip_queue_xmit+1019}
[<ffffffff80262ee0>]{ip_rcv_finish+0} 
       [<ffffffff802630f0>]{ip_rcv_finish+528}
[<ffffffff80252e51>]{nf_hook_slow+305} 
       [<ffffffff80262ee0>]{ip_rcv_finish+0}
[<ffffffff80277faf>]{tcp_transmit_skb+1295} 
       [<ffffffff80278ac6>]{tcp_write_xmit+198}
[<ffffffff8026de83>]{tcp_sendmsg+4051} 
       [<ffffffff8028e795>]{inet_sendmsg+69}
[<ffffffff802407ae>]{sock_sendmsg+142} 
       [<ffffffffa013c4b1>]{:lock_gulm:do_tfer+369}
[<ffffffffa013ebd4>]{:lock_gulm:.rodata.str1.1+467} 
       [<ffffffffa013c595>]{:lock_gulm:xdr_send+37}
[<ffffffffa013b498>]{:lock_gulm:xdr_enc_flush+56} 
       [<ffffffffa013951d>]{:lock_gulm:lg_lock_login+301} 
       [<ffffffffa0135ff9>]{:lock_gulm:lt_login+57}
[<ffffffffa0132164>]{:lock_gulm:gulm_core_login_reply+164} 
       [<ffffffffa01426a0>]{:lock_gulm:core_cb+0}
[<ffffffffa01380eb>]{:lock_gulm:lg_core_handle_messages+315} 
       [<ffffffffa0138713>]{:lock_gulm:lg_core_login+323} 
       [<ffffffffa013253a>]{:lock_gulm:cm_login+122}
[<ffffffffa0132bde>]{:lock_gulm:start_gulm_threads+174} 
       [<ffffffffa0132f08>]{:lock_gulm:gulm_mount+616}
[<ffffffffa015d940>]{:gfs:gfs_glock_cb+0} 
       [<ffffffffa01313e3>]{:lock_harness:lm_mount_Rsmp_ad6c5c21+355} 
       [<ffffffffa015d940>]{:gfs:gfs_glock_cb+0}
[<ffffffffa0162ff9>]{:gfs:gfs_mount_lockproto+313} 
       [<ffffffff8013d8d2>]{do_anonymous_page+1234}
[<ffffffff8013d94f>]{do_no_page+95} 
       [<ffffffff801a5103>]{do_page_fault+627}
[<ffffffff801109d6>]{error_exit+0} 
       [<ffffffff80184ce5>]{create_elf_tables+261}
[<ffffffff801547bc>]{__alloc_pages+156} 
       [<ffffffffa014e37b>]{:gfs:gfs_read_super+1307}
[<ffffffffa0182b00>]{:gfs:gfs_fs_type+0} 
       [<ffffffff80164c0c>]{get_sb_bdev+588}
[<ffffffffa0182b00>]{:gfs:gfs_fs_type+0} 
       [<ffffffff80164ec9>]{do_kern_mount+121}
[<ffffffff8017baa1>]{do_add_mount+161} 
       [<ffffffff8017bdb9>]{do_mount+345}
[<ffffffff80154b40>]{__get_free_pages+16} 
       [<ffffffff8017c1d5>]{sys_mount+197}
[<ffffffff80110177>]{system_call+119} 
       
Process mount (pid: 4056, stackpage=10077d93000)
Stack: 0000010077d93048 0000000000000018 ffffffff8024a84d 0000012a80445d20
       0000000000000001 ffffffff80606c60 0000000000000000 000000000000000a
       0000000000000000 0000000000000002 ffffffff8012a72e ffffffff80267cf0
       0000000000000246 0000000000000000 0000000000000003 ffffffff80445d20
       ffffffff80267cc0 0000000000000000 ffffffff802b5915 0000000000000043
       0000000000000006 00000100796a109e 000001007c6231c0 0000000000000000
       0000000000000000 ffffffff8049c648 0000000000000000 ffffffff806077c0
       ffffffff802533a7 ffffffff80267cc0 ffffffff80445d20 0000000000000002
       000001007c6231c0 ffffffff805abcd0 00000100796a10ac 000001007c6231c0
       0000010077d68800 0000000000000000 0000010077d68800 000001007c623228
Call Trace: [<ffffffff8024a84d>]{net_rx_action+173} 
       [<ffffffff8012a72e>]{do_softirq+174}
[<ffffffff80267cf0>]{ip_finish_output2+0} 
       [<ffffffff80267cc0>]{dst_output+0}
[<ffffffff802b5915>]{do_softirq_thunk+53} 
       [<ffffffff802533a7>]{.text.lock.netfilter+165}
[<ffffffff80267cc0>]{dst_output+0} 
       [<ffffffff80265fbb>]{ip_queue_xmit+1019}
[<ffffffff80262ee0>]{ip_rcv_finish+0} 
       [<ffffffff802630f0>]{ip_rcv_finish+528}
[<ffffffff80252e51>]{nf_hook_slow+305} 
       [<ffffffff80262ee0>]{ip_rcv_finish+0}
[<ffffffff80277faf>]{tcp_transmit_skb+1295} 
       [<ffffffff80278ac6>]{tcp_write_xmit+198}
[<ffffffff8026de83>]{tcp_sendmsg+4051} 
       [<ffffffff8028e795>]{inet_sendmsg+69}
[<ffffffff802407ae>]{sock_sendmsg+142} 
       [<ffffffffa013c4b1>]{:lock_gulm:do_tfer+369}
[<ffffffffa013ebd4>]{:lock_gulm:.rodata.str1.1+467} 
       [<ffffffffa013c595>]{:lock_gulm:xdr_send+37}
[<ffffffffa013b498>]{:lock_gulm:xdr_enc_flush+56} 
       [<ffffffffa013951d>]{:lock_gulm:lg_lock_login+301} 
       [<ffffffffa0135ff9>]{:lock_gulm:lt_login+57}
[<ffffffffa0132164>]{:lock_gulm:gulm_core_login_reply+164} 
       [<ffffffffa01426a0>]{:lock_gulm:core_cb+0}
[<ffffffffa01380eb>]{:lock_gulm:lg_core_handle_messages+315} 
       [<ffffffffa0138713>]{:lock_gulm:lg_core_login+323} 
       [<ffffffffa013253a>]{:lock_gulm:cm_login+122}
[<ffffffffa0132bde>]{:lock_gulm:start_gulm_threads+174} 
       [<ffffffffa0132f08>]{:lock_gulm:gulm_mount+616}
[<ffffffffa015d940>]{:gfs:gfs_glock_cb+0} 
       [<ffffffffa01313e3>]{:lock_harness:lm_mount_Rsmp_ad6c5c21+355} 
       [<ffffffffa015d940>]{:gfs:gfs_glock_cb+0}
[<ffffffffa0162ff9>]{:gfs:gfs_mount_lockproto+313} 
       [<ffffffff8013d8d2>]{do_anonymous_page+1234}
[<ffffffff8013d94f>]{do_no_page+95} 
       [<ffffffff801a5103>]{do_page_fault+627}
[<ffffffff801109d6>]{error_exit+0} 
       [<ffffffff80184ce5>]{create_elf_tables+261}
[<ffffffff801547bc>]{__alloc_pages+156} 
       [<ffffffffa014e37b>]{:gfs:gfs_read_super+1307}
[<ffffffffa0182b00>]{:gfs:gfs_fs_type+0} 
       [<ffffffff80164c0c>]{get_sb_bdev+588}
[<ffffffffa0182b00>]{:gfs:gfs_fs_type+0} 
       [<ffffffff80164ec9>]{do_kern_mount+121}
[<ffffffff8017baa1>]{do_add_mount+161} 
       [<ffffffff8017bdb9>]{do_mount+345}
[<ffffffff80154b40>]{__get_free_pages+16} 
       [<ffffffff8017c1d5>]{sys_mount+197}
[<ffffffff80110177>]{system_call+119} 
       

Code: 48 89 18 48 89 43 08 8b 85 90 01 00 00 85 c0 79 08 03 85 94

Kernel panic: Fatal exception
In interrupt handler - not syncing
 
NMI Watchdog detected LOCKUP on CPU0, eip ffffffff8011a948, registers:
CPU 0 
Pid: 4056, comm: mount Not tainted
RIP: 0010:[<ffffffff8011a948>]{smp_call_function+120}
RSP: 0018:0000010077d92d48  EFLAGS: 00000097
RAX: 0000000000000000 RBX: ffffffff802cfc1a RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffff8011a970
RBP: 0000000000000002 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000000003c8 R12: ffffffff802da247
R13: 0000000000000000 R14: 0000000000000002 R15: 0000010077d92f98
FS:  0000002a955764c0(0000) GS:ffffffff805d9840(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006e0

Call Trace:  <EOE> [<ffffffff8011a970>]{stop_this_cpu+0} 
       [<ffffffff8011a9b9>]{smp_send_stop+25}
[<ffffffff80123d98>]{panic+312} 
       [<ffffffff8011129a>]{show_trace+666}
[<ffffffff801113bd>]{show_stack+205} 
       [<ffffffff80111500>]{show_registers+304}
[<ffffffff801116ac>]{die+268} 
       [<ffffffff801a526d>]{do_page_fault+989}
[<ffffffff80252e51>]{nf_hook_slow+305} 
       [<ffffffff80262ee0>]{ip_rcv_finish+0}
[<ffffffff802630f0>]{ip_rcv_finish+528} 
       [<ffffffff801109d6>]{error_exit+0}
[<ffffffff8024a875>]{net_rx_action+213} 
       [<ffffffff8024a84d>]{net_rx_action+173}
[<ffffffff8012a72e>]{do_softirq+174} 
       [<ffffffff80267cf0>]{ip_finish_output2+0}
[<ffffffff80267cc0>]{dst_output+0} 
       [<ffffffff802b5915>]{do_softirq_thunk+53}
[<ffffffff802533a7>]{.text.lock.netfilter+165} 
       [<ffffffff80267cc0>]{dst_output+0}
[<ffffffff80265fbb>]{ip_queue_xmit+1019} 
       [<ffffffff80262ee0>]{ip_rcv_finish+0}
[<ffffffff802630f0>]{ip_rcv_finish+528} 
       [<ffffffff80252e51>]{nf_hook_slow+305}
[<ffffffff80262ee0>]{ip_rcv_finish+0} 
       [<ffffffff80277faf>]{tcp_transmit_skb+1295}
[<ffffffff80278ac6>]{tcp_write_xmit+198} 
       [<ffffffff8026de83>]{tcp_sendmsg+4051}
[<ffffffff8028e795>]{inet_sendmsg+69} 
       [<ffffffff802407ae>]{sock_sendmsg+142}
[<ffffffffa013c4b1>]{:lock_gulm:do_tfer+369} 
       [<ffffffffa013ebd4>]{:lock_gulm:.rodata.str1.1+467} 
       [<ffffffffa013c595>]{:lock_gulm:xdr_send+37}
[<ffffffffa013b498>]{:lock_gulm:xdr_enc_flush+56} 
       [<ffffffffa013951d>]{:lock_gulm:lg_lock_login+301} 
       [<ffffffffa0135ff9>]{:lock_gulm:lt_login+57}
[<ffffffffa0132164>]{:lock_gulm:gulm_core_login_reply+164} 
       [<ffffffffa01426a0>]{:lock_gulm:core_cb+0}
[<ffffffffa01380eb>]{:lock_gulm:lg_core_handle_messages+315} 
       [<ffffffffa0138713>]{:lock_gulm:lg_core_login+323} 
       [<ffffffffa013253a>]{:lock_gulm:cm_login+122}
[<ffffffffa0132bde>]{:lock_gulm:start_gulm_threads+174} 
       [<ffffffffa0132f08>]{:lock_gulm:gulm_mount+616}
[<ffffffffa015d940>]{:gfs:gfs_glock_cb+0} 
       [<ffffffffa01313e3>]{:lock_harness:lm_mount_Rsmp_ad6c5c21+355} 
       [<ffffffffa015d940>]{:gfs:gfs_glock_cb+0}
[<ffffffffa0162ff9>]{:gfs:gfs_mount_lockproto+313} 
       [<ffffffff8013d8d2>]{do_anonymous_page+1234}
[<ffffffff8013d94f>]{do_no_page+95} 
       [<ffffffff801a5103>]{do_page_fault+627}
[<ffffffff801109d6>]{error_exit+0} 
       [<ffffffff80184ce5>]{create_elf_tables+261}
[<ffffffff801547bc>]{__alloc_pages+156} 
       [<ffffffffa014e37b>]{:gfs:gfs_read_super+1307}
[<ffffffffa0182b00>]{:gfs:gfs_fs_type+0} 
       [<ffffffff80164c0c>]{get_sb_bdev+588}
[<ffffffffa0182b00>]{:gfs:gfs_fs_type+0} 
       [<ffffffff80164ec9>]{do_kern_mount+121}
[<ffffffff8017baa1>]{do_add_mount+161} 
       [<ffffffff8017bdb9>]{do_mount+345}
[<ffffffff80154b40>]{__get_free_pages+16} 
       [<ffffffff8017c1d5>]{sys_mount+197}
[<ffffffff80110177>]{system_call+119} 
       
Process mount (pid: 4056, stackpage=10077d93000)
Stack: 0000010077d92d48 0000000000000018 0000000000100000 0000000000000000
       00000100079c4c80 ffffffff803e89a0 0000000000000000 00000100000fdea0
       ffffffff803e8d00 00000100079bf000 00000100079d6400 0000000000000042
       00000100079de280 ffffff0000000000 000000fffffff000 0000000000000000
       00000100079d7a80 0000000000000000 0000000000000000 0000000000000000
       0000000000000000 0000000000000000 0000000000000000 0000000000000000
       0000010077d92d48 0000000000000000 00000000006d9994 0000000000000003
       0000000000000000 0000000000000000 0000000100000000 ffffffffffffffff
       ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
       ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
Call Trace:  <EOE> [<ffffffff8011a970>]{stop_this_cpu+0} 
       [<ffffffff8011a9b9>]{smp_send_stop+25}
[<ffffffff80123d98>]{panic+312} 
       [<ffffffff8011129a>]{show_trace+666}
[<ffffffff801113bd>]{show_stack+205} 
       [<ffffffff80111500>]{show_registers+304}
[<ffffffff801116ac>]{die+268} 
       [<ffffffff801a526d>]{do_page_fault+989}
[<ffffffff80252e51>]{nf_hook_slow+305} 
       [<ffffffff80262ee0>]{ip_rcv_finish+0}
[<ffffffff802630f0>]{ip_rcv_finish+528} 
       [<ffffffff801109d6>]{error_exit+0}
[<ffffffff8024a875>]{net_rx_action+213} 
       [<ffffffff8024a84d>]{net_rx_action+173}
[<ffffffff8012a72e>]{do_softirq+174} 
       [<ffffffff80267cf0>]{ip_finish_output2+0}
[<ffffffff80267cc0>]{dst_output+0} 
       [<ffffffff802b5915>]{do_softirq_thunk+53}
[<ffffffff802533a7>]{.text.lock.netfilter+165} 
       [<ffffffff80267cc0>]{dst_output+0}
[<ffffffff80265fbb>]{ip_queue_xmit+1019} 
       [<ffffffff80262ee0>]{ip_rcv_finish+0}
[<ffffffff802630f0>]{ip_rcv_finish+528} 
       [<ffffffff80252e51>]{nf_hook_slow+305}
[<ffffffff80262ee0>]{ip_rcv_finish+0} 
       [<ffffffff80277faf>]{tcp_transmit_skb+1295}
[<ffffffff80278ac6>]{tcp_write_xmit+198} 
       [<ffffffff8026de83>]{tcp_sendmsg+4051}
[<ffffffff8028e795>]{inet_sendmsg+69} 
       [<ffffffff802407ae>]{sock_sendmsg+142}
[<ffffffffa013c4b1>]{:lock_gulm:do_tfer+369} 
       [<ffffffffa013ebd4>]{:lock_gulm:.rodata.str1.1+467} 
       [<ffffffffa013c595>]{:lock_gulm:xdr_send+37}
[<ffffffffa013b498>]{:lock_gulm:xdr_enc_flush+56} 
       [<ffffffffa013951d>]{:lock_gulm:lg_lock_login+301} 
       [<ffffffffa0135ff9>]{:lock_gulm:lt_login+57}
[<ffffffffa0132164>]{:lock_gulm:gulm_core_login_reply+164} 
       [<ffffffffa01426a0>]{:lock_gulm:core_cb+0}
[<ffffffffa01380eb>]{:lock_gulm:lg_core_handle_messages+315} 
       [<ffffffffa0138713>]{:lock_gulm:lg_core_login+323} 
       [<ffffffffa013253a>]{:lock_gulm:cm_login+122}
[<ffffffffa0132bde>]{:lock_gulm:start_gulm_threads+174} 
       [<ffffffffa0132f08>]{:lock_gulm:gulm_mount+616}
[<ffffffffa015d940>]{:gfs:gfs_glock_cb+0} 
       [<ffffffffa01313e3>]{:lock_harness:lm_mount_Rsmp_ad6c5c21+355} 
       [<ffffffffa015d940>]{:gfs:gfs_glock_cb+0}
[<ffffffffa0162ff9>]{:gfs:gfs_mount_lockproto+313} 
       [<ffffffff8013d8d2>]{do_anonymous_page+1234}
[<ffffffff8013d94f>]{do_no_page+95} 
       [<ffffffff801a5103>]{do_page_fault+627}
[<ffffffff801109d6>]{error_exit+0} 
       [<ffffffff80184ce5>]{create_elf_tables+261}
[<ffffffff801547bc>]{__alloc_pages+156} 
       [<ffffffffa014e37b>]{:gfs:gfs_read_super+1307}
[<ffffffffa0182b00>]{:gfs:gfs_fs_type+0} 
       [<ffffffff80164c0c>]{get_sb_bdev+588}
[<ffffffffa0182b00>]{:gfs:gfs_fs_type+0} 
       [<ffffffff80164ec9>]{do_kern_mount+121}
[<ffffffff8017baa1>]{do_add_mount+161} 
       [<ffffffff8017bdb9>]{do_mount+345}
[<ffffffff80154b40>]{__get_free_pages+16} 
       [<ffffffff8017c1d5>]{sys_mount+197}
[<ffffffff80110177>]{system_call+119} 
       

Code: 39 d0 75 f8 85 c9 74 10 8b 44 24 14 39 d0 74 08 8b 44 24 14

console shuts up ...
 NM I Watchdog detected LOCKUP on CPU1, eip ffffffff801a5419, registers:
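
For anyone checking the stack-overflow theory against these numbers, here is a
minimal Python sketch that compares the two RSP values with the stackpage value
printed on the "Process mount" lines. It is plain arithmetic on figures copied
from the output above. How stackpage relates to the real bounds of the kernel
stack depends on the kernel source for this build, which is not shown here, so
any interpretation of the offsets is an assumption. The variable and function
names are made up for illustration.

```python
# Values copied from the crash text above.
STACKPAGE = 0x10077d93000       # "stackpage=" on the "Process mount" lines
RSP_OOPS = 0x0000010077d93048   # RSP in the NULL pointer dereference report
RSP_NMI = 0x0000010077d92d48    # RSP in the NMI watchdog report for CPU0


def offset_from_stackpage(rsp, stackpage=STACKPAGE):
    """Return rsp - stackpage in bytes (negative when rsp is below stackpage)."""
    return rsp - stackpage


oops_offset = offset_from_stackpage(RSP_OOPS)
nmi_offset = offset_from_stackpage(RSP_NMI)

print("oops RSP offset from stackpage:", oops_offset, "bytes")
print("NMI  RSP offset from stackpage:", nmi_offset, "bytes")
```

The first report has RSP 72 bytes above the stackpage value and the NMI report
has it 696 bytes below, so the stack pointer kept descending past that address
while the panic was being handled. That is at least consistent with very deep
stack use during the mount, though it does not prove an overflow on its own.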
  



