
Kernel panic'd on AMD64 and I got a dump! (Was: How do you fix a Kernel Panic?)




Well, I built my own 'netconsole' module from the included sources, and it worked fine. I don't know why Red Hat doesn't provide it already built for us.


I updated my open ticket with the dump info. Here's the info I got from both netconsole and syslog. Anyone else with AMD64 systems seeing this (this is 2.4.21-15.EL)?

netconsole:

Unable to handle kernel paging request at virtual address 000000ffb85117e0
 printing rip:
ffffffff801106da
PML4 0
Oops: 0002
CPU 1
Pid: 0, comm: swapper Tainted: GF
RIP: 0010:[<ffffffff801106da>]{ret_from_intr+2}
RSP: 0018:0000010037f37fc0  EFLAGS: 00010057
RAX: 0000010037f37fc0 RBX: ffffffff8010de20 RCX: 00000000ffffffff
RDX: 000001001e68e000 RSI: 0000010144b7e000 RDI: 000001001e68ff18
RBP: ffffffff8010de20 R08: 00000000ffffffff R09: 00000100006b6340
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000049545970(0000) GS:ffffffff805d9800(0000) knlGS:0000000040021a80
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000ffb85117e0 CR3: 000000001e676000 CR4: 00000000000006e0

Call Trace: <EOI> NMI Watchdog detected LOCKUP on CPU1, eip ffffffff801a5359, registers:
CPU 1
Pid: 0, comm: swapper Tainted: GF
RIP: 0010:[<ffffffff801a5359>]{.text.lock.fault+7}
RSP: 0018:0000010037f37c08 EFLAGS: 00000086
RAX: 000000000000000f RBX: 0000000000000018 RCX: 0000000000000000
RDX: ffffffff80300380 RSI: ffffffff80300380 RDI: ffffffff80111205
RBP: 0000010037f37fc0 R08: ffffffff80300370 R09: 00000101fcfdc878
R10: 00000000003e0004 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000018 R14: 0000000000000000 R15: 0000010037f37ce8
FS: 0000000049545970(0000) GS:ffffffff805d9800(0000) knlGS:0000000040021a80
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000018 CR3: 000000001e676000 CR4: 00000000000006e0


Call Trace:
Process swapper (pid: 0, stackpage=1001e68f000)
Stack: 0000010037f37c08 0000000000000018 be00c000e600e900 91009300b900bc00
       640066008c008f00 370039005f006200 2d00300032003500 2100220023002500
       1100150021002100 5e58000004000a00 0100c68c3c00f311




syslog (seems to have gotten a little more at the end):


Unable to handle kernel paging request at virtual address 000000ffb85117e0
printing rip:
ffffffff801106da
PML4 0
Oops: 0002
CPU 1
Pid: 0, comm: swapper Tainted: GF
RIP: 0010:[<ffffffff801106da>]{ret_from_intr+2}
RSP: 0018:0000010037f37fc0 EFLAGS: 00010057
RAX: 0000010037f37fc0 RBX: ffffffff8010de20 RCX: 00000000ffffffff
RDX: 000001001e68e000 RSI: 0000010144b7e000 RDI: 000001001e68ff18
RBP: ffffffff8010de20 R08: 00000000ffffffff R09: 00000100006b6340
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000049545970(0000) GS:ffffffff805d9800(0000) knlGS:0000000040021a80
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 000000ffb85117e0 CR3: 000000001e676000 CR4: 00000000000006e0


Call Trace: <EOI> NMI Watchdog detected LOCKUP on CPU1, eip ffffffff801a5359, registers:
CPU 1
Pid: 0, comm: swapper Tainted: GF
RIP: 0010:[<ffffffff801a5359>]{.text.lock.fault+7}
RSP: 0018:0000010037f37c08 EFLAGS: 00000086
RAX: 000000000000000f RBX: 0000000000000018 RCX: 0000000000000000
RDX: ffffffff80300380 RSI: ffffffff80300380 RDI: ffffffff80111205
RBP: 0000010037f37fc0 R08: ffffffff80300370 R09: 00000101fcfdc878
R10: 00000000003e0004 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000018 R14: 0000000000000000 R15: 0000010037f37ce8
FS: 0000000049545970(0000) GS:ffffffff805d9800(0000) knlGS:0000000040021a80
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000018 CR3: 000000001e676000 CR4: 00000000000006e0


Call Trace:
Process swapper (pid: 0, stackpage=1001e68f000)
Stack: 0000010037f37c08 0000000000000018 be00c000e600e900 91009300b900bc00
640066008c008f00 370039005f006200 2d00300032003500 2100220023002500
1100150021002100 5e58000004000a00 0100c68c3c00f311 000b0000a6a7ad10
499ab20100ec000a 2e323130305f474d 802b00020147504a 4ae97e220b3a1200
00330b0000004750 34662d6d7364004b 3033313039386333 6263343030333339
3166333532313864 732d653433353165 96007100d814006d 33353737622d6d73
3430373731316136 3461343163633437 6365663165353432 73006d732d313833
2d6d732c01900107 3334646531306337 6563373036643534 6232653334313961
6464653064656435 5802d6f7006d732d 3736312d6d73c201 3131333261653662
6366323461393432 6637323535643066 6d732d3463383833 7358022003377e01
Call Trace:


Code: f3 90 7e f5 e9 c8 fd ff ff 90 90 90 90 90 90 90 90 90 90 90

console shuts up ...
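
A note for anyone reading the dump above: on x86 and x86_64 the number after "Oops:" is the page-fault error code in hex, and its low three bits describe the access that faulted. The sketch below decodes only those three bits. The function name decode_oops is made up for illustration and is not part of any kernel tool.

```shell
# Decode the low three bits of an x86 page-fault error code (hex, as printed
# after "Oops:"). decode_oops is a hypothetical helper name.
decode_oops() {
    code=$((0x$1))
    # bit 0: 0 = page not present, 1 = protection violation
    if [ $((code & 1)) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    # bit 1: 0 = read, 1 = write
    if [ $((code & 2)) -ne 0 ]; then access="write access"; else access="read access"; fi
    # bit 2: 0 = kernel mode, 1 = user mode
    if [ $((code & 4)) -ne 0 ]; then mode="user mode"; else mode="kernel mode"; fi
    echo "$cause, $access, $mode"
}

decode_oops 0002
```

For the "Oops: 0002" in this dump, that prints "page not present, write access, kernel mode", which is consistent with the "Unable to handle kernel paging request" line.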



Don



Don MacAskill wrote:


FYI, I have built my own 'netconsole' module and it's passing messages fine from my AMD64 box to my logging box.


Hopefully it gets what we need when the crash happens.

I'm still not sure why Red Hat wouldn't include this module by default for this platform, but I opened a support request with them.

Don


Don MacAskill wrote:



For some reason, the ia64 and x86_64 kernels don't have netconsole compiled as a module, making it tough to netdump:


[root zeus configs]# pwd
/usr/src/linux-2.4.21-15.EL/configs
[root zeus configs]# grep NETCONSOLE *
kernel-2.4.21-athlon.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-athlon-smp.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-i386-BOOT.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-i386.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-i586.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-i586-smp.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-i686.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-i686-hugemem.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-i686-smp.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-ia32e.config:# CONFIG_NETCONSOLE is not set
kernel-2.4.21-ia32e.config:# CONFIG_NETCONSOLE is not set
kernel-2.4.21-ia64.config:# CONFIG_NETCONSOLE is not set
kernel-2.4.21-ia64.config:# CONFIG_NETCONSOLE is not set
kernel-2.4.21-ppc64.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-ppc64iseries.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-ppc64pseries.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-s390.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-s390x.config:CONFIG_NETCONSOLE=m
kernel-2.4.21-x86_64.config:# CONFIG_NETCONSOLE is not set
kernel-2.4.21-x86_64.config:# CONFIG_NETCONSOLE is not set
kernel-2.4.21-x86_64-smp.config:# CONFIG_NETCONSOLE is not set
kernel-2.4.21-x86_64-smp.config:# CONFIG_NETCONSOLE is not set
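
The same check can be turned around to list only the configs that lack the module. This is a small sketch, assuming GNU grep (for the -L option) and config files named *.config as in the listing above. The function name missing_netconsole is just an illustrative choice.

```shell
# Print the kernel config files in a directory that do NOT set
# CONFIG_NETCONSOLE=m. missing_netconsole is a hypothetical helper name.
missing_netconsole() {
    dir="${1:-.}"
    # grep -L lists the files that contain no matching line.
    ( cd "$dir" && grep -L '^CONFIG_NETCONSOLE=m' *.config )
}

# Example: missing_netconsole /usr/src/linux-2.4.21-15.EL/configs
```

Run against the configs directory shown above, this should name the ia32e, ia64, x86_64 and x86_64-smp files and nothing else.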


I'm currently building the x86_64 module myself and will see how it goes. I have consistent crashes with x86_64 WS every 8 days or so.


Anyone know why those platforms don't have NETCONSOLE by default? ppc64 does, so I don't think it's a 64bit thing.

Don



jmklinke rockwellcollins com wrote:




My 2 starting point suggestions would be:


1) enable kernel logging in the /etc/syslog.conf file. Sometimes it's
handier to send it to a file like /var/log/kernel instead of messages.

2) set up the netdump and netconsole services (see netdump and
netdump-server rpms) to a remote host so you can catch the dump (especially
if you aren't getting anywhere with the "Oops" message).
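
As a rough sketch of suggestion 1: a kern.* selector in /etc/syslog.conf sends kernel messages to their own file. The helper name add_kern_rule is made up for illustration, and the default path /var/log/kernel is simply the one suggested above. The function takes the config path as an argument so it can be tried on a copy first.

```shell
# Append a "kern.*" rule to a syslog.conf-style file unless an active
# (uncommented) one already exists. add_kern_rule is a hypothetical name.
add_kern_rule() {
    conf="$1"
    log="${2:-/var/log/kernel}"
    if grep -q '^kern\.\*' "$conf"; then
        echo "kern.* rule already present in $conf"
    else
        # selector, whitespace, then the destination file
        printf 'kern.*\t\t\t\t%s\n' "$log" >> "$conf"
        echo "added kern.* rule to $conf"
    fi
}

# Example (as root): add_kern_rule /etc/syslog.conf
```

syslogd has to reread its config afterwards. On Red Hat systems of this era that is typically done with "service syslog restart", but check your own setup.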



--Jon



David Clements <dclements mercury com>
Sent by: taroon-list-bounces@redhat.com
05/26/2004 10:42 AM
Please respond to "Discussion of Red Hat Enterprise Linux 3 (Taroon)"

To: "Discussion of Red Hat Enterprise Linux 3 (Taroon)" <taroon-list redhat com>
cc:
Subject: How do you fix a Kernel Panic?






Anyone have some advice on how to debug a Kernel Panic?  All I have is a
cryptic message from the console.  Does Linux provide a dump file like
Solaris does?


Dave

--
Taroon-list mailing list
Taroon-list redhat com
http://www.redhat.com/mailman/listinfo/taroon-list



