[Crash-utility] Crash faults when determining panic task
Dave Anderson
anderson at redhat.com
Fri Sep 30 14:59:46 UTC 2011
----- Original Message -----
>
> What's unexplainable here is the dump of the note information:
>
> > size_note: 1780
> > num_prstatus_notes: 1
> > notes_buf: 2cc4000
> > notes[0]: 2cc4000
>
> The determination of the number of ELF nt_prstatus notes is based
> upon the contents of the kdump_sub_header, where "size_note" describes
> a single buffer in the dumpfile that contains an array of nt_prstatus
> notes. Each note consists of a small Elf64_Nhdr header, a name string,
> and a register dump. Here's an example of one taken from an ELF-format
> kdump:
>
> Elf64_Nhdr:
> n_namesz: 5 ("CORE")
> n_descsz: 336
> n_type: 1 (NT_PRSTATUS)
> 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> 000000000000544d 0000000000000000
> 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> 0000000000000001 00007fffb894e76f
> 00007fffb894e170 ffff88012969ebf0
> ffff880128541f88 ffff880108870a00
> 0000000000000000 0000000000000000
> ffffffff8184f8f0 0000000000000000
> ffff880108870ab0 0000000000000003
> 0000000000000004 ffffffff81ad7fd0
> ffff8801280e44b0 ffffffffffffffff
> ffffffff8108d378 0000000000000010
> 0000000000010202 ffff88012854de68
> 0000000000000018 00007fa3283af700
> 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000
>
> So rounded up, each note is roughly 350 bytes. So, while a "size_note"
> of 1780 bytes wouldn't be large enough to contain the notes for 16 cpus,
> it would seem to contain more than 1 note. (???) But the note-gathering
> code was only able to come up with a "num_prstatus_notes" of 1.
>
> It would be interesting to find out what happened in the
> x86_process_elf_notes() function.

Digging into this a bit more -- the array of notes in the dumpfile
also includes the VMCOREINFO note (n_type 0), which is roughly 1400
bytes in length. So given that all of the notes consume 1780 bytes in
Joe's dump, it looks like there is one NT_PRSTATUS note (roughly 350
bytes) and one VMCOREINFO note (roughly 1400 bytes), which would
account for a "num_prstatus_notes" of 1.
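
To make the arithmetic concrete, here's a minimal sketch of walking a
notes buffer and counting the NT_PRSTATUS entries. It is not the code in
x86_process_elf_notes(); the struct and the helpers note_size(),
count_prstatus() and put_note() are hypothetical names made up for
illustration, and the 1400-byte VMCOREINFO payload is an assumed figure
picked to match the "roughly 1400" estimate above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NT_PRSTATUS 1                 /* same value <elf.h> uses */
#define ALIGN4(x) (((size_t)(x) + 3) & ~(size_t)3)

/* Local mirror of Elf64_Nhdr so the sketch builds without <elf.h>. */
struct note_hdr {
    uint32_t n_namesz;   /* name length, including the trailing NUL */
    uint32_t n_descsz;   /* descriptor (payload) length */
    uint32_t n_type;
};

/* Hypothetical helper: bytes one note occupies, name and desc padded to 4. */
static size_t note_size(const struct note_hdr *h)
{
    return sizeof(*h) + ALIGN4(h->n_namesz) + ALIGN4(h->n_descsz);
}

/* Hypothetical helper: count the NT_PRSTATUS notes in 'len' bytes. */
static int count_prstatus(const unsigned char *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (len - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;

        memcpy(&h, buf + off, sizeof(h));  /* avoid unaligned reads */
        if (note_size(&h) > len - off)
            break;                         /* truncated note: stop */
        if (h.n_type == NT_PRSTATUS)
            count++;
        off += note_size(&h);
    }
    return count;
}

/* Hypothetical helper: append a zero-filled note, return the new offset. */
static size_t put_note(unsigned char *buf, size_t cap, size_t off,
                       const char *name, uint32_t type, uint32_t descsz)
{
    struct note_hdr h = { (uint32_t)strlen(name) + 1, descsz, type };
    size_t total = note_size(&h);

    assert(off + total <= cap);
    memset(buf + off, 0, total);
    memcpy(buf + off, &h, sizeof(h));
    memcpy(buf + off + sizeof(h), name, h.n_namesz);
    return off + total;
}
```

With a 12-byte header, a 5-byte name padded to 8, and a 336-byte
descriptor, one "CORE" note comes to 356 bytes. That leaves 1424 of the
1780 bytes, which is what a "VMCOREINFO" note with a 1400-byte payload
would occupy (12 + 12 + 1400), so count_prstatus() on such a buffer
finds exactly one NT_PRSTATUS note.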

Joe, do you know if the non-crashing cpus were in some kind of
bizarre state such that they would not respond to the shutdown NMI?
I suppose in that case, there would be only the one NT_PRSTATUS
note for the crashing cpu (plus the VMCOREINFO note).

In any case, so far I've got two patches queued to help address
the two segmentation violations generated by a scenario such as
this. First Joe's patch:

--- x86_64.c 28 Sep 2011 18:09:54 -0000 1.187
+++ x86_64.c 29 Sep 2011 19:17:09 -0000 1.188
@@ -4181,7 +4181,7 @@
                                 goto skip_stage;
                         }
                 }
-        } else if (ELF_NOTES_VALID()) {
+        } else if (ELF_NOTES_VALID() && bt->machdep) {
                 user_regs = bt->machdep;
                 ur_rip = ULONG(user_regs +
                         OFFSET(user_regs_struct_rip));

And then this preventative measure to keep a bogus ELF note
pointer from being passed back:

--- diskdump.c 20 Sep 2011 20:41:14 -0000 1.38
+++ diskdump.c 30 Sep 2011 14:55:11 -0000
@@ -1467,6 +1467,9 @@
 void *
 diskdump_get_prstatus_percpu(int cpu)
 {
+        if ((cpu < 0) || (cpu >= dd->num_prstatus_notes))
+                return NULL;
+
         return dd->nt_prstatus_percpu[cpu];
 }
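
As a rough sketch of how the two checks are meant to work together
(this is a standalone illustration, not crash code): the accessor hands
back NULL for any cpu without a note, and the caller tests that pointer
before treating it as a register block. The struct dump_data, MAX_CPUS,
get_prstatus_percpu() and have_note_regs() are hypothetical stand-ins
for the real "dd" data and its users.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS 16   /* assumption for this sketch only */

/* Hypothetical stand-in for the diskdump data the patch calls 'dd'. */
struct dump_data {
    int   num_prstatus_notes;
    void *nt_prstatus_percpu[MAX_CPUS];
};

/* Same bounds check as the diskdump.c patch, on the stand-in struct. */
static void *get_prstatus_percpu(const struct dump_data *dd, int cpu)
{
    assert(dd != NULL);
    if ((cpu < 0) || (cpu >= dd->num_prstatus_notes))
        return NULL;
    return dd->nt_prstatus_percpu[cpu];
}

/*
 * Caller side, mirroring the "&& bt->machdep" test in the x86_64.c
 * patch: only use the note-based registers when a note exists.
 * Returns 1 if registers are available for 'cpu', 0 otherwise.
 */
static int have_note_regs(const struct dump_data *dd, int cpu)
{
    void *machdep = get_prstatus_percpu(dd, cpu);

    return machdep != NULL;
}
```

With a single note recorded, cpu 0 gets its registers from the note and
every other cpu falls through to whatever fallback path the caller has,
instead of dereferencing a stray pointer.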
Dave