[Crash-utility] Re:[RFC] Crash patch for DWARF CFI based unwind support

Rachita Kothiyal rachita at in.ibm.com
Mon Oct 23 11:00:49 UTC 2006


On Thu, Oct 19, 2006 at 05:15:32PM -0400, Dave Anderson wrote:
> 
> > There still are a couple of things which need to be done, viz
> > 1. Extend to obtaining unwind info from modules as well (currently
> >    doing so only for the kernel).
> > 2. Currently reading the unwind info from the .eh_frame section only
> >    (i.e. __start_unwind to __end_unwind). Need to add a facility to
> >    read from .debug_frame when .debug_frame is present and .eh_frame
> >    is absent. We will have to read from the vmlinux if we want to
> >    read the .debug_frame info.
> 
> Hi Rachita,
> 
> I hope to be able to come up with a new crash version
> for you to continue working with by tomorrow, Monday at
> the latest.
> 
> Off the top of my head, here's what I've done with your
> initial patch:
> 
> 1. As Ben mentioned, it needs to be made compilable for
>    other architectures.
> 2. Renamed unwind_x86_64.c into unwind_x86_32_64.c,
>    because the unwind code should be architecture
>    neutral with respect to x86 and x86_64.  It's currently
>    #ifdef'd to only be compiled if X86_64, but when a
>    new "unwind_x86.h" file is ready to go, it can be
>    made usable by both arches.
> 3. Made it capable of reading .eh_frame data from the
>    vmlinux file if it is not in memory.
> 4. Made it capable of reading all of the module's unwind
>    tables.
> 5. Restored the unwind() function to reflect the kernel
>    version in that it now uses a new find_table() routine,
>    which returns a pointer to the local copy of the unwind
>    table that contains the incoming pc.
> 6. Cleaned up a bunch of cruft...
>

Hi Dave

On the panic task, when we do the following:

   set unwind on
   bt
   set unwind off
   bt

This last bt does not give us the same backtrace as the one we get when
crash first starts up (i.e. with unwind off by default). What happens is
this: when unwind is set to on and we do a 'bt', we go to
get_netdump_regs_x86_64() to get rsp and rip, where
ASSIGN_SIZE(user_regs_struct) happens, thereby setting
VALID_STRUCT(user_regs_struct) to 1. When we next do 'set unwind off' and
'bt', we satisfy the following if condition in get_netdump_regs_x86_64()
because VALID_STRUCT(user_regs_struct) is still set:

 if (((NETDUMP_DUMPFILE() || KDUMP_DUMPFILE()) &&
          VALID_STRUCT(user_regs_struct) && (bt->task == tt->panic_task)) ||
          (KDUMP_DUMPFILE() && (kt->flags & DWARF_UNWIND) &&
          (bt->flags & BT_DUMPFILE_SEARCH))) {

So this results in it reading the register values from the NT_PRSTATUS.
Hence the backtrace looks different from what we get from the existing
non-dwarf mechanism.

To avoid this, we could use a local variable for the user_regs_struct size
instead of changing things at the global scope with ASSIGN_SIZE(), or we
could invalidate the user_regs_struct size again before we leave
get_netdump_regs_x86_64().
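
To make the side effect and the two options concrete, here is a small
standalone model. It is only a sketch and not crash code: user_regs_size,
dwarf_unwind, PLACEHOLDER_SIZE, choose_source() and the get_regs_*()
functions are all hypothetical names that stand in for the size table
entry, the DWARF_UNWIND flag and get_netdump_regs_x86_64(). The if
condition is reduced to the panic-task case on a kdump dumpfile.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for crash's global state; not the real definitions. */
static long user_regs_size = -1;     /* -1 models "size not initialised" */
static bool dwarf_unwind = false;    /* models 'set unwind on' / 'set unwind off' */

#define PLACEHOLDER_SIZE 216L        /* illustrative value only */

enum regs_source { FROM_STACK_SEARCH, FROM_PRSTATUS };

/* Simplified form of the if condition for the panic task on a kdump dumpfile. */
static enum regs_source choose_source(void)
{
    return (user_regs_size >= 0 || dwarf_unwind) ? FROM_PRSTATUS
                                                 : FROM_STACK_SEARCH;
}

/* Behaviour described above: the size is assigned globally and stays valid. */
static enum regs_source get_regs_global(void)
{
    if (dwarf_unwind)
        user_regs_size = PLACEHOLDER_SIZE;   /* models ASSIGN_SIZE() */
    return choose_source();
}

/* Option 1: keep the size in a local variable; global state is untouched. */
static enum regs_source get_regs_local(void)
{
    long local_size = dwarf_unwind ? PLACEHOLDER_SIZE : -1;

    (void)local_size;   /* a real version would use it to read the registers */
    return choose_source();
}

/* Option 2: assign globally, but undo our own assignment before returning. */
static enum regs_source get_regs_invalidate(void)
{
    bool assigned_here = false;
    enum regs_source src;

    if (dwarf_unwind && user_regs_size < 0) {
        user_regs_size = PLACEHOLDER_SIZE;
        assigned_here = true;
    }
    src = choose_source();
    if (assigned_here)
        user_regs_size = -1;                 /* invalidate again on the way out */
    return src;
}
```

Calling get_regs_global() with unwind off, then on, then off again ends
up on the NT_PRSTATUS path the third time, while the other two variants
fall back to the stack search once unwind is off.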

Or, if it is desired that registers be read for the panic task from the 
NT_PRSTATUS section in the normal non-dwarf backtrace mechanism (which 
currently does not work as expected because of the user_regs_struct 
initialisation problem in x86_64), then probably it will have to be fixed
some other way.

Thanks
Rachita



