[Crash-utility] Re: crash enhancements proposal

Maneesh Soni maneesh at in.ibm.com
Mon May 8 09:26:07 UTC 2006


On Fri, May 05, 2006 at 10:03:54AM -0400, Dave Anderson wrote:
> Maneesh Soni wrote:
> 
> > Hi Dave,
> >
> > Following is a list of a few proposed improvements to crash utility though
> > for most of the items there are no names associated.
> >
> > Please let us know if these look useful or not. And if found appropriate
> > would it be possible for you to merge these with the crash todo list.
> >
> > Thanks to Badari Pulavarty, Richard Moore and Vara Prasad for the inputs.
> >
> > Regards
> > Maneesh
> >
> > --------------------------------------------------------------------------------
> > DESCRIPTION:
> >    clean & correct stack back traces on platforms ALL the time.
> >        - x86_64 (currently wrong and needs fixing)
> >        - frame pointers off ? (on x86 we still don't have frame pointers on)
> >
> > RESOLUTION STATUS: Work-in-progress by Rachita Kothiyal <rachita at in.ibm.com>
> 
> Certainly a welcome task.  I suggest segregating the code in a separate
> file (as done with lkcd_x86_trace.c), and the new entry point can simply
> be plugged into machdep->back_trace function pointer at init time.
> There should also be an "out" to allow it to be set back to use the
> current x86_64_low_budget_back_trace_cmd().  Also, if it doesn't
> support -fomit-frame-pointer, it's not worth doing.
> 
OK, thanks for the suggestion. I have added it to the modified list
appended below and will ask Rachita to keep it in mind.
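
A minimal sketch of the plug-in arrangement suggested above, assuming a
simplified stand-in for the machdep table. The struct names, the two stub
functions and select_back_trace() are hypothetical and are not crash's
real definitions; only the idea of assigning the back_trace function
pointer at init time, with an "out" back to the current routine, comes
from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; crash's real machdep table and bt data differ. */
struct bt_request {
    unsigned long stack_pointer;
    int unwinder_used;          /* 1 = low-budget, 2 = new unwinder */
};

typedef void (*back_trace_fn)(struct bt_request *);

struct machdep_sketch {
    back_trace_fn back_trace;   /* entry point chosen at init time */
};

/* Stub standing in for the current x86_64_low_budget_back_trace_cmd(). */
static void low_budget_back_trace(struct bt_request *bt)
{
    bt->unwinder_used = 1;
}

/* Stub standing in for the new unwinder kept in its own source file. */
static void new_unwind_back_trace(struct bt_request *bt)
{
    bt->unwinder_used = 2;
}

/* Called at init time; use_new_unwinder == 0 is the "out" that restores
 * the existing low-budget routine. */
static void select_back_trace(struct machdep_sketch *md, int use_new_unwinder)
{
    assert(md != NULL);
    md->back_trace = use_new_unwinder ? new_unwind_back_trace
                                      : low_budget_back_trace;
}
```

Callers keep going through md->back_trace, so switching unwinders does
not touch the generic bt code.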

> >
> > --------------------------------------------------------------------------------
> >
> > DESCRIPTION:
> >     Code restructuring:
> >     - move as much code for advanced commands to libraries so that
> >       crash is at least able to open the dump image and perform minimal
> >       set of commands like bt, dump dmesg log, disassemble etc. irrespective
> >       of kernel version.
> >     - code is hard to read & understand - need to re-write some of the
> >       basic subsystems like memory mapping, pagetable management etc
> >
> > RESOLUTION STATUS:
> >         Work-in-progress by Dave Wilder <dwilder at us.ibm.com> and
> >         Maneesh Soni <maneesh at in.ibm.com>
> >
> 
> I don't quite understand how moving code to libraries is going to
> achieve the goal here.  Things in some of the various *_init() functions
> could certainly be streamlined (or skipped) in order to make it more
> likely to make it to the first prompt.  For example, the task table initialization
> could be made to simply fill in the context data for just the panic task.
> (But it almost sounds like you just want to use gdb alone for the minimal
> set of commands you've listed?)
> 

The main aim is for crash to at least make it to the first prompt. For the
advanced commands, we can either postpone the relevant *_init() functions
until their first invocation or keep them in libraries.
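
A rough sketch of the "postpone until first invocation" option. All names
here (lazy_subsystem, expensive_init, advanced_cmd) are hypothetical and
only illustrate the pattern; they are not existing crash functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-subsystem state; not crash's real data structures. */
struct lazy_subsystem {
    int initialized;                        /* set once init has run */
    int init_calls;                         /* how many times init ran */
    void (*init)(struct lazy_subsystem *);  /* deferred *_init() routine */
};

/* Stand-in for an expensive *_init() that may fail on an odd kernel. */
static void expensive_init(struct lazy_subsystem *s)
{
    s->init_calls++;
}

/* Run the deferred init only the first time it is needed. */
static void ensure_initialized(struct lazy_subsystem *s)
{
    assert(s != NULL && s->init != NULL);
    if (!s->initialized) {
        s->init(s);
        s->initialized = 1;
    }
}

/* An "advanced" command pays the init cost on its first invocation,
 * so startup does not depend on this subsystem initializing cleanly. */
static int advanced_cmd(struct lazy_subsystem *s)
{
    ensure_initialized(s);
    return s->init_calls;
}
```

With this shape, a failure in one subsystem's init would only affect the
commands that need it, not the path to the first prompt.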

> As far as "re-writes" are concerned, please keep in mind the
> necessity of backwards-compatibility.  I'd much rather keep the current
> code -- that's known to work -- in place, and if you come up with
> something new, or re-shuffled, make it only callable when the kernel
> is of a known kernel version or later.
> 
> The point is, let's not just re-invent the wheel just for purpose of
> re-inventing the wheel.
> 
Agreed, backward compatibility should be maintained.
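
One way the version gating could look, as a sketch only. The KVER()
encoding, the two translate stubs and pick_translate() are assumptions
made up for illustration; the existing code path stays the default and
the reworked one is selected only for a known kernel version or later.

```c
#include <assert.h>

/* Hypothetical version encoding: major, minor, patch packed in one word. */
#define KVER(x, y, z) \
    (((unsigned)(x) << 16) + ((unsigned)(y) << 8) + (unsigned)(z))

typedef int (*translate_fn)(unsigned long vaddr);

/* Stand-in for the existing, known-to-work code path. */
static int legacy_translate(unsigned long vaddr)
{
    (void)vaddr;
    return 1;
}

/* Stand-in for a rewritten or re-shuffled code path. */
static int reworked_translate(unsigned long vaddr)
{
    (void)vaddr;
    return 2;
}

/* Use the reworked path only when the dump's kernel is at or beyond the
 * first version it is known to handle; otherwise keep the legacy one. */
static translate_fn pick_translate(unsigned running, unsigned min_for_new)
{
    return running >= min_for_new ? reworked_translate : legacy_translate;
}
```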

> 
> >
> > --------------------------------------------------------------------------------
> >
> > DESCRIPTION:
> >     Crash & kernel version independence:
> >     kernel headers & code - reuse ? It would be nice to figure
> >     out a way to include kernel headers and sections of kernel code
> >     to do hard stuff (like memory mapping functions page_to_pfn,
> >     pfn_to_page, pagetable decoding etc..).
> >
> > RESOLUTION STATUS:
> >         Work-in-progress by Dave Wilder <dwilder at us.ibm.com> and
> >         Maneesh Soni <maneesh at in.ibm.com>
> >
> 
> I don't particularly like this suggestion.  (I thought we just went through
> a problem where Ubuntu kernels don't even have kernel headers?)
> 
> As far as code reuse, we already do that in a number of places, so
> I guess that's OK.
> 
> And there just never seems to be a "one-size-fits-all" set of
> kernel functions/macros that covers all bases over the life of
> the kernel and each processor type.
> 
> But as always, I'm open to suggestion.
>
Actually, the final form of the solution is still not decided. It could be
kernel headers or some binary (library). There have also been some patches
for "make headers_install", which need investigation to see whether they
can be of some help here.

<snip>
>
> > DESCRIPTION:
> >    per-cpu info (like stack traces)
> >
> > RESOLUTION STATUS: TBD
> 
> Needs more of a description...
> 
Badari, I guess you meant a command to dump all per-cpu data for each cpu,
or only some specific data? Please correct me here.

> >
> > --------------------------------------------------------------------------------
> > DESCRIPTION:
> >      User space enhancements
> >      - show user space stack backtrace, if present in the dump file,
> >      - ability to link user space namelist (debug object files),
> >
> > RESOLUTION STATUS: TBD
> >
> 
> I thought crash was a kernel [crash/live-system] analyzer?
> 
> You currently can add user-space debug data with "add-symbol-file",
> which loads the debug data and symbols into gdb.  I have done this
> kind of thing, but it's been an "almost-never" kind of situation, where
> I've wanted to display a user program's data structure.
> 
> But if you want to start throwing in this kind of user-space stuff,
> please just keep it segregated.

OK. I hope keeping all such commands in a separate library would not be
objectionable.


> >
> > --------------------------------------------------------------------------------
> >
> > DESCRIPTION:
> >     Platform specific enhancements
> 
> >     - Establish CPU registers at the time of exceptions in the current context
> >     - Ability to handle CPU registers from current context using symbols in
> >       expressions
> >     - Ability to format basic processor structures like LDT, GDT, task gates
> >       for x86 arch
> >
> 
> Not clear on what "establishing" CPU registers means.   We already
> dump exception frames.
> 
> I guess you mean to be able to use a register connotation in certain
> commands, as opposed to the address contained in the register?
Right. We can have commands that dump specific register contents, or that
accept registers as arguments.

> That's potentially messy, because it puts processor-specific stuff
> in processor-neutral code.
>
We can probably use extension libraries for such commands to reduce the
clutter.
 
> As far as the LDT, GDT, task gates formatting, that's fine.


> >
> > RESOLUTION STATUS: TBD
> >
> > --------------------------------------------------------------------------------
> >
> > DESCRIPTION:
> >      cross architecture support for crash
> >
> > RESOLUTION STATUS: TBD
> 
> No way -- we've been through this before.  It is essentially a complete re-write.
> 
> If you want this, make a new command entirely.
> 
> 
OK, but this seems to be high priority for some and low priority for
others. In any case, it should be done in an acceptable way.

> >
> >
> > --------------------------------------------------------------------------------
> 
> I've made my personal feelings on these kinds of things known before,
> which is to take a "minimalist" approach.  Every new bell and whistle
> is virtually guaranteed to break as the kernel churns.  And they all
> require an additional support burden.  If I had my druthers, crash
> would have less rather than more at this point.
> 
> But I understand that this has become a community project, and
> with the few exceptions above, I'm open to all patch suggestions.
> 

Thanks. I have appended the modified list below, keeping your suggestions
in mind.

Maneesh



--------------------------------------------------------------------------------
DESCRIPTION:
	clean & correct stack back traces on platforms ALL the time.
	- x86_64 (currently wrong and needs fixing)
	- segregate the code in a separate file (as done with lkcd_x86_trace.c),
	  and the new entry point can simply be plugged into machdep->back_trace 
	  function pointer at init time.
	- There should also be an "out" to allow it to be set back to use the
	  current x86_64_low_budget_back_trace_cmd().  Also, if it doesn't
	  support -fomit-frame-pointer, it's not worth doing.

RESOLUTION STATUS: Work-in-progress by Rachita Kothiyal <rachita at in.ibm.com>

--------------------------------------------------------------------------------

DESCRIPTION:
	Code restructuring:
	- streamline the *_init() functions so that crash is at least
	  able to open the dump image and perform a minimal set of commands
	  like bt, dump dmesg log, disassemble etc. irrespective of kernel
	  version.
	- code is hard to read & understand - need to re-write some of the
	  basic subsystems like memory mapping, pagetable management etc.
	  while maintaining backward compatibility.

RESOLUTION STATUS: 
	Work-in-progress by Dave Wilder <dwilder at us.ibm.com> and 
	Maneesh Soni <maneesh at in.ibm.com>

--------------------------------------------------------------------------------

DESCRIPTION:
	Crash & kernel version independence:
	- kernel headers & code - reuse ? It would be nice to figure
	  out a way to include kernel headers and sections of kernel code
	  to do hard stuff (like memory mapping functions page_to_pfn,
	  pfn_to_page, pagetable decoding etc..).

RESOLUTION STATUS: 
	Work-in-progress by Dave Wilder <dwilder at us.ibm.com> and 
	Maneesh Soni <maneesh at in.ibm.com>

--------------------------------------------------------------------------------

DESCRIPTION:
	Mini report: 
	- The goal of this is to produce a summary report of common information
	  that is used to track problems. For many problems we probably don't
	  need to get the whole dump shipped, and these huge dump files are
	  not easy to ship and store.

RESOLUTION STATUS: TBD

--------------------------------------------------------------------------------

DESCRIPTION:
	Automatic verification of the dump:
	- When you get a dump to look at a problem, there are a few common
	  tasks one performs; the idea here is to automate those tasks and
	  provide a simple interface in the tool. Another possibility is
	  automatic verification of important data structures. For example,
	  if the task list says there are 30 tasks, this feature would
	  automatically walk the list and count the entries; if 30 entries
	  are not found, this may give a clue to some kind of corruption.

RESOLUTION STATUS: TBD
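
As a rough illustration of the list check described above, here is a
sketch that walks a circular list and compares the count with the
expected value. It uses ordinary in-process pointers; in crash each
"next" pointer would be a kernel address read from the dump. The struct
and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal list node; stands in for a kernel list entry. */
struct node {
    struct node *next;
};

/* Walk a circular list from head and count the entries other than head.
 * Returns -1 if the walk hits a NULL link or sees more than max_entries
 * entries, either of which suggests corruption. */
static long count_circular_list(const struct node *head, long max_entries)
{
    const struct node *p;
    long n = 0;

    assert(head != NULL);
    for (p = head->next; p != head; p = p->next) {
        if (p == NULL || n >= max_entries)
            return -1;
        n++;
    }
    return n;
}

/* Non-zero when the list holds exactly the expected number of entries. */
static int list_matches_expected(const struct node *head, long expected)
{
    return count_circular_list(head, expected) == expected;
}
```

The max_entries bound keeps the walk from looping forever if a corrupted
link points back into the middle of the list.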

--------------------------------------------------------------------------------

DESCRIPTION:
	function arguments:
	- Display arguments in the stack trace. At present, this is not
	  supported on PPC64 and x86_64. On PPC64, the user can retrieve
	  arguments only for the top-level frame, from pt_regs. The user
	  can still dump the complete stack frame and read the arguments
	  from it, but that is a manual process and requires some
	  expertise with the stack frame layout.

RESOLUTION STATUS: TBD

--------------------------------------------------------------------------------

DESCRIPTION:
	local variables:
	- Facilitate display of local variables with stack frames.
	  Since we are using a debug vmlinux, we can find the locations
	  of local variables from the Dwarf2 data.

RESOLUTION STATUS: TBD

--------------------------------------------------------------------------------

DESCRIPTION:
	better assembly & source language, line# display in disassembly
	- interacting with gdb might help, since gdb has the associated
	  line number data for any text address, but there might be some
	  confusion depending upon the source of the text.

RESOLUTION STATUS: TBD

--------------------------------------------------------------------------------

DESCRIPTION:
	per-cpu info (like stack traces)
	- Display all or specific per-cpu data for all cpus or a specific cpu.

RESOLUTION STATUS: TBD

--------------------------------------------------------------------------------
DESCRIPTION:
	User space enhancements
	- show user space stack backtrace, if present in the dump file,
	- ability to link user space namelist (debug object files), 

RESOLUTION STATUS: TBD

--------------------------------------------------------------------------------

DESCRIPTION:
	Platform specific enhancements
	- Establish CPU registers at the time of exceptions in the current
	  context  
	- Ability to handle CPU registers from current context using symbols
	  in expressions
	- Ability to format basic processor structures like LDT, GDT, task
	  gates for x86 arch 

RESOLUTION STATUS: TBD

--------------------------------------------------------------------------------

DESCRIPTION:
	cross architecture support for crash 

RESOLUTION STATUS: TBD

--------------------------------------------------------------------------------


DESCRIPTION:
	scripting support 
	- integrate scripting support using a perl or python like language;
	  "Alicia" could be one example or the solution itself.

RESOLUTION STATUS: TBD



