[Crash-utility] [RFC] gcore subcommand: a process coredump feature

S.Iguchi iguchi.sg at ncos.nec.co.jp
Thu Aug 5 02:02:33 UTC 2010


Hi,

From: HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com>
Subject: Re: [Crash-utility] [RFC] gcore subcommand: a process coredump feature
Date: Tue, 03 Aug 2010 15:17:00 +0900 (Tokyo Standard Time)

> Hello Iguchi-san,
> 
> Thanks for your comments.
> 
> From: "S.Iguchi" <iguchi.sg at ncos.nec.co.jp>
> Subject: Re: [Crash-utility] [RFC] gcore subcommand: a process coredump feature
> Date: Tue, 03 Aug 2010 13:10:09 +0900 (JST)
> 
> > Hi, Hatayama-san
> > 
> > I have an extension with mostly the same purpose as your patch.
> > But your patch is great, because it supports the latest kernel and
> > also dump filter masking.
> > 
> > My current extension file is attached.
> > Yes, my code is quite buggy and ugly, and it handles the latest kernel
> > less well than yours.
> > (sigh ... I did not know about fill_vma_cache(), so I run "vm -p"
> > every time before dumping.)
> > 
> > BTW, I have some comments.
> > I'd like to add the features below to yours,
> > or if you add them yourself, I would be happy. :)
> > 
> > - support i386 
> > - support elf32 binary on x86-64 
> > - support old kernel (before 2.6.17)
> > 
> > As Dave said, if your patch is committed as an extension,
> > I could submit some patches to it.
> > 
> > How about this?
> 
> As I wrote in the first post, I plan to support RHEL4,
> RHEL5 and RHEL6 on i386, x86_64 and IA64, and the latest upstream
> kernel, too. The following table shows the kernel version that each
> of them corresponds to.
> 
>    RHEL4  RHEL5   RHEL6   upstream
>   ---------------------------------
>    2.6.9  2.6.18  2.6.32  2.6.35
> 
> So that should probably cover your first and third requests.
> 

Ugh, I didn't check RHEL4 ... sorry.
Thank you for your explanation.

> On the other hand, I have no plan to support ia32 emulation on
> either x86_64 or ia64.
> 

OK.
Supporting ia32 emulation on x86-64 is enough for me ...

If your extension is applied, I'll think about it.

Thanks.

Regards,

Seigo Iguchi

> > 
> > Best regards,
> > Seigo Iguchi
> > 
> > 
> > From: HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com>
> > Subject: [Crash-utility] [RFC] gcore subcommand: a process coredump feature
> > Date: Mon, 02 Aug 2010 18:00:02 +0900 (Tokyo Standard Time)
> > 
> >> Hello,
> >> 
> >> For some weeks I've been developing a gcore subcommand for the crash
> >> utility. It provides a process coredump feature for kernel crash
> >> dumps, which is strongly demanded by users who want to investigate
> >> user-space applications contained in a kernel crash dump.
> >> 
> >> I've now finished a prototype version of gcore and found out which
> >> issues most need to be addressed. Could you give me any
> >> comments and suggestions on this work?
> >> 
> >> 
> >> Motivation
> >> ==========
> >> 
> >> It's a relatively familiar technique that, in a cluster system, a
> >> running node triggers the kernel crash dump mechanism when it
> >> detects a critical error, so that the running, error-detecting
> >> server stops as soon as possible. Consequently, the resulting
> >> kernel crash dump contains a process image of the erroneous
> >> user application. In that case, developers are interested in user
> >> space rather than kernel space.
> >> 
> >> Another merit of gcore is that it allows us to use several
> >> userland debugging tools, such as GDB and binutils, in order to
> >> analyze user space memory.
> >> 
> >> 
> >> Current Status
> >> ==============
> >> 
> >> I have confirmed that the prototype version runs on the following
> >> configuration:
> >> 
> >>   Linux Kernel Version: 2.6.34
> >>   Supported Architecture: x86_64
> >>   Crash Version: 5.0.5
> >>   Dump Format: ELF
> >> 
> >> I'm planning to widen the range of support as follows:
> >> 
> >>   Linux Kernel Version: Any
> >>   Supported Architecture: i386, x86_64 and IA64
> >>   Dump Format: Any
> >> 
> >> 
> >> Issues
> >> ======
> >> 
> >> Currently, I have the issues below.
> >> 
> >> 1) Retrieval of appropriate register values
> >> 
> >> The prototype version retrieves register values from a _wrong_
> >> location: the top of the kernel stack, into which register values
> >> are saved at any preemptive context switch. The register values
> >> that should be included instead are the ones saved at the
> >> user-to-kernel context switch on any interrupt event.
> >> 
> >> I've yet to implement this. Specifically, I still need to do the
> >> following tasks.
> >> 
> >>   (1) list all entry points on the execution paths from user space
> >>   into kernel space.
> >> 
> >>   (2) classify the entry points according to where and how the
> >>   register values from the user-space context are saved.
> >> 
> >>   (3) write a program that retrieves the saved register values from
> >>   the appropriate locations identified by means of (1) and (2).
> >> 
> >> Ideally, I think it would be best if the crash library provided a
> >> means of retrieving this kind of register values, that is, ones
> >> saved on various stack frames. Is there any plan to do so?
> >> 
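
As a rough illustration of item (3) for one case only: on x86_64 kernels
of this era, the registers saved on entry from user space are commonly
found in a struct pt_regs at the high-address end of the task's kernel
stack, which is a different place from the current stack pointer. The
sketch below assumes an 8 KiB kernel stack and the 21-field x86_64
pt_regs layout. The names THREAD_SIZE, PT_REGS_FIELDS and read_user_regs
are illustrative and are not taken from gcore.patch, and entry paths that
save only a partial frame would need extra handling that is not shown.

```python
import struct

# Assumption: x86_64 struct pt_regs as 21 little-endian 64-bit fields,
# in the order used by 2.6-era kernels.
PT_REGS_FIELDS = (
    "r15", "r14", "r13", "r12", "bp", "bx",
    "r11", "r10", "r9", "r8", "ax", "cx", "dx", "si", "di",
    "orig_ax", "ip", "cs", "flags", "sp", "ss",
)
PT_REGS_SIZE = 8 * len(PT_REGS_FIELDS)  # 168 bytes

# Assumption: the kernel stack of a task is 8 KiB on the target kernel.
THREAD_SIZE = 8192


def user_pt_regs_offset(thread_size=THREAD_SIZE):
    """Offset of the user-mode pt_regs from the lowest stack address."""
    return thread_size - PT_REGS_SIZE


def read_user_regs(stack, thread_size=THREAD_SIZE):
    """Decode the user-mode registers from a raw copy of a kernel stack.

    `stack` is the whole stack area as bytes, lowest address first.
    """
    if len(stack) != thread_size:
        raise ValueError("expected exactly one kernel stack worth of bytes")
    fmt = "<%dQ" % len(PT_REGS_FIELDS)
    values = struct.unpack_from(fmt, stack, user_pt_regs_offset(thread_size))
    return dict(zip(PT_REGS_FIELDS, values))
```

In a real extension the stack bytes would come from the dump through
crash's own memory-reading facilities; here they are just a bytes object
so that the offset arithmetic can be checked in isolation.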
> >> 
> >> 2) Getting the signal number for a task that was in the middle of
> >> a core dump when the kernel crashed
> >> 
> >> If a target task is partway through a core dump, it's better to know
> >> the signal number in order to know why the task was about to be core
> >> dumped.
> >> 
> >> Unfortunately, I have no choice but to backtrace the kernel stack to
> >> retrieve the signal number saved there as an argument of, for
> >> example, do_coredump().
> >> 
> >> 
> >> 3) Kernel version compatibility
> >> 
> >> crash's policy is to support all kernel versions by the latest crash
> >> package. On the other hand, the prototype is based on kernel 2.6.34.
> >> This means more kernel versions need to be supported.
> >> 
> >> Well, the question is: which versions do I really need to test in
> >> addition to the latest upstream kernel? I think it's practically
> >> enough to support RHEL4, RHEL5 and RHEL6.
> >> 
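
The version question can be made concrete with a small sketch. It packs
a kernel version into one integer so that feature checks become plain
comparisons, and it lists the base versions named in this thread (2.6.9,
2.6.18 and 2.6.32 for RHEL4 to RHEL6, 2.6.35 for upstream). The helper
names and the TARGETS table are illustrative assumptions and are not
part of the patch.

```python
def kernel_version(major, minor, patch):
    """Pack a kernel version into one integer that compares in order."""
    return (major << 16) + (minor << 8) + patch


# Base kernel versions of the targets named in this thread.
TARGETS = {
    "RHEL4": (2, 6, 9),
    "RHEL5": (2, 6, 18),
    "RHEL6": (2, 6, 32),
    "upstream": (2, 6, 35),
}


def targets_older_than(major, minor, patch):
    """Return the names of targets whose base kernel predates the version."""
    limit = kernel_version(major, minor, patch)
    return sorted(
        name for name, version in TARGETS.items()
        if kernel_version(*version) < limit
    )
```

For example, targets_older_than(2, 6, 17) singles out RHEL4, the only
listed target below the 2.6.17 boundary raised in this thread.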
> >> 
> >> Build Instructions
> >> ==================
> >> 
> >>   $ tar xf crash-5.0.5.tar.gz
> >>   $ cd crash-5.0.5/
> >>   $ patch -p 1 < gcore.patch
> >>   $ make
> >> 
> >> 
> >> Usage
> >> =====
> >> 
> >> Use the help subcommand of the crash utility: ``help gcore''.
> >> 
> >> 
> >> Attached File
> >> =============
> >> 
> >>   * gcore.patch
> >> 
> >>     A patch implementing gcore subcommand for crash-5.0.5.
> >> 
> >>     The diffstat output is as follows.
> >> 
> >> $ diffstat gcore.patch
> >>  Makefile      |   10 +-
> >>  defs.h        |   15 +
> >>  gcore.c       | 1858 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  gcore.h       |  639 ++++++++++++++++++++
> >>  global_data.c |    3 +
> >>  help.c        |   28 +
> >>  netdump.c     |   27 +
> >>  tools.c       |   37 ++
> >>  8 files changed, 2615 insertions(+), 2 deletions(-)
> >> 
> >> --
> >> HATAYAMA Daisuke
> >> d.hatayama at jp.fujitsu.com
> 
