[Crash-utility] [ANNOUNCE][RFC] gcore extension module: user-mode process core dump

Dave Anderson anderson at redhat.com
Mon Jan 24 19:27:39 UTC 2011



----- Original Message -----
> The gcore extension module provides a means to create an ELF core dump for
> a user-mode process contained within a crash kernel dump. I designed
> it to behave like the kernel's ELF core dumper.
> 
> For previous discussion, see:
> https://www.redhat.com/archives/crash-utility/2010-August/msg00001.html

A few observations...

I'll fix unwind_x86_64.h to prevent this build warning:
  
  # make extensions
  ...
  gcc  -Wall -I.. -I./libgcore -fPIC -DX86_64 -c -o libgcore/gcore_x86.o libgcore/gcore_x86.c
  In file included from libgcore/gcore_x86.c:19:
  ../unwind_x86_64.h:61:1: warning: "offsetof" redefined
  In file included from libgcore/gcore_x86.c:17:
  ../defs.h:60:1: warning: this is the location of the previous definition
  ...

But the gcore.mk file should gracefully fail to build on non-supported
architectures.  It ends up spewing ~200 lines of error messages when
attempted, for example, on a ppc64 machine:
    
  # make extensions
  gcc -m64 -Wall -I.. -I./libgcore -fPIC -DPPC64 -c -o libgcore/gcore_coredump.o libgcore/gcore_coredump.c
  In file included from libgcore/gcore_coredump.c:17:
  ./libgcore/gcore_defs.h:355:1: warning: "ELF_NGREG" redefined
  In file included from /usr/include/asm/sigcontext.h:13,
                   from /usr/include/bits/sigcontext.h:28,
                   from /usr/include/signal.h:339,
                   from ../defs.h:38,
                   from libgcore/gcore_coredump.c:16:
  /usr/include/asm/elf.h:92:1: warning: this is the location of the previous definition
  In file included from libgcore/gcore_coredump.c:17:
  ./libgcore/gcore_defs.h:356: error: invalid application of ‘sizeof’ to incomplete type ‘struct user_regs_struct’ 
  ./libgcore/gcore_defs.h:356: error: conflicting types for ‘elf_gregset_t’
  /usr/include/asm/elf.h:124: note: previous declaration of ‘elf_gregset_t’ was here
  ./libgcore/gcore_defs.h:490: error: conflicting types for ‘__kernel_old_uid_t’
  /usr/include/asm/posix_types.h:28: note: previous declaration of ‘__kernel_old_uid_t’ was here
  ./libgcore/gcore_defs.h:491: error: conflicting types for ‘__kernel_old_gid_t’
  /usr/include/asm/posix_types.h:29: note: previous declaration of ‘__kernel_old_gid_t’ was here
  libgcore/gcore_coredump.c:25: error: expected ‘)’ before ‘*’ token
  libgcore/gcore_coredump.c:33: error: expected declaration specifiers or ‘...’ before ‘Elf_Ehdr’

  ... [ cut ] ...

  ./libgcore/gcore_defs.h:490: error: conflicting types for ‘__kernel_old_uid_t’
  /usr/include/asm/posix_types.h:28: note: previous declaration of ‘__kernel_old_uid_t’ was here
  ./libgcore/gcore_defs.h:491: error: conflicting types for ‘__kernel_old_gid_t’
  /usr/include/asm/posix_types.h:29: note: previous declaration of ‘__kernel_old_gid_t’ was here
  make[3]: [gcore.so] Error 1 (ignored)
  # 
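
One way to get a single line instead of ~200 would be to key everything
off the architecture macro the build already passes (-DX86_64, -DPPC64
above). What follows is only a rough sketch, not the posted code: I am
assuming X86 is the 32-bit counterpart of X86_64, and
GCORE_ARCH_SUPPORTED and gcore_arch_check() are made-up names. The
architecture-specific definitions would sit behind the same macro, and
gcore.mk would need an equivalent test of its own.

```c
#include <stdio.h>

/* The build passes the target as a -D flag (e.g. -DX86_64, -DPPC64).
 * X86 is assumed by analogy; GCORE_ARCH_SUPPORTED is a hypothetical name. */
#if defined(X86) || defined(X86_64)
#define GCORE_ARCH_SUPPORTED 1
#else
#define GCORE_ARCH_SUPPORTED 0
#endif

/* Hypothetical helper: returns 1 on a supported architecture; otherwise
 * prints a single line to 'out' and returns 0. */
static int gcore_arch_check(FILE *out)
{
    if (GCORE_ARCH_SUPPORTED)
        return 1;
    fprintf(out, "gcore: architecture not supported (X86 and X86_64 only)\n");
    return 0;
}
```

On a ppc64 build neither macro is defined, so the helper would print its
one line and return 0.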

Your documentation implies that the command would only work on 
certain kernel versions:

> Compared with the previous version, this release:
> - supports more kernel versions, and
> - collects register values more accurately (but still not perfect).
> 
> Support Range
> =============
> 
> |----------------+----------------------------------------------|
> | ARCH | X86, X86_64 |
> |----------------+----------------------------------------------|
> | Kernel Version | RHEL4.8, RHEL5.5, RHEL6.0 and Vanilla 2.6.36 |
> |----------------+----------------------------------------------|


But, for example, on a 2.6.34-2.fc14 kernel (presumably unsupported),
it seems to work OK on some tasks, but on others it doesn't work so well.
Here, the "less" command can be dumped OK:


  crash> sys | grep RELEASE
       RELEASE: 2.6.34-2.fc14.x86_64
  crash> ps
  ... [ cut ] ...
  >  2080   1490   0  ffff880079ed2480  RU   7.6  289900 159684  crash
     2084      1   0  ffff880077a7a480  IN   0.1  248592   1936  rsyslogd
     2090   2080   5  ffff880079ed4900  IN   0.0  105432    828  less
  crash> gcore -v0 2090
  Saved core.2090.less
  crash>

But with the same (full) 2.6.34-2.fc14 dumpfile, it can't seem to handle 
dumping the crash utility itself, and just hangs:

  crash> swap
  FILENAME           TYPE         SIZE      USED   PCT  PRIORITY
  /dev/dm-1        PARTITION    18579452k       0k   0%     -1
  crash> ps
  ... [ cut ] ...
  >  2080   1490   0  ffff880079ed2480  RU   7.6  289900 159684  crash
     2084      1   0  ffff880077a7a480  IN   0.1  248592   1936  rsyslogd
     2090   2080   5  ffff880079ed4900  IN   0.0  105432    828  less
  crash> gcore -v1 2080
  gcore: Restoring the thread group ... 
  gcore: done.
  gcore: Retrieving note information ... 
  
  < hangs forever >

  ...

I would have thought that it would either work-for-all or work-for-none
with respect to a particular kernel version?

In any case, if it's going to fail, perhaps there should be some mechanism
in place that would prevent it from hanging, and instead print a message 
that the kernel version is not supported?  Or if a particular data structure
is different than the "supported" versions, it should fail immediately?  
Just a thought...
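
To make the "fail immediately" idea concrete, here is a minimal sketch
of a pre-flight check. It assumes the module keeps a table of the kernel
structure members it depends on, with a negative offset meaning the
lookup against the dumpfile failed. The struct and function names are
hypothetical and nothing here is taken from the posted sources.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical record of one kernel structure member the dumper relies on.
 * offset < 0 means the member could not be resolved for this kernel. */
struct required_member {
    const char *name;
    long offset;
};

/* Returns 0 if every required member was resolved. Otherwise reports the
 * first missing one on 'out' and returns -1, so the caller can give up
 * before any dumping starts. */
static int check_required_members(const struct required_member *tbl,
                                  size_t n, FILE *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (tbl[i].offset < 0) {
            fprintf(out, "gcore: %s not found: kernel version not supported\n",
                    tbl[i].name);
            return -1;
        }
    }
    return 0;
}
```

A check like this only covers missing members. A hang caused by walking
a structure whose layout changed would still need a bound on the loop
itself.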

Also I note that "gcore -v7" fails -- shouldn't it be accepted as an argument?

  crash> gcore -v7 2080
  gcore: invalid vlevel: 7.
  crash>
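
Going by the -v table in the help message, the levels look like bit
flags (1 = progress, 2 = library error, 4 = page fault), so I would
expect any OR of them to be valid, 7 included. A sketch of the check I
have in mind follows. The macro and function names are mine, not the
module's.

```c
/* Bit values as listed in the -v table of the help message.
 * The macro names are hypothetical. */
#define VL_PROGRESS   0x1UL
#define VL_LIBRARY    0x2UL
#define VL_PAGEFAULT  0x4UL
#define VL_ALL        (VL_PROGRESS | VL_LIBRARY | VL_PAGEFAULT)

/* Returns 1 if 'vlevel' uses only the three known bits, else 0. */
static int vlevel_is_valid(unsigned long vlevel)
{
    return (vlevel & ~VL_ALL) == 0;
}
```

With a mask test like this, 0 through 7 are all accepted and 8 is the
first value rejected.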

Thanks,
  Dave

 
> TODO
> ====
> 
> I still have remaining tasks to do:
> - Improvement on register collection for active tasks
> - Improvement on callee-saved register collection on x86_64
> - Support core dump for tasks running in x86_32 compatibility mode
> 
> Usage
> =====
> 
> 1) Expand the source files under the extensions directory.
> 
> Arrange the attached source files as shown below:
> 
> ./extensions/gcore.c
> ./extensions/gcore.mk
> ./extensions/libgcore/gcore_coredump.c
> ./extensions/libgcore/gcore_coredump_table.c
> ./extensions/libgcore/gcore_defs.h
> ./extensions/libgcore/gcore_dumpfilter.c
> ./extensions/libgcore/gcore_global_data.c
> ./extensions/libgcore/gcore_regset.c
> ./extensions/libgcore/gcore_verbose.c
> ./extensions/libgcore/gcore_x86.c
> 
> 2) Type ``make extensions''; ``gcore.so'' is then generated under the
> extensions directory.
> 
> 3) Type ``extend gcore.so'' to load the gcore extension module.
> 
> See the help message for actual usage: I attach the help message at
> the end of this mail.
> 
> 4) Type ``extend -u gcore.so'' to unload the gcore extension module.
> 
> Help Message
> ============
> 
> NAME
> gcore - retrieve a process image as a core dump
> 
> SYNOPSIS
> gcore
> gcore [-v vlevel] [-f filter] [pid | taskp]*
> This command retrieves a process image as a core dump.
> 
> DESCRIPTION
> 
> -v Display verbose information according to vlevel:
> 
>     progress  library error  page fault
>     -----------------------------------
>   0
>   1    x
>   2                 x
>   4                               x  (default)
>   7    x            x             x
> 
> -f Specify the kinds of memory to be written into core dumps, as a
> bitwise OR of the filter flags:
> 
>      AP  AS  FP  FS  ELF HP  HS
> -------------------------------
>   0
>   1  x
>   2      x
>   4          x
>   8              x
>  16          x       x
>  32                      x
>  64                          x
> 127  x   x   x   x   x   x   x
> 
> AP Anonymous Private Memory
> AS Anonymous Shared Memory
> FP File-Backed Private Memory
> FS File-Backed Shared Memory
> ELF ELF header pages in file-backed private memory areas
> HP Hugetlb Private Memory
> HS Hugetlb Shared Memory
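
For reference, this is how I read the -f table: each memory kind is one
bit, so a value such as 21 (16 + 4 + 1) selects ELF, FP and AP. A small
decoder sketch under that reading follows. The table layout and the name
filter_describe() are hypothetical, not from the module.

```c
#include <stdio.h>
#include <stddef.h>

/* Bit values and abbreviations as listed in the -f table above. */
static const struct {
    unsigned int bit;
    const char *name;
} filter_bits[] = {
    { 1, "AP" }, { 2, "AS" }, { 4, "FP" }, { 8, "FS" },
    { 16, "ELF" }, { 32, "HP" }, { 64, "HS" },
};

/* Writes the space-separated names of the bits set in 'filter' into 'buf'.
 * Returns 0 on success, -1 if 'filter' has bits above 127 or 'buf' is
 * too small. */
static int filter_describe(unsigned int filter, char *buf, size_t len)
{
    size_t i, used = 0;

    if ((filter & ~127u) != 0 || len == 0)
        return -1;
    buf[0] = '\0';
    for (i = 0; i < sizeof(filter_bits) / sizeof(filter_bits[0]); i++) {
        int n;

        if (!(filter & filter_bits[i].bit))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", filter_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Calling filter_describe(21, buf, sizeof(buf)) leaves "AP FP ELF" in buf.
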
> 
> If no pid or taskp is specified, gcore tries to retrieve the process
> image of the current task context.
> 
> The file name of a generated core dump is core.<pid>, where <pid> is
> the PID of the specified process.
> 
> For a multi-threaded process, gcore generates a core dump containing
> information for all threads, similar to the behaviour of the ELF core
> dumper in the Linux kernel.
> 
> Note that PID means different things in crash and in Linux: the ps
> command in the crash utility displays the LWP, while the ps command in
> Linux displays the thread group ID, that is, the PID of the thread
> group leader.
> 
> gcore provides a core dump filtering facility that lets users select
> which kinds of memory maps are included in the resulting core dump.
> There are 7 kinds of memory maps in total, and you can set them up with
> the set command. For more detailed information, please see the help
> command message.
> 
> EXAMPLES
> Specify the process you want to retrieve as a core dump. Here, assume
> the process has PID 12345.
> 
> crash> gcore 12345
> Saved core.12345
> crash>
> 
> Next, specify by TASK. Here, assume the process is located at address
> f9d78000 and has PID 32323.
> 
> crash> gcore f9d78000
> Saved core.32323
> crash>
> 
> If multiple arguments are given, gcore dumps the processes in the
> order the arguments are given.
> 
> crash> gcore 5217 ffff880136d72040 23299 24459 ffff880136420040
> Saved core.5217
> Saved core.1130
> Saved core.1130
> Saved core.24459
> Saved core.30102
> crash>
> 
> If no argument is given, gcore tries to retrieve the process of the
> current task context.
> 
> crash> set
> PID: 54321
> COMMAND: "bash"
> TASK: e0000040f80c0000
> CPU: 0
> STATE: TASK_INTERRUPTIBLE
> crash> gcore
> Saved core.54321
> 
> When a multi-threaded process is specified, the generated core file
> name has the thread group leader's PID; here it is assumed to be 12340.
> 
> crash> gcore 12345
> Saved core.12340
> 
> Specifying the same option twice is not allowed.
> 
> crash> gcore -v 1 1234 -v 1
> Usage: gcore
> gcore [-v vlevel] [-f filter] [pid | taskp]*
> gcore -d
> Enter "help gcore" for details.
> 
> The -v and -f options may appear in any order.
> 
> crash> gcore -v 2 5201 -f 21 ffff880126ff9520 5205
> Saved core.5174
> Saved core.5217
> Saved core.5167
> crash> gcore 5201 ffff880126ff9520 -f 21 5205 -v 2
> Saved core.5174
> Saved core.5217
> Saved core.5167
> 
> Signed-off-by: HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com>
> 
> 
> [Text File:gcore.c]
> 
> 
> [Text File:gcore.mk]
> 
> 
> [Text File:gcore_coredump.c]
> 
> 
> [Text File:gcore_coredump_table.c]
> 
> 
> [Text File:gcore_defs.h]
> 
> 
> [Text File:gcore_dumpfilter.c]
> 
> 
> [Text File:gcore_global_data.c]
> 
> 
> [Text File:gcore_regset.c]
> 
> 
> [Text File:gcore_verbose.c]
> 
> 
> [Text File:gcore_x86.c]



