[Crash-utility] crash fails to load compressed RHEL5 vmcore

Dave Anderson anderson at redhat.com
Fri Jun 25 20:22:16 UTC 2010


----- "marc pascual" <marc.m.pascual at gmail.com> wrote:

> Hello,
> 
> I have an issue with compressed RHEL5 vmcore files: I can't get them
> to load in the crash utility. I don't have this problem with RHEL4 vmcores (diskdump).
> The debuginfo kernel exactly matches the kernel version that generated
> the vmcore.
> 
> [root at test-fc12 ~]# crash usr/lib/debug/lib/modules/2.6.18-92.1.10.el5debug/vmlinux /nas01/cores/2.6.18-92.1.10.el5.vmcore
> 
> crash 4.0.9-2.fc12
> Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for
> details.
> 
> GNU gdb 6.1
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
> 
> crash: page excluded: kernel virtual address: ffffffff804f1260 type: "possible"
> WARNING: cannot read cpu_possible_map
> crash: usr/lib/debug/lib/modules/2.6.18-92.1.10.el5debug/vmlinux and
> /nas01/cores/2.6.18-92.1.10.el5.vmcore do not match!
> 
> Usage:
> crash [-h [opt]][-v][-s][-i file][-d num] [-S] [mapfile] [namelist]
> [dumpfile]
> 
> Enter "crash -h" for details.
> 
> [root at test-fc12 ~]# strings usr/lib/debug/lib/modules/2.6.18-92.1.10.el5debug/vmlinux | grep 2.6 | head -2
> Linux version 2.6.18-92.1.10.el5debug (brewbuilder at ls20-bc2-13.build.redhat.com ) (gcc version 4.1.2 20071124 (Red Hat 4.1.2-42)) #1 SMP Wed Jul 23 04:27:38 EDT 2008
> 
> Running strings on the RHEL5 vmcore file:
> 
> [root at test-fc12 cores]# strings 2.6.18-92.1.10.el5.vmcore | grep 2.6 | head -2
> 2X6
> 2T6(

If the dumpfile was compressed (and with "makedumpfile -c" it is), then unfortunately
running strings to look for the "Linux version" string won't help.  More recent versions
of makedumpfile may store the utsname data in the compressed kdump header, so if you enter this:

  # crash -d1 vmlinux vmcore
  ...

then the dumpfile header will be immediately dumped, and the utsname has most of
the relevant data from the "Linux version" string:

  diskdump_data: 
          filename: (null)
             flags: 6 (KDUMP_CMPRS_LOCAL|ERROR_EXCLUDED)
               dfd: 3
               ofp: 0
      machine_type: 62 (EM_X86_64)

            header: 19c42fe0
           signature: "KDUMP   "
      header_version: 1
             utsname:
               sysname: Linux
              nodename: hp-dl585g2-01.rhts.bos.redhat.com
               release: 2.6.18-164.el5
               version: #1 SMP Tue Aug 18 15:51:48 EDT 2009
               machine: x86_64
            domainname: (none)

But since this was fixed only fairly recently in makedumpfile, your dump may show
a bunch of (null) strings for the utsname data.
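
If the -d1 output has been saved to a file, the release can be pulled out of
the header dump mechanically. The following is only a minimal sketch: the
function name utsname_release is hypothetical, and it assumes the header is
printed in the layout shown above, with a literal "(null)" when makedumpfile
did not record the utsname data.

```python
import re

def utsname_release(header_dump):
    """Return the 'release:' value from a header dump, or None if absent."""
    match = re.search(r"^[ \t]*release:[ \t]*(\S+)[ \t]*$",
                      header_dump, re.MULTILINE)
    # Assumption: an unrecorded field is printed as the string "(null)".
    if match is None or match.group(1) == "(null)":
        return None
    return match.group(1)

sample = """\
             utsname:
               sysname: Linux
              nodename: hp-dl585g2-01.rhts.bos.redhat.com
               release: 2.6.18-164.el5
               version: #1 SMP Tue Aug 18 15:51:48 EDT 2009
               machine: x86_64
"""
print(utsname_release(sample))
```

A None result means the header cannot settle the question, and the kernel
release has to be established some other way.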

One thing I notice is that the vmlinux file is "2.6.18-92.1.10.el5debug",
and although you haven't shown exactly which kernel the crashed system
was running, you've named the dumpfile "2.6.18-92.1.10.el5.vmcore".  If the
crashed system was running 2.6.18-92.1.10.el5, then you're using the wrong
vmlinux file, and you should be using:

  /usr/lib/debug/lib/modules/2.6.18-92.1.10.el5/vmlinux

The 2.6.18-92.1.10.el5 and 2.6.18-92.1.10.el5debug kernels are completely
different kernels.
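
The comparison can be made explicit with a short sketch. The helper names
banner_release and expected_vmlinux are hypothetical, the banner is shortened
to its first two words plus the release, and the crashed release is only the
one presumed from the vmcore file name, not a confirmed value.

```python
import re

def banner_release(banner):
    """Extract the release from a 'Linux version ...' banner, or None."""
    match = re.match(r"Linux version (\S+)", banner)
    return match.group(1) if match else None

def expected_vmlinux(release):
    """Debuginfo vmlinux path for a release, following the path given above."""
    return "/usr/lib/debug/lib/modules/%s/vmlinux" % release

vmlinux_banner = "Linux version 2.6.18-92.1.10.el5debug"
crashed_release = "2.6.18-92.1.10.el5"   # presumed from the vmcore file name

# The releases must match exactly; a "debug" suffix means a different kernel.
print(banner_release(vmlinux_banner) == crashed_release)
print(expected_vmlinux(crashed_release))
```

The exact string comparison is deliberate: a prefix match would wrongly treat
the el5 and el5debug kernels as the same.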

If we presume that the crashed kernel and the vmlinux are *both* the "debug"
variety, then if you extend the debug data output by entering -d4, you'll see
every read attempt made from the dumpfile:

  # crash -d4 vmlinux vmcore
  ... [ snip ] ...
  GNU gdb (GDB) 7.0
  Copyright (C) 2009 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
  and "show warranty" for details.
  This GDB was configured as "x86_64-unknown-linux-gnu"...

  <readmem: ffffffff80447280, KVADDR, "possible", 32, (ROE), c8fce0>
  cpu_possible_map: 0 1 2 3 4 5 6 7
  <readmem: ffffffff803ed5a0, KVADDR, "present", 32, (ROE), c8fce0>
  cpu_present_map: 0 1 2 3 4 5 6 7
  <readmem: ffffffff803e8260, KVADDR, "online", 32, (ROE), c8fce0>
  cpu_online_map: 0 1 2 3 4 5 6 7
  <readmem: ffffffff803ef200, KVADDR, "xtime", 16, (FOE), b50130>
  <readmem: ffffffff80301320, KVADDR, "system_utsname", 390, (ROE), b5071c>
  ...
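
With a long -d4 log it can be easier to list the read attempts with a script
than to scroll for them. This is a rough sketch that assumes the trace lines
keep the shape shown above; the name readmem_calls is hypothetical, and
reading the number after the quoted label as a size is my interpretation of
the output rather than something documented here.

```python
import re

# Matches the start of a trace line such as:
#   <readmem: ffffffff80447280, KVADDR, "possible", 32, (ROE), c8fce0>
READMEM = re.compile(r'<readmem: ([0-9a-f]+), \w+, "([^"]*)", (\d+),')

def readmem_calls(trace):
    """Return (address, label, size) for each readmem line, in log order."""
    return [(int(addr, 16), label, int(size))
            for addr, label, size in READMEM.findall(trace)]

trace = '''\
<readmem: ffffffff80447280, KVADDR, "possible", 32, (ROE), c8fce0>
cpu_possible_map: 0 1 2 3 4 5 6 7
<readmem: ffffffff803ed5a0, KVADDR, "present", 32, (ROE), c8fce0>
cpu_present_map: 0 1 2 3 4 5 6 7
'''
first = readmem_calls(trace)[0]
print(hex(first[0]), first[1], first[2])
```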

So, you can see that the very first readmem() is the cpu_possible_map bitmap.
And in your case:

> crash: page excluded: kernel virtual address: ffffffff804f1260 type: "possible"
> WARNING: cannot read cpu_possible_map

That first readmem() attempt failed because the page was explicitly
excluded by makedumpfile.  But if makedumpfile's page exclusion mechanism
excluded the page containing that kernel data, I'd be very surprised.
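
For reference, the -d31 in the core_collector line quoted at the end of this
message is a bitmask of page types to leave out of the dump. The sketch below
decodes it using the bit meanings as I recall them from the makedumpfile(8)
man page; the table and the function name excluded_page_types are my own
illustration, so check them against the man page for your version.

```python
# Assumed bit meanings, recalled from makedumpfile(8); verify locally.
DUMP_LEVEL_BITS = {
    1: "zero pages",
    2: "cache pages (non-private)",
    4: "cache pages (including private)",
    8: "user data pages",
    16: "free pages",
}

def excluded_page_types(dump_level):
    """List the page types that a given -d dump level asks to exclude."""
    return [name for bit, name in DUMP_LEVEL_BITS.items() if dump_level & bit]

for name in excluded_page_types(31):
    print(name)
```

None of the categories in this table is named for the kernel's own data,
which fits with the surprise expressed above at seeing the page that holds
cpu_possible_map reported as excluded.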

Dave

        
> Does makedumpfile's compression have something to do with this? From
> kdump.conf on the machine where I got that vmcore:
> ...
> core_collector makedumpfile -c -d31
> ...
> 
> Thank you in advance!
> 
> Regards,
> Marc
>



