[Crash-utility] crash tool not working

Dave Anderson anderson at redhat.com
Fri Feb 10 15:55:49 UTC 2012



----- Original Message -----
> Thanks Dave.
> 
> > If you had downloaded crash-6.0.3-0.src.rpm instead of the tar.gz
> > file, it would have prevented the build because the crash.spec file
> > requires these two packages:
> >
> >  BuildRequires: ncurses-devel zlib-devel
> >
> > Anyway, those two packages are required.
> 
> I installed zlib-devel and build was successful.
> 
> > If you can upgrade, and then post the output of:
> >
> > $ crash -d8 linux-2.6.32.12-0.7/vmlinux
> > /var/crash/2012-02-08-14\:13/vmcore
> > there will be a plethora of debug output that can help determine
> > the problem.
> 
> Please find the output below:
> 
> ---------------------------
> hltncra110731:/home/adil/crash-6.0.3 # ./crash -d8
> ../linux-2.6.32.12-0.7/vmlinux /var/crash/2012-02-08-14\:13/vmcore
> 
> crash 6.0.3
> Copyright (C) 2002-2012  Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006  IBM Corporation
> Copyright (C) 1999-2006  Hewlett-Packard Co
> Copyright (C) 2005, 2006  Fujitsu Limited
> Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> Copyright (C) 2005  NEC Corporation
> Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public
> License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions.  Enter "help copying" to see the conditions.
> This program has absolutely no warranty.  Enter "help warranty" for
> details.
> 
> compressed kdump: header->utsname.machine:
> compressed kdump: memory bitmap offset: 2000
> diskdump_data:
>           filename: /var/crash/2012-02-08-14:13/vmcore
>              flags: 6 (KDUMP_CMPRS_LOCAL|ERROR_EXCLUDED)
>                dfd: 3
>                ofp: 0
>       machine_type: 62 (EM_X86_64)
> 
>             header: dedfe0
>            signature: "KDUMP   "
>       header_version: 1
>              utsname:
>                sysname:
>               nodename:
>                release:
>                version:
>                machine:
>             domainname:
>            timestamp:
>                 tv_sec: 0
>                tv_usec: 0
>               status: 0 ()
>           block_size: 4096
>         sub_hdr_size: 1
>        bitmap_blocks: 76
>            max_mapnr: 1245184
>     total_ram_blocks: 0
>        device_blocks: 0
>       written_blocks: 0
>          current_cpu: 0
>              nr_cpus: 1
>       tasks[nr_cpus]: 0
> 
>         sub_header: 0 (n/a)
> 
>   sub_header_kdump: deeff0
>            phys_base: 0
>           dump_level: 0 (0x0)
> 
>        data_offset: 4e000
>         block_size: 4096
>        block_shift: 12
>             bitmap: 7f99c0b4e010
>         bitmap_len: 311296
>    dumpable_bitmap: 7f99c0b01010
>               byte: 0
>                bit: 0
>    compressed_page: e009a0
>          curbufptr: 0
> 
>  page_cache_hdr[0]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df0990
>         pg_hit_count: 0
>  page_cache_hdr[1]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df1990
>         pg_hit_count: 0
>  page_cache_hdr[2]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df2990
>         pg_hit_count: 0
>  page_cache_hdr[3]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df3990
>         pg_hit_count: 0
>  page_cache_hdr[4]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df4990
>         pg_hit_count: 0
>  page_cache_hdr[5]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df5990
>         pg_hit_count: 0
>  page_cache_hdr[6]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df6990
>         pg_hit_count: 0
>  page_cache_hdr[7]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df7990
>         pg_hit_count: 0
>  page_cache_hdr[8]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df8990
>         pg_hit_count: 0
>  page_cache_hdr[9]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: df9990
>         pg_hit_count: 0
> page_cache_hdr[10]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: dfa990
>         pg_hit_count: 0
> page_cache_hdr[11]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: dfb990
>         pg_hit_count: 0
> page_cache_hdr[12]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: dfc990
>         pg_hit_count: 0
> page_cache_hdr[13]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: dfd990
>         pg_hit_count: 0
> page_cache_hdr[14]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: dfe990
>         pg_hit_count: 0
> page_cache_hdr[15]:
>             pg_flags: 0 ()
>              pg_addr: 0
>            pg_bufptr: dff990
>         pg_hit_count: 0
> 
>     page_cache_buf: df0990
>        evict_index: 0
>          evictions: 0
>           accesses: 0
>       cached_reads: 0
>        valid_pages: df0000
> readmem: read_diskdump()
> crash: pv_init_ops exists: ARCH_PVOPS
> compressed kdump: phys_base: 0
> gdb ../linux-2.6.32.12-0.7/vmlinux
> GNU gdb (GDB) 7.3.1
> Copyright (C) 2011 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show
> copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
> GETBUF(248 -> 0)
>   GETBUF(1500 -> 1)
> 
>   FREEBUF(1)
> FREEBUF(0)
> <readmem: ffffffff82827aa0, KVADDR, "kernel_config_data", 32768,
> (ROE), 1a1bab0>
> <read_diskdump: addr: ffffffff82827aa0 paddr: 2827aa0 cnt: 1376>
> read_diskdump: SEEK_ERROR: paddr/pfn: 2827aa0/2827 !page_is_ram
> crash: seek error: kernel virtual address: ffffffff82827aa0  type:
> "kernel_config_data"
> WARNING: cannot read kernel_config_data
> GETBUF(248 -> 0)
> FREEBUF(0)
> GETBUF(512 -> 0)
> <readmem: ffffffff8281a660, KVADDR, "cpu_possible_mask", 8, (FOE),
> 7fff1b538678>
> <read_diskdump: addr: ffffffff8281a660 paddr: 281a660 cnt: 8>
> read_diskdump: SEEK_ERROR: paddr/pfn: 281a660/281a !page_is_ram
> crash: seek error: kernel virtual address: ffffffff8281a660  type:
> "cpu_possible_mask"
> hltncra110731:/home/adil/crash-6.0.3 #


The debug output looks reasonable, and it doesn't seem to 
be a relocation issue because the dumpfile indicates this:

  sub_header_kdump: deeff0
           phys_base: 0
          dump_level: 0 (0x0)

and 

  compressed kdump: phys_base: 0

So for the first two readmem() attempts, at ffffffff82827aa0
(kernel_config_data) and ffffffff8281a660 (cpu_possible_mask),
the kernel start map identifier of ffffffff80000000 is stripped,
leaving the physical addresses 2827aa0 and 281a660, which are then
passed to the compressed-kdump function read_diskdump().  But those
physical addresses cannot be found in the dumpfile: both reads fail
with SEEK_ERROR because the pages are flagged !page_is_ram.
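
The arithmetic behind those two failed reads can be checked by hand.
The sketch below is not crash's code; it only reproduces the numbers
in the debug output, assuming the kernel start map value of
ffffffff80000000, the phys_base of 0 and the block_shift of 12 shown
above.  The helper names are made up for illustration.

```python
# Illustrative only: reproduces the paddr/pfn values seen in the -d8 output.
START_KERNEL_MAP = 0xffffffff80000000  # kernel start map identifier
PHYS_BASE = 0x0                        # "phys_base: 0" from the dumpfile
BLOCK_SHIFT = 12                       # "block_shift: 12" (4096-byte blocks)


def kvaddr_to_paddr(kvaddr, phys_base=PHYS_BASE):
    """Strip the kernel start map and add phys_base (hypothetical helper)."""
    return kvaddr - START_KERNEL_MAP + phys_base


def paddr_to_pfn(paddr, shift=BLOCK_SHIFT):
    """Page frame number holding a physical address (hypothetical helper)."""
    return paddr >> shift


for name, kvaddr in [("kernel_config_data", 0xffffffff82827aa0),
                     ("cpu_possible_mask", 0xffffffff8281a660)]:
    paddr = kvaddr_to_paddr(kvaddr)
    print("%s: paddr %x pfn %x" % (name, paddr, paddr_to_pfn(paddr)))
```

The results, 2827aa0/2827 and 281a660/281a, match the paddr/pfn pairs
in the SEEK_ERROR lines, so the translation itself behaves as expected
for an unrelocated kernel.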

I am presuming that the machine that generated the vmcore is still
running your ../linux-2.6.32.12-0.7/vmlinux kernel, and that you
can log onto it as root.  If that's not the case, I can't help
much more.

If you are running that kernel on the crashed machine, for sanity's sake,
can you verify that your kernel has not relocated itself by doing this:

  # nm -Bn ../linux-2.6.32.12-0.7/vmlinux | grep _stext

and comparing it to:

  # cat /proc/kallsyms | grep _stext

The symbol values should be the same.  For example, on this 2.6.32-based 
system I see this:

  # nm -Bn /usr/lib/debug/lib/modules/2.6.32-70.el6.x86_64/vmlinux | grep _stext
  ffffffff81009000 T _stext
  # cat /proc/kallsyms | grep _stext
  ffffffff81009000 T _stext
  #
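
If comparing by eye is error-prone, the same check can be scripted.
This is a rough sketch, not part of crash: it parses text in the
"address type name" layout printed by both nm and /proc/kallsyms.
The function names are hypothetical, and the sample strings are the
values from the 2.6.32-70.el6 example above.

```python
# Illustrative only: compare symbol values from "nm -Bn vmlinux" output
# with those from /proc/kallsyms. Both print "address type name" lines.

def symbol_addresses(text, names):
    """Return {name: address} for the requested symbols (hypothetical helper)."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # kallsyms lines for module symbols carry a 4th "[module]" field;
        # only the first three fields are used here.
        if len(fields) >= 3 and fields[2] in names:
            found[fields[2]] = int(fields[0], 16)
    return found


def relocated(vmlinux_syms, live_syms):
    """Names whose values differ between vmlinux and the running kernel."""
    return sorted(n for n in vmlinux_syms
                  if n in live_syms and vmlinux_syms[n] != live_syms[n])


# Sample input copied from the example above; in practice, feed in the
# real command output instead.
nm_output = "ffffffff81009000 T _stext\n"
kallsyms_output = "ffffffff81009000 T _stext\n"

names = {"_stext"}
mismatches = relocated(symbol_addresses(nm_output, names),
                       symbol_addresses(kallsyms_output, names))
print("relocated symbols:", mismatches or "none")
```

An empty list means the values agree; any name in the list differs
between the vmlinux file and the running kernel.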

Also do this:

  # cat /proc/kallsyms | grep -e kernel_config_data -e cpu_possible_mask

and confirm that they are ffffffff82827aa0 and ffffffff8281a660,
respectively.

And lastly, in all cases where a dumpfile cannot be read correctly,
first verify that crash works with the live system.  Presuming
that the machine is still running that kernel, and it was configured
CONFIG_STRICT_DEVMEM, try this:

  # crash ../linux-2.6.32.12-0.7/vmlinux

If that comes up OK, then we can rule out more fundamental
failure causes.

Dave






