[Crash-utility] using crash for ARM

paawan oza paawan1982 at yahoo.com
Mon Aug 13 09:08:19 UTC 2012


Hi, 


Here is the log from the gdb session I am running now.
I hope this stack trace helps.


./gdb --args ./crash ./vmlinux ./System.map
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-none-linux-gnueabi".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /system/crash...done.
(gdb) start
Temporary breakpoint 1 at 0x8224: file main.c, line 78.
Starting program: /system/crash ./vmlinux ./System.map

Temporary breakpoint 1, main (argc=3, argv=0xbefffba4) at main.c:78
78      main.c: No such file or directory.
        in main.c
(gdb) bt
#0  main (argc=3, argv=0xbefffba4) at main.c:78
(gdb) b open_tmpfile2
Breakpoint 2 at 0x6d46c: file filesys.c, line 1126.
(gdb) c
Continuing.

crash 6.0.8
Copyright (C) 2002-2012  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter "help copying" to see the conditions.
This program has absolutely no warranty.  Enter "help warranty" for details.

GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-none-linux-gnueabi"...

WARNING: kernels compiled by different gcc versions:
  ./vmlinux: (unknown)
  live system kernel: 4.4.3


Breakpoint 2, open_tmpfile2 () at filesys.c:1126
1126    filesys.c: No such file or directory.
        in filesys.c
(gdb) bt
#0  open_tmpfile2 () at filesys.c:1126
#1  0x000f324c in anon_member_offset (name=0x433604 "page",
    member=0x433680 "mapping") at symbols.c:5104
#2  0x000f2c4c in datatype_info (name=0x433604 "page",
    member=0x433680 "mapping", dm=0xfffffffe) at symbols.c:4930
#3  0x000260f0 in vm_init () at memory.c:383
#4  0x0000b1a0 in main_loop () at main.c:644
#5  0x00173050 in current_interp_command_loop () at interps.c:288
#6  0x001749b0 in captured_command_loop (data=0x24) at ./main.c:228
#7  0x00172a50 in catch_errors (func=0x1749a8 <captured_command_loop>,
    func_args=0x0, errstring=0x506b88 "", mask=<optimized out>)
    at exceptions.c:531
#8  0x0017431c in captured_main (data=<optimized out>) at ./main.c:958
#9  0x00172a50 in catch_errors (func=0x173818 <captured_main>,
    func_args=0xbefff9d8, errstring=0x506b88 "", mask=<optimized out>)
    at exceptions.c:531
#10 0x00173628 in gdb_main (args=<optimized out>) at ./main.c:973
#11 gdb_main_entry (argc=<optimized out>, argv=<optimized out>)
    at ./main.c:993
#12 0x000bed6c in gdb_main_loop (argc=2, argv=0xbefffba4)
    at gdb_interface.c:76
#13 0x0000b014 in main (argc=3, argv=0xbefffba4) at main.c:604
(gdb) n
1129    in filesys.c
(gdb) c
Continuing.
tmpfile: No such file or directory
error no is 2
crash: cannot open secondary temporary file

[/system/crash] error trace: f2c4c => f324c => 6d508 => 13138
crash: /usr/bin/nm: no such file
DROP_CORE flag set: forcing a segmentation fault

Program received signal SIGQUIT, Quit.
0x003be42c in kill ()
(gdb)

PS: I am fairly sure that the vmlinux and the System.map both come from the same build. In any case, that should not be the problem, given that crash got this far.
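
For reference, the "error no is 2" line in the trace above is ENOENT on Linux, which matches the "No such file or directory" text printed by perror(). ENOENT is not among the errors in the tmpfile() man page excerpt quoted further down. One possible reading, and it is only a guess, is that the directory the C library uses for temporary files does not exist on this Android system. Below is a minimal sketch for decoding such values; errno_name() and describe_errno() are hypothetical helpers written for illustration, not part of crash.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: map ENOENT plus the errno values listed in the
 * tmpfile() man page to their symbolic names.
 */
const char *errno_name(int err)
{
    switch (err) {
    case ENOENT: return "ENOENT";
    case EACCES: return "EACCES";
    case EEXIST: return "EEXIST";
    case EINTR:  return "EINTR";
    case EMFILE: return "EMFILE";
    case ENFILE: return "ENFILE";
    case ENOSPC: return "ENOSPC";
    case EROFS:  return "EROFS";
    default:     return "(other)";
    }
}

/*
 * Hypothetical helper: format "errno N (NAME): message" into buf and
 * return buf. The message text comes from strerror() and therefore
 * depends on the C library in use.
 */
char *describe_errno(int err, char *buf, size_t len)
{
    snprintf(buf, len, "errno %d (%s): %s", err, errno_name(err),
             strerror(err));
    return buf;
}
```

Calling describe_errno(2, buf, sizeof buf) on a typical Linux system should give a string that names ENOENT; the exact message wording is up to the C library.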

Regards,
Oza.




________________________________
 From: Dave Anderson <anderson at redhat.com>
To: paawan oza <paawan1982 at yahoo.com> 
Cc: "Discussion list for crash utility usage, maintenance and development" <crash-utility at redhat.com> 
Sent: Friday, 10 August 2012 10:01 PM
Subject: Re: [Crash-utility] using crash for ARM
 


----- Original Message -----
> 
> Hi,
> 
> Please find attached the logs for crash -d8 vmlinux System.map.
> 
> crash -d8 vmlinux doesn't work. It gives:
> 
> crash 6.0.8
> Copyright (C) 2002-2012 Red Hat, Inc.
> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
> Copyright (C) 1999-2006 Hewlett-Packard Co
> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
> Copyright (C) 2005, 2011 NEC Corporation
> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
> This program is free software, covered by the GNU General Public License,
> and you are welcome to change it and/or distribute copies of it under
> certain conditions. Enter "help copying" to see the conditions.
> This program has absolutely no warranty. Enter "help warranty" for
> details.
> 
> get_live_memory_source: /dev/mem
> WARNING: ./vmlinux and /proc/version do not match!
> 
> WARNING: /proc/version indicates kernel version: 3.0.15+
> 
> crash: please use the vmlinux file for that kernel version, or try using
> the System.map for that kernel version as an additional argument.
> 
> Regards,
> Oza.

For starters, as Mika suggested, you should try your best to use the
actual vmlinux file that is being run on the live system.  If the running
kernel's vmlinux file does not have debuginfo data, and you are using a
similar kernel along with the running kernel's System.map file, then you
must be sure that the "other" vmlinux that you are using is as close as
possible to the running kernel.  There are no guarantees that using a
System.map file will work.

Anyway, looking at the log file, I'm not sure why there's non-crash related
data intermingled with the crash -d8 output, i.e., like this:

  ...
  c00dfc08 clk_enable
  c00dfc50 clk_debug_set_enable
  c00dfcac clk_[ 1866.844757] ##> wifi_suspend
  [ 1866.856903] i2c i2c-1: mpu_dev_suspend, called regulator_disable. Status: 0
  [ 1866.856933] mpu_dev_suspend: Suspend handler executed
  [ 1866.872528] PM: suspend of devices complete after 27.886 msecs
  [ 1866.872558] PM: suspend devices took 0.030 seconds
  [ 1866.873077] PM: late suspend of devices complete after 0.457 msecs
  [ 1866.873535] PM: early resume of devices complete after 0.183 msecs
  [ 1866.874481] i2c i2c-1: mpu_dev_resume, called regulator_enable. Status: 0
  [ 1866.874511] mpu_dev_resume: Resume handler executed
  [ 1866.874572] wakeup wake lock: bcmpmu_i2c
  [ 1866.874908] get_update_rate: rate = 112000
  [ 1866.874938] get_update_rate: rate = 112000
  [ 1866.876434] ##> wifi_resume
  [ 1866.892822] PM: resume of devices complete after 19.007 msecs
  [ 1866.893676] PM: resume devices took 0.020 seconds
  reset
  c00dfd4c clk_debug_reset
  c00dfd90 clk_init
  c00dfe10 clk_register
  ...

Regardless of that, I was looking for a readmem() call, or other debug statement
that might help pinpoint the failure location.  The best that can be inferred from
the log data are the GNU_GET_DATATYPE debug statements at the end:

$ grep GNU_GET_DATATYPE teraterm.log
GNU_GET_DATATYPE[runqueue]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[prio_array]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[prio_array]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[prio_array]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[irq_desc_t]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[hw_interrupt_type]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[timer_vec_root]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[timer_vec]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[tvec_root_s]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[softirq_state]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[desc_struct]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[kallsyms_header]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[mem_section]: returned via gdb_error_hook (1 buffer in use)
GNU_GET_DATATYPE[note_buf_t]: returned via gdb_error_hook (1 buffer in use)
$ 

From those it's evident that you've successfully made it through kernel_init(),
and have called machdep_init(POST_GDB) from here in main.c:main_loop()

                } else if (!(pc->flags & MINIMAL_MODE)) {
                        read_in_kernel_config(IKCFG_INIT);
                        kernel_init();
      =====>            machdep_init(POST_GDB);
                        vm_init();
                        machdep_init(POST_VM);
                        module_init();
                        help_init();
                        task_init();
                        vfs_init();
                        net_init();
                        dev_init();
                        machdep_init(POST_INIT);
                }

which calls into arm.c:arm_init(POST_GDB).  That function has successfully made
it past the STRUCT_SIZE_INIT(note_buf, "note_buf_t") call:

                /*
                 * We need to have information about note_buf_t which is used to
                 * hold ELF note containing registers and status of the thread
                 * that panic'd.
                 */
      =====>    STRUCT_SIZE_INIT(note_buf, "note_buf_t");

                STRUCT_SIZE_INIT(elf_prstatus, "elf_prstatus");
                MEMBER_OFFSET_INIT(elf_prstatus_pr_pid, "elf_prstatus",
                                   "pr_pid");
                MEMBER_OFFSET_INIT(elf_prstatus_pr_reg, "elf_prstatus",
                                   "pr_reg");
                break;

But the next STRUCT_SIZE_INIT() for "elf_prstatus" apparently never got completed.  

In any case, it ended up in open_tmpfile2():

  $ tail teraterm.log
  GETBUF(128 -> 0)
  FREEBUF(0)
  GETBUF(128 -> 0)
  FREEBUF(0)
  GETBUF(128 -> 0)
  FREEBUF(0)
  
  crash: cannot open secondary temporary file
  
  1|shell at android:/system # 
  $ 


Although it's not clear how it's ending up in open_tmpfile2(), 
it's certainly of interest that the tmpfile() call is failing:
  
  void
  open_tmpfile2(void)
  {
          if (pc->tmpfile2)
                  error(FATAL, "recursive secondary temporary file usage\n");
  
          if ((pc->tmpfile2 = tmpfile()) == NULL)
                  error(FATAL, "cannot open secondary temporary file\n");
  
          rewind(pc->tmpfile2);
  }

The man page for tmpfile() shows these reasons:
  
  RETURN VALUE
         The tmpfile() function returns a stream descriptor, or NULL if a unique
         filename cannot be generated or the unique file cannot  be  opened.  In
         the latter case, errno is set to indicate the error.
  
  ERRORS
         EACCES Search permission denied for directory in file’s path prefix.
  
         EEXIST Unable to generate a unique filename.
  
         EINTR  The call was interrupted by a signal.
  
         EMFILE Too many file descriptors in use by the process.
  
         ENFILE Too many files open in the system.
  
         ENOSPC There was no room in the directory to add the new filename.
  
         EROFS  Read-only filesystem.
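
Several of the errors above (EACCES, ENOSPC, EROFS) concern the directory in which the temporary file gets created. Which directory that is depends on the C library, so any path passed to the check below, "/tmp" included, is only an assumption. A rough sketch of such a check follows; check_tmp_dir() is a hypothetical helper, not something that exists in crash.

```c
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 0 if path is an existing directory that
 * the calling process can search and write to, otherwise an errno value
 * describing what is wrong with it.
 */
int check_tmp_dir(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return errno;           /* e.g. ENOENT if the directory is missing */
    if (!S_ISDIR(st.st_mode))
        return ENOTDIR;         /* the path exists but is not a directory */
    if (access(path, W_OK | X_OK) != 0)
        return errno;           /* e.g. EACCES or EROFS */
    return 0;
}
```

Running check_tmp_dir("/tmp") on the target, with "/tmp" being a guess at the directory in use, would show whether that directory is a plausible culprit.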
  
A couple things you might try:

  (1) Put a perror() after the tmpfile() call to determine which errno 
      is being returned.
  (2) Set "pc->flags |= DROP_CORE;" prior to the tmpfile() call.

Like this:

  void
  open_tmpfile2(void)
  {
          if (pc->tmpfile2)
                  error(FATAL, "recursive secondary temporary file usage\n");

+         pc->flags |= DROP_CORE;
-         if ((pc->tmpfile2 = tmpfile()) == NULL)
+         if ((pc->tmpfile2 = tmpfile()) == NULL) {
+                 perror("tmpfile");
                  error(FATAL, "cannot open secondary temporary file\n");
+         }
+         pc->flags &= ~DROP_CORE;

          rewind(pc->tmpfile2);
  }
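
The same perror() idea can be tried outside of crash first. What follows is a minimal standalone sketch, assuming nothing about crash internals; open_tmp_or_report() is a hypothetical stand-in for open_tmpfile2(), and it saves errno right after the failing call so that a later library call cannot overwrite it.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for open_tmpfile2(): returns the temporary
 * stream, or NULL after printing the reason to stderr. The errno value
 * seen right after tmpfile() is stored in *saved_errno (0 on success).
 */
FILE *open_tmp_or_report(int *saved_errno)
{
    FILE *fp;

    errno = 0;
    fp = tmpfile();
    if (fp == NULL) {
        *saved_errno = errno;   /* capture before anything else runs */
        perror("tmpfile");
        return NULL;
    }

    *saved_errno = 0;
    rewind(fp);
    return fp;
}
```

Building this for the target and calling it from a small test program would show whether tmpfile() fails there independently of crash.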

Then get a backtrace by running gdb on the resultant core file, or just
run the whole session from gdb.

Dave