[Crash-utility] [crash-5.0.1] glibc detected: double free or corruption (!prev)

Dave Anderson anderson at redhat.com
Tue Feb 23 14:10:05 UTC 2010


----- "Hedi Berriche" <hedi at sgi.com> wrote:

> Context:
> 
> - crash-5.0.1
> - glibc 2.4
> - vmcore produced by x86_64 sles11 2.6.27.19-5-default
> 
> Problem:
> 
> crash> mod -s xfs /usr/people/hedi/xfs.ko.debug
> mod: xfs: last symbol is not _MODULE_END_xfs?
> *** glibc detected *** /tr/x86_64/bin/crash: double free or corruption
> (!prev): 0x0000000001558760 ***
>    <segmentation violation in gdb>
> mod: /usr/people/hedi/xfs.ko.debug
>      gdb add-symbol-file command failed
> 
> hangs solid there and has to be killed with SIGKILL.
> 
> Grabbing a core reveals the following:
> 
> (gdb) bt f
> #0  0x00002b628cd0ebb5 in raise () from /lib64/libc.so.6
> #1  0x00002b628cd0ffb0 in abort () from /lib64/libc.so.6
> #2  0x00002b628cd4a340 in malloc_printerr () from /lib64/libc.so.6
> #3  0x00000000005454af in parse_exp_in_context (stringptr=0x400000000,
> block=<value optimized out>, comma=<value optimized out>,
> void_context_p=0, out_subexp=0x7b4760)
>     at parse.c:1101
>         except = {reason = RETURN_ERROR, error = GENERIC_ERROR,
> message = 0x1c790a0 "Dwarf Error: Could not find abbrev number 188 [in
> module /usr/people/hedi/xfs.ko.debug]"}
>         old_chain = (struct cleanup *) 0x0
>         subexp = <value optimized out>
> #4  0x000000060000000b in ?? ()
> #5  0x0000000000000000 in ?? ()
> 
> (gdb) f 3
> #3  0x00000000005454af in parse_exp_in_context (stringptr=0x400000000,
> block=<value optimized out>, comma=<value optimized out>,
> void_context_p=0, out_subexp=0x7b4760)
>     at parse.c:1101
> 1101              xfree (expout);
> 
> (gdb) list
> 1096        }
> 1097      if (except.reason < 0)
> 1098        {
> 1099          if (! in_parse_field)
> 1100            {
> 1101              xfree (expout);
> 1102              throw_exception (except);
> 1103            }
> 1104        }
> 1105
> 
> Not sure (yet) whether the error
> 
>     mod: xfs: last symbol is not _MODULE_END_xfs?
>     Dwarf Error: Could not find abbrev number 188 [in module /usr/people/hedi/xfs.ko.debug]
> 
> is a problem in crash or in the xfs.ko.debug objfile, but that's another story;
> the problem here is that crash shouldn't crash.
>
> 
> FWIW, this problem is most definitely a regression: crash version
> 4.0-8.11, for example, fails to load the objfile with exactly the same error
> message, but with the notable difference that it does *not* crash.

Agreed on all counts.  It's crashing now because of the gdb-7.0 integration,
and the attached patch should fix that.

As for the embedded "add-symbol-file" failure to load the module, you're
right: that's another issue. What I can suggest is this:

  crash> set debug 1
  crash> mod -s xfs /usr/people/hedi/xfs.ko.debug

and you will see the full "add-symbol-file" gdb command string that's failing.
For that matter you can take that full string, remove crash from the picture
entirely, and just enter it into a gdb session:

  $ gdb 
  ...
  add-symbol-file arg arg arg...
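
If the failing string needs to be replayed repeatedly, the manual step above
can be scripted. The following is only a sketch: the helper names
gdb_batch_argv and run_in_gdb are hypothetical, and the command string must be
copied verbatim from the "set debug 1" output, since its arguments are not
shown here. It relies on gdb's -nx, -batch and -ex options.

```python
import subprocess


def gdb_batch_argv(command, gdb="gdb"):
    """Build an argv that runs a single gdb command non-interactively.

    Hypothetical helper: 'command' is the full add-symbol-file string
    copied from crash's debug output.
    """
    # -nx skips gdb init files, -batch exits once the commands are processed,
    # -ex executes the given command string.
    return [gdb, "-nx", "-batch", "-ex", command]


def run_in_gdb(command):
    """Run the command in a standalone gdb and return its combined output."""
    proc = subprocess.run(
        gdb_batch_argv(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    # Inspect the returned text for messages such as "Dwarf Error".
    return proc.stdout
```

Calling run_in_gdb() with the copied string and searching the returned text
for "Dwarf Error" shows whether the failure reproduces without crash involved.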

It looks like some kind of Dwarf issue though, and I can't help with that.
However, at least in a RHEL environment, the argument to the mod command
should be the stripped module.ko file; the module.ko.debug file is then
found automatically and the two pieces are put together.  In other words,
taking the "ext3" module, my RHEL5 environment has:

  /lib/modules/2.6.18-128.el5/kernel/fs/ext3/ext3.ko
  /usr/lib/debug/lib/modules/2.6.18-128.el5/kernel/fs/ext3/ext3.ko.debug

And when it gets loaded, the base "ext3.ko" file is used as the internal
argument to the gdb "add-symbol-file" command:

crash> mod -s ext3
     MODULE       NAME          SIZE  OBJECT FILE
ffffffff8806ae00  ext3        168017  /lib/modules/2.6.18-128.el5/kernel/fs/ext3/ext3.ko 
crash>
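
The naming convention visible in the two ext3 paths above can be written down
as a small sketch. This only illustrates the layout from the RHEL5 example; it
is not how crash itself locates the debuginfo file, and the function name and
the "/usr/lib/debug" default are assumptions taken from that one example.

```python
DEBUG_ROOT = "/usr/lib/debug"  # assumption: debuginfo root from the RHEL5 example


def debuginfo_path(module_path, debug_root=DEBUG_ROOT):
    """Return the expected .ko.debug path for a stripped module .ko path.

    Hypothetical helper: mirrors the module's absolute path under the
    debug root and appends ".debug".
    """
    if not module_path.startswith("/"):
        raise ValueError("expected an absolute module path")
    return debug_root + module_path + ".debug"


if __name__ == "__main__":
    print(debuginfo_path("/lib/modules/2.6.18-128.el5/kernel/fs/ext3/ext3.ko"))
```

Other distributions may lay out their debuginfo files differently, so the
default root should be treated as specific to this example.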

I wonder if you would still see the same issue if you used the base "xfs.ko"
file instead of "xfs.ko.debug"?
  
I saw one of those (harmless) "last symbol is not _MODULE_END_xxx" messages
for the first time on a 2.6.32 x86 kernel the other day.  I'll look into that.

And lastly:

> P.S. The "last symbol is not _MODULE_END_<modulename>" message was reported
>      back in Jan 2009 (albeit with the difference that crash would load the
>      objfile despite the error message)
>            
> https://www.redhat.com/archives/crash-utility/2009-January/msg00070.html
> 
>      but I am not sure the root cause was identified back then, or at least I am
>      failing to find, in the list archives, any proof of that.

I don't know what the deal was with that...

Dave

-------------- next part --------------
A non-text attachment was scrubbed...
Name: symbols.patch
Type: text/x-patch
Size: 488 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20100223/ba31fe22/attachment.bin>

