[Crash-utility] Problem with disassembling functions that use BUG()

Mon Mar 6 18:54:37 UTC 2006

Gerard Snitselaar wrote:

> Using an AS4 i386 based system with a 2.6.9-22ELsmp kernel. Currently BUG()
> gets defined as the following:
>
> #define BUG()                           \
>  __asm__ __volatile__(  "ud2\n"         \
>                         "\t.word %c0\n" \
>                         "\t.long %c1\n" \
>                          : : "i" (__LINE__), "i" (__FILE__))
>
> So after the ud2 opcode it places __LINE__ (2 bytes) and a pointer to the
> __FILE__ string (4 bytes) in the next 6 bytes. The trap handler for ud2 uses
> these to print a message saying where BUG() was used.
>
> Crash has no knowledge of this convention, so it treats the byte after the
> ud2 opcode as the start of the next instruction. This produces bad
> disassemblies. The example where I ran into it was flush_tlb_others().
>
> Below I have included the source for flush_tlb_others, the disassembly from
> crash, the raw code for flush_tlb_others, and what I think the disassembly
> should be if one takes into account the convention used in BUG(). What
> initially made me suspicious was that I didn't see "call <_spin_lock>"
> anywhere, and the offsets for jumps didn't line up with instructions. From
> what I can tell this would probably have to be dealt with in print_insn() in
> gdb/opcodes/i386-dis.c. Not sure how to go about it, or what should be done
> since newer kernels allow you to configure whether those bytes get encoded
> after the ud2 opcode with CONFIG_DEBUG_BUGVERBOSE.
>
> Any ideas on solving this?

Not off-hand...

It's a "known issue" that I've gotten around by starting at the virtual address
just after the ud2a and running new "dis" attempts at each successive address
until the output makes sense.  Strangely, sometimes nothing really needs to be
done, because only a couple of instructions show up as bogus before the
disassembly "gets back on track", while other times it never gets back in sync
with the real instructions.
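
The manual search can be shortened when the kernel is known to use the BUG()
encoding quoted above: the next real instruction then starts 8 bytes after the
start of the ud2 (2 opcode bytes plus 6 data bytes). The following is only a
minimal sketch of that arithmetic, not anything that exists in crash or gdb.
The names decode_bug() and struct bug_info are hypothetical, and the layout
(little-endian 16-bit line number, then a 32-bit pointer to the file name) is
assumed from the macro; it does not hold for kernels built without the extra
bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed i386 layout for the BUG() macro quoted above:
 *   0x0f 0x0b      ud2
 *   .word          __LINE__ (16-bit, little-endian)
 *   .long          address of the __FILE__ string (32-bit, little-endian)
 */
#define UD2_LEN      2
#define BUG_DATA_LEN 6   /* .word + .long */

/* Hypothetical result type, for illustration only. */
struct bug_info {
    uint16_t line;       /* decoded __LINE__ */
    uint32_t file_ptr;   /* kernel virtual address of the __FILE__ string */
    size_t   resume_off; /* buffer offset of the next real instruction */
};

/*
 * If buf[off] starts a ud2 that is followed by at least 6 more bytes,
 * decode the embedded data into *out and return 1. Otherwise return 0.
 * The caller is expected to pass an offset already identified as a ud2
 * by the disassembler; this does not search the buffer.
 */
static int decode_bug(const uint8_t *buf, size_t len, size_t off,
                      struct bug_info *out)
{
    const uint8_t *p;

    if (off > len || len - off < UD2_LEN + BUG_DATA_LEN)
        return 0;
    if (buf[off] != 0x0f || buf[off + 1] != 0x0b)
        return 0;

    p = buf + off + UD2_LEN;
    out->line = (uint16_t)(p[0] | (p[1] << 8));
    out->file_ptr = (uint32_t)p[2] | ((uint32_t)p[3] << 8) |
                    ((uint32_t)p[4] << 16) | ((uint32_t)p[5] << 24);
    out->resume_off = off + UD2_LEN + BUG_DATA_LEN;
    return 1;
}
```

Given the raw bytes of a function and the offset of a ud2 within them, adding
resume_off to the function's start address gives the address at which to start
the next "dis", instead of trying each successive address by hand.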

Ideally it should be handled in the gdb code, which is what's screwing it up.
Crash is just taking the gdb disassembly output and running it through a
per-processor-type filter function to improve what you get from gdb.

Dave