[Crash-utility] invalid regs display in bt

Richard J Moore richardj_moore at uk.ibm.com
Tue Sep 25 21:58:44 UTC 2007


I've been puzzling over why the regs formatted with a backtrace on an IA32 
dump are invalid. Here's what I mean:

PID: 2692   TASK: f4656630  CPU: 0   COMMAND: "rmmod"
 #0 [f463ce54] crash_kexec at c044a1f7
 #1 [f463ce9c] die at c040651a
 #2 [f463ced4] do_page_fault at c0603107
 #3 [f463cf14] error_code (via page_fault) at c060190a
    EAX: 00000018  EBX: f8b43400  ECX: f8b4304f  EDX: 00200000 
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
    SS:  304f      ESP: f8b4302b  EBP: f463c000
    CS:  0060      EIP: f8b43004  ERR: ffffffff  EFLAGS: 00210286 


They are supposed to represent a valid set of regs that are presented to 
do_page_fault, which I presume are meant to be valid at the time the 
exception occurred.
Of course they can never be a set of valid regs, for the simple reason that the 
CPL is 0 (CS=60) and the RPL of SS is 3, which is an automatic GPF.
Since I manufactured the exception that caused this dump, by causing an 
unrecoverable page fault in ring 0, I know that CS is correct and that SS is 
bogus. 
Furthermore, the error code (ERR), which is stored by the processor as 
part of the exception stack frame, uses only bits 0-2 for page faults and 
at most bits 0-15 for other exceptions; the unused bit positions are zero. 
So ERR is also bogus.
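
The two sanity checks above can be written down as a short C sketch. The 
helper names (selector_rpl, stack_selector_plausible, 
page_fault_err_plausible) are made up for illustration and are not part of 
crash or the kernel. The masks simply follow the rules stated above: the RPL 
is the low two bits of a selector, and a page-fault error code is assumed to 
use bits 0-2 only.

```c
#include <stdbool.h>
#include <stdint.h>

/* The RPL is the low two bits of a segment selector. */
static unsigned selector_rpl(uint16_t sel)
{
    return sel & 0x3;
}

/*
 * The CPL is the low two bits of CS, and SS can only be loaded with a
 * selector whose RPL equals the CPL; anything else raises a GPF.
 */
static bool stack_selector_plausible(uint16_t cs, uint16_t ss)
{
    return selector_rpl(ss) == selector_rpl(cs);
}

/*
 * Assumption taken from the text above: a page-fault error code only
 * uses bits 0-2, so any higher bit being set marks the value as bogus.
 */
static bool page_fault_err_plausible(uint32_t err)
{
    return (err & ~UINT32_C(0x7)) == 0;
}
```

Fed with the values from the backtrace, stack_selector_plausible(0x0060, 
0x304f) and page_fault_err_plausible(0xffffffff) both return false.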

On looking at the code in entry.S at page_fault and the other exception 
entry points I see no attempt to save regs to create a pt_regs struct. The 
fact that do_page_fault takes pt_regs as the first arg is a hack to get at 
CS:EIP and SS:ESP at the time of exception. Furthermore error_code loads 
the exception error code into edx then wipes it out from the stack by 
storing -1 into this location. I can't actually see a good reason for 
wiping out the error code. By convention exceptions and interrupts have a 
-ve integer stored at the error-code location to distinguish them from 
system calls, but I don't think this is used. signal.c seems to be the 
only place that looks for an error code >=0, but I don't see how an exception 
affects signal.c.
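
As a rough model of the behaviour described above (not the kernel's actual 
assembly), the frame the processor pushes for a fault with an error code, and 
the "read it, then overwrite it with -1" step, might be sketched in C like 
this. The struct and function names are hypothetical.

```c
#include <stdint.h>

/*
 * Words pushed by the processor for an exception that carries an error
 * code, lowest stack address first.
 */
struct hw_exception_frame {
    uint32_t error_code;
    uint32_t eip;
    uint32_t cs;        /* selector in the low 16 bits */
    uint32_t eflags;
    /* The next two are pushed only on a privilege-level change. */
    uint32_t esp;
    uint32_t ss;
};

/*
 * Models the step described above: fetch the error code for the handler,
 * then overwrite its stack slot with -1.
 */
static uint32_t take_error_code(struct hw_exception_frame *frame)
{
    uint32_t err = frame->error_code;

    frame->error_code = (uint32_t)-1;
    return err;
}
```

After take_error_code() has run, anything that later reads the slot from the 
stack sees ffffffff, which matches the ERR value in the bt output above.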

Can anyone confirm whether setting the error code to -1 is essential? If 
it isn't, then I think we should consider leaving the original error code in place.


The long and short of it is: the only things that have any meaning are CS, 
EIP and EFLAGS, all of which are saved by the processor. SS and ESP are 
only saved when the exception occurred at a privilege level >0, but such 
exceptions can never generate a panic. 

I'd recommend that we change the bt output to format only the three valid 
regs (plus possibly SS and ESP, if the CPL at the time of the exception was >0). 
Is there any reason why this shouldn't be changed?
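
To make the proposal concrete, here is a minimal sketch of the formatting 
rule in C. It is not taken from the crash sources; struct exc_regs and 
format_exception_regs are hypothetical names, and the layout only loosely 
imitates the existing bt output.

```c
#include <stdint.h>
#include <stdio.h>

/* Only the registers the processor itself saves on an exception. */
struct exc_regs {
    uint16_t cs;
    uint32_t eip;
    uint32_t eflags;
    uint16_t ss;     /* meaningful only if the CPL was > 0 */
    uint32_t esp;    /* meaningful only if the CPL was > 0 */
};

/*
 * Always format CS, EIP and EFLAGS; add SS and ESP only when the low two
 * bits of CS show that the exception came from a privilege level > 0.
 * Returns the number of characters written, or a negative value on error.
 */
static int format_exception_regs(char *buf, size_t len,
                                 const struct exc_regs *r)
{
    int n = snprintf(buf, len, "    CS:  %04x      EIP: %08x  EFLAGS: %08x\n",
                     (unsigned)r->cs, (unsigned)r->eip, (unsigned)r->eflags);

    if (n < 0 || (size_t)n >= len)
        return n;

    if ((r->cs & 0x3) != 0) {
        int m = snprintf(buf + n, len - (size_t)n,
                         "    SS:  %04x      ESP: %08x\n",
                         (unsigned)r->ss, (unsigned)r->esp);
        if (m < 0)
            return m;
        n += m;
    }
    return n;
}
```

With the values from the dump above (CS=0060) only the first line would be 
produced, so the bogus SS and ESP would no longer be shown.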

Richard