[Crash-utility] (pvops 2.6.32.21) crash: cannot read/find cr3 page

Dave Anderson anderson at redhat.com
Mon Sep 13 15:59:44 UTC 2010


----- "tom anderson" <xentoma at hotmail.com > wrote:

> if I  use a pvops domU kernel version 2.6.32.18 crash works fine.  However if I
> use a pvops domU kernel version 2.6.32.21 I get the error messages:
> 
>    crash: cannot find mfn 874307 (0xd5743) in page index  
>    crash: cannot read/find cr3 page
>
> Any suggestions as to what is wrong?

Hi Tom,
 
I can't really give you specific suggestions as to what is wrong,
but I can at least tell you what the crash utility is encountering.

I suppose there's good news and bad news concerning this issue,
the good news being that it worked OK with 2.6.32.18, which is 
fairly close to your failing 2.6.32.21.  Since I've done very little
with Xen support since Red Hat dropped Xen development beyond our 
RHEL5 2.6.18-era release, it's always good to hear that it actually 
still worked with a 2.6.32.18 kernel.  I imagine something will
eventually break, and at that time I will likely need outside
assistance to keep Xen support in place.

Anyway, that all being said, in your failure case, here are the issues
at hand.  The header shows this:

          xc_core:
                       header:
                    xch_magic: f00febed (XC_CORE_MAGIC)
                 xch_nr_vcpus: 7
                 xch_nr_pages: 521792 (0x7f640)
              xch_ctxt_offset: 1896 (0x768)
             xch_index_offset: 2137305088 (0x7f64b000)
             xch_pages_offset: 45056 (0xb000)
                    elf_class: ELFCLASS64
            elf_strtab_offset: 2145653760 (0x7fe41400)
               format_version: 0000000000000001
           shared_info_offset: 38072 (0x94b8)

The "xch_nr_pages" value indicates that the domU vmlinux kernel has 521792
pseudo-physical pages assigned to it.  The Xen hypervisor backs those
pseudo-physical pages with machine pages, which are the "real"
physical pages.  So when the crash utility needs to access a
pseudo-physical page used by a domU kernel, that pseudo-physical page
needs to be translated to the machine page that backs it, and then
that machine page needs to be found in the dumpfile.  The PFNs (page
frame numbers) of the pseudo-physical pages are called "pfns" and the
PFNs of the machine pages are called "mfns" or "gmfns".

To match a pfn with its corresponding mfn, the kdump operation dumps an 
array of pfn-to-mfn pairs in the vmcore's ".xen_p2m" section, as described in
http://www.sfr-fresh.com/unix/misc/xen-4.0.1.tar.gz:a/xen-4.0.1/docs/misc/dump-core-format.txt

 ".xen_p2m" section
        name            ".xen_p2m"
        type            SHT_PROGBITS
        structure       array of struct xen_dumpcore_p2m
                        struct xen_dumpcore_p2m {
                            uint64_t    pfn;
                            uint64_t    gmfn;
                        };

        description
                This elements represents the frame number of the page
                in .xen_pages section.
                        pfn:    guest-specific pseudo-physical frame number
                        gmfn:   machine physical frame number
                The size of arrays is stored in xch_nr_pages member of header
                note descriptor in .note.Xen note section.
                The entryies are stored in pfn-ascending order.
                This section must exist when the domain is non auto
                translated physmap mode. Currently x86 paravirtualized domain.

The "pfn" value associated with the "gmfn" value is in turn used
as an index into an array of actual pages in the dumpfile, which is
found at the "xch_pages_offset" of 45056 (0xb000).

The start of the index array is found in the dumpfile at the "xch_index_offset"
of 2137305088 (0x7f64b000), and it ends at the "elf_strtab_offset" of 2145653760
(0x7fe41400).  Accordingly, if you subtract 2137305088 from 2145653760,
the array of xen_dumpcore_p2m structures is 8348672 bytes, which when
divided by the size of the data structure (16 bytes) equals the value of
"xch_nr_pages", or 521792.

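As a quick cross-check of that arithmetic, here is a minimal sketch in C.
The struct layout is copied from the dump-core-format.txt excerpt above; the
helper name p2m_index_entries() is hypothetical and is not part of the crash
sources.

```c
#include <assert.h>
#include <stdint.h>

/* Layout as given in the Xen dump-core-format.txt excerpt quoted above. */
struct xen_dumpcore_p2m {
    uint64_t pfn;   /* guest-specific pseudo-physical frame number */
    uint64_t gmfn;  /* machine physical frame number */
};

/*
 * Hypothetical helper (not from the crash sources): number of whole
 * xen_dumpcore_p2m entries that fit between the start of the index
 * array and the start of the ELF string table.
 */
static uint64_t p2m_index_entries(uint64_t index_offset, uint64_t strtab_offset)
{
    assert(strtab_offset >= index_offset);
    return (strtab_offset - index_offset) / sizeof(struct xen_dumpcore_p2m);
}
```

With the header values from this dumpfile, p2m_index_entries(0x7f64b000,
0x7fe41400) evaluates to 521792, which matches "xch_nr_pages".
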
Anyway, the very first read attempt requires the crash utility to do a 
one-time-only recreation of the kernel's "p2m_top" array (pvops kernels only),
and in so doing it first needs to read the page referenced by the hypervisor's
cr3 register, which contains a machine address:

<readmem: ffffffff81614800, KVADDR, "kernel_config_data", 32768, (ROE), 2fed090>
        addr: ffffffff81614800  paddr: 1614800  cnt: 2048
    GETBUF(248 -> 0)
    FREEBUF(0)
    MEMBER_OFFSET(vcpu_guest_context, ctrlreg): 4984
    ctrlreg[0]: 80050033
    ctrlreg[1]: d5742000
    ctrlreg[2]: 0
    ctrlreg[3]: d5743000
    ctrlreg[4]: 2660
    ctrlreg[5]: 0
    ctrlreg[6]: 0
    ctrlreg[7]: 0
    crash: cannot find mfn 874307 (0xd5743) in page index
   
    crash: cannot read/find cr3 page

The cr3 register (ctrlreg[3]) contained a machine address of d5743000, which
when shifted right by 12 bits equates to a PFN (or "mfn") of 874307 (0xd5743).
The crash utility then walked through the index array of xen_dumpcore_p2m
structures in the dumpfile, looking for the one that contains that "gmfn" value.

But for whatever reason, it could not find it.  That being the
case, there's no way it can continue.
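
The lookup described above can be sketched roughly as follows.  This is not
the actual xc_core_mfn_to_page() code: the names cr3_to_mfn() and find_gmfn(),
the in-memory array argument, and the 4KB page size (a shift of 12) are
assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12  /* assumption: 4KB pages, as on x86_64 */

/* Layout as given in the Xen dump-core-format.txt excerpt. */
struct xen_dumpcore_p2m {
    uint64_t pfn;
    uint64_t gmfn;
};

/* Convert the machine address held in cr3 to a machine frame number. */
static uint64_t cr3_to_mfn(uint64_t cr3)
{
    return cr3 >> PAGE_SHIFT_4K;
}

/*
 * Hypothetical linear scan of the index for an entry whose gmfn matches.
 * Returns the array position of the match, or -1 if no entry matches,
 * which corresponds to the "cannot find mfn" case.
 */
static long find_gmfn(const struct xen_dumpcore_p2m *index, size_t nr_pages,
                      uint64_t mfn)
{
    assert(index != NULL || nr_pages == 0);
    for (size_t i = 0; i < nr_pages; i++) {
        if (index[i].gmfn == mfn)
            return (long)i;
    }
    return -1;
}
```

Because the entries are stored in pfn-ascending order rather than sorted by
gmfn, a search by gmfn has to scan the array; a binary search would only be
possible when searching by pfn.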

I can't really help much more than that.  The function that
walks through the array is xc_core_mfn_to_page() in xendump.c.
It prints the "cannot find mfn ..." message, and returns back
to the x86_64_pvops_xendump_p2m_create() function in x86_64.c,
which prints the final, fatal, "cannot read/find cr3 page"
message.

If you capture the same type of debug output with the earlier 
kernel, you should see it get to the point above and continue
on from there.

Dave



