[Crash-utility] dom0 analysis for IA64

Dave Anderson anderson at redhat.com
Fri May 11 17:57:29 UTC 2007


Isaku Yamahata wrote:

> Hi Dave.
> I think I can explain it.
>
> Sometimes xen needs to share pages with dom0:
> for example shared_info, grant table pages, another domain's pages,
> and so on.
> In such a case, Xen/IA64 puts those pages in the dom0 pseudo-physical
> address space, i.e. it updates dom0's p2m table so that dom0 can
> access those pages.
> The pseudo-physical addresses are either predefined or given by xen or dom0.
> Currently shared_info is assigned the pseudo-physical address
> 1UL << 40 = 1TB.
> This corresponds to the following entry:
> >     f00000007d8b0080:  000000007f428000 0000000000000000   ..B.............
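>
> [Worked out, assuming the usual ia64 16KB pages and 2048 8-byte
> entries per p2m level: one third-level frame maps 2048 * 16KB = 32MB,
> and one pgd entry maps 2048 * 32MB = 64GB, so 1TB falls in pgd slot
> 1TB / 64GB = 16, i.e. byte offset 16 * 8 = 0x80, which is exactly the
> f00000007d8b0080 entry quoted above.]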
>
> Dom0 controls devices, so it needs to access the I/O area.
> For that purpose, the dom0 p2m table has an entry that points to the
> I/O area such that dom0 pseudo-physical address = machine address.
> I guess that the following entry corresponds to the I/O area:
> >     f00000007d8b07f0:  0000000000000000 000000007bed4000   .........@.{....
> To confirm this, the native linux /proc/iomem is necessary.
>
> thanks.
>

OK, thanks for that explanation...

It *still* seems to be a huge waste of memory.  Taking the
example dump, the 1GB of "normal" memory requires 32 p2m_mfn
values for address translation, plus -- if I understand you
correctly -- 1 for the shared_info, plus 1 for the I/O area.
That's a total of 34 8-byte values, or 272 bytes, whereas
this first patch uses 524288 entries, or 4MB of memory.
It seems to me there should be a better way to handle it,
even if those two particular pseudo-physical regions are
"special-cased" for ia64.

But, in any case, Itsuro, can you do what is possible with
your patch, and re-submit it?

Thanks guys,
  Dave


>
> On Fri, May 11, 2007 at 10:02:39AM -0400, Dave Anderson wrote:
> > Itsuro ODA wrote:
> >
> >     Hi Dave,
> >
> >     > This all sounds good, and I agree that the p2m_mfn should
> >     > be added to the ia64 XEN_ELFNOTE_CRASH_INFO.
> >     >
> >     > However, there's something incorrect in your calculation of
> >     > "xkd->p2m_frames" in your ia64_xen_kdump_p2m_create() implementation.
> >     > It looks like it should be 32, but it's set to 524288.  As a result
> >     > that wastes a lot of memory, and "help -n" is pretty much unusable
> >     > since it wants to dump all ~512k entries:
> >
> >     This is because of IA64's pseudo-physical memory map (which is
> >     specific to domains on xen).
> >
> >     The phys-to-machine mapping is managed as a 3-level page table.
> >     The pgd looks like:
> >     -------------------------------------------------------------
> >     crash> doms
> >        DID       DOMAIN      ST T  MAXPAGE  TOTPAGE VCPU     SHARED_I  P2M_MFN
> >       32753 f000000007dac080 ?? O     0        0      0          0       ----
> >       32754 f000000007ff0080 ?? X     0        0      0          0       ----
> >       32767 f000000007ff4080 ?? I     0        0      1          0       ----
> >     >*    0 f000000007da4080 ?? 0   10000    f986     1  f000000007d90000   1f62c
> >
> >     crash> domain f000000007da4080
> >     struct domain {
> >       domain_id = 0,
> >       shared_info = 0xf000000007d90000,
> >     ...
> >       arch = {
> >         mm = {
> >           pgd = 0xf00000007d8b0000
> >         },
> >     ...
> >     crash> rd 0xf00000007d8b0000 256
> >     f00000007d8b0000:  000000007c8d8000 0000000000000000   ...|............
> >     f00000007d8b0010:  0000000000000000 0000000000000000   ................
> >     f00000007d8b0020:  0000000000000000 0000000000000000   ................
> >     f00000007d8b0030:  0000000000000000 0000000000000000   ................
> >     f00000007d8b0040:  0000000000000000 0000000000000000   ................
> >     f00000007d8b0050:  0000000000000000 0000000000000000   ................
> >     f00000007d8b0060:  0000000000000000 0000000000000000   ................
> >     f00000007d8b0070:  0000000000000000 0000000000000000   ................
> >     f00000007d8b0080:  000000007f428000 0000000000000000   ..B.............
> >     f00000007d8b0090:  0000000000000000 0000000000000000   ................
> >     ...
> >     f00000007d8b07c0:  0000000000000000 0000000000000000   ................
> >     f00000007d8b07d0:  0000000000000000 0000000000000000   ................
> >     f00000007d8b07e0:  0000000000000000 0000000000000000   ................
> >     f00000007d8b07f0:  0000000000000000 000000007bed4000   .........@.{....
> >     -------------------------------------------------------------------------
> >     (256 * 2048 = 524288)
> >
> >     It is certain that (pseudo-)physical memory at "256GB-" and "-4TB" exists.
> >     These areas are shared by domain-0 and the xen hypervisor.
> >     These areas need to be accessible in a dom0 analysis session.
> >
> >     (I said:)
> >     > > But this patch is a bit tricky. And the memory usage is
> >     > > large if the machine memory layout is sparse.
> >
> >     That was wrong.  It should be "the memory usage is large if
> >     the pseudo-physical memory layout is sparse."
> >     And it is actually always sparse...
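> >
> >     [And concretely, that is where the 524288 above comes from:
> >     the last non-zero pgd slot visible in the dump is at byte
> >     offset 0x7f8, i.e. slot 255, so a flat array sized to cover
> >     slots 0 through 255 needs 256 * 2048 = 524288 third-level
> >     frames, even though the 1GB of real dom0 pseudo-physical
> >     memory only needs 32 of them.]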
> >
> >     Thanks.
> >
> >
> > Hi Itsuro,
> >
> > I now understand the difference: the 3rd-level p2m
> > frame contents are page table entries instead of mfn
> > values.
> >
> > However, I still do not understand what you mean regarding
> > the concept of the pseudo-physical memory being "sparse".
> > Looking at the dumpfile again, it appears to have the same
> > type of flat pseudo-physical memory layout as the
> > other architectures.
> >
> > Dom0 has ~1GB of pseudo-physical memory:
> >
> > crash> sys
> >       KERNEL: ../20070510-sample-dump-2/vmlinux-xen-ia64
> >     DUMPFILE: ../20070510-sample-dump-2/vmcore.tiger.iomem_machine
> >         CPUS: 1
> >         DATE: Mon May  7 04:07:43 2007
> >       UPTIME: 00:01:47
> > LOAD AVERAGE: 0.11, 0.04, 0.01
> >        TASKS: 21
> >     NODENAME: (none)
> >      RELEASE: 2.6.18-xen
> >      VERSION: #3 SMP Mon May 7 13:14:41 JST 2007
> >      MACHINE: ia64  (1296 Mhz)
> >       MEMORY: 1 GB
> >        PANIC: "SysRq : Trigger a crashdump"
> > crash>
> >
> > And as far as dom0's VM is concerned, its memory map only knows
> > about the 64512 pages in DMA zone 0:
> >
> > crash> kmem -n
> > NODE    SIZE      PGLIST_DATA       BOOTMEM_DATA       NODE_ZONES
> >   0    64512    a000000100482f80  a000000100608950  a000000100482f80
> >                                                     a000000100483500
> >                                                     a000000100483a80
> >                                                     a000000100484000
> >     MEM_MAP       START_PADDR  START_MAPNR
> > e0000000010b0000       0            0
> >
> > ZONE  NAME         SIZE       MEM_MAP      START_PADDR  START_MAPNR
> >   0   DMA         64512  e0000000010b0000            0            0
> >   1   DMA32           0                 0            0            0
> >   2   Normal          0                 0            0            0
> >   3   HighMem         0                 0            0            0
> > crash>
> >
> > So the "end of memory" would be just below 1GB:
> >
> > crash> eval 64512 * 16k
> > hexadecimal: 3f000000  (1008MB)
> >     decimal: 1056964608
> >       octal: 7700000000
> >      binary: 0000000000000000000000000000000000111111000000000000000000000000
> > crash>
> >
> > So, with respect to dom0, how would it ever go beyond 32
> > p2m_frames?  Putting a debug printf in xen_kdump_p2m
> > shows this:
> >
> > crash> rd -p 3f000000
> > xen_kdump_p2m: mfn_idx for 3f000000: 31
> >         3f000000:  0000000000000000                    ........
> > crash>
> >
> > So that shows that there only needs to be 32 p2m_frames
> > for accessing all of dom0 pseudo-physical memory.
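> >
> > [For reference, the arithmetic behind that, with the same
> > 16KB-page / 2048-entry frame geometry as above: one p2m frame
> > covers 2048 * 16KB = 32MB, and dom0's 1008MB of pseudo-physical
> > memory spans 1008 / 32 = 31.5 frames, i.e. frame indexes 0
> > through 31, which matches the mfn_idx of 31 printed for
> > 3f000000 above.]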
> >
> > But it also shows that you are allowing access to memory
> > that is *beyond* the end of dom0 pseudo-physical memory,
> > since 3f000000 should not be readable.  There is no
> > page structure associated with 3f000000:
> >
> > crash> kmem -p | tail
> > e000000001421dd0 3efd8000      -------       -----   1 0
> > e000000001421e08 3efdc000      -------       -----   1 0
> > e000000001421e40 3efe0000      -------       -----   1 60
> > e000000001421e78 3efe4000      -------       -----   1 60
> > e000000001421eb0 3efe8000      -------       -----   1 60
> > e000000001421ee8 3efec000      -------       -----   1 60
> > e000000001421f20 3eff0000      -------       -----   2 0
> > e000000001421f58 3eff4000      -------       -----   1 80
> > e000000001421f90 3eff8000      -------       -----   1 80
> > e000000001421fc8 3effc000      -------       -----   1 80
> > crash>
> >
> > By doing a few other "rd -p" commands, I see that you seem
> > to be allowing memory accesses based upon what's in the ELF
> > header PT_LOAD segments, which are "machine" physical memory
> > descriptors:
> >
> > crash> help -n | grep phys_end
> >                phys_end: 1000
> >                phys_end: 7000
> >                phys_end: 9000
> >                phys_end: 82000
> >                phys_end: 85000
> >                phys_end: a0000
> >                phys_end: 4000000
> >                phys_end: 81b3000
> >                phys_end: ffc0000
> >                phys_end: 10000000
> >                phys_end: 7ab06000
> >                phys_end: 7c8d2000
> >                phys_end: 7c92e000
> >                phys_end: 7c938000
> >                phys_end: 7c97e000
> >                phys_end: 7cdf6000
> >                phys_end: 7cdfc000
> >                phys_end: 7ce2a000
> >                phys_end: 7d001000
> >                phys_end: 7d002000
> >                phys_end: 7d044000
> >                phys_end: 7d045000
> >                phys_end: 7d37e000
> >                phys_end: 7d700000
> >                phys_end: 7d77e000
> >                phys_end: 7d8b4000
> >                phys_end: 7f980000
> >                phys_end: 7fa00000
> >                phys_end: 7feda000
> > crash>
> >
> > So it appears that the physical machine running the
> > dom0 and hypervisor has almost 2GB of "real" physical
> > memory.  And if I try to read the limit address of
> > 7feda000, it fails:
> >
> > crash> rd -p 7feda000
> > xen_kdump_p2m: mfn_idx for 7feda000: 63
> > rd: read error: physical address: 7feda000  type: "64-bit PHYSADDR"
> > crash>
> >
> > But the last page of physical memory can be read:
> >
> > crash> rd -p 7fed9000
> > xen_kdump_p2m: mfn_idx for 7fed9000: 63
> >         7fed9000:  000000007f9da0a0                    ........
> > crash>
> >
> > "rd -p" is supposed to read pseudo-physical memory in xen
> > kernels, but it seems to be allowing reads based upon the
> > PT_LOAD segment contents?  In other words, it seems to
> > be mixing dom0 pseudo-physical memory and the system's
> > machine memory, because 7fed9000 is not a legitimate dom0
> > pseudo-physical address.
> >
> > (And even with that happening, the maximum p2m_frame index
> > is still only 63 -- how can it ever be 512k with respect
> > to dom0's pseudo-physical memory?)
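> >
> > [That 63 is also consistent with 32MB per p2m frame: the last
> > readable machine page, 7fed9000, is just under 2GB, and
> > 0x7fed9000 / 32MB = 63.  So the index range appears to be driven
> > by the machine memory in the PT_LOAD segments rather than by
> > dom0's 1GB of pseudo-physical memory, which would stop at 31.]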
> >
> > So I'm sorry, but this does not make sense to me...
> >
> > Dave
> >
> >
> >
>
>
> --
> yamahata
>