
Re: [libvirt] [PATCH] nodeinfo: Add workaround if kernel reports bogous numa topology.



On 10/30/12 21:08, Eric Blake wrote:
On 10/30/2012 05:07 AM, Peter Krempa wrote:
Forwarding the response from George-Cristian Bîrzan, who initially
reported the issue:

George-Cristian has a bunch of identical machines where some report
having 4 NUMA cells and some just 1:

[...]

I did it on two hosts, one with 1 NUMA cell, one with 4 (as I said before,
they both only report 12 cores though):

http://birzan.org/proc1.png
http://birzan.org/proc4.png

Both bitmaps show all 24 cores, so hwloc is able to read sysfs and
determine the existence of 2 sockets with 12 nodes each, and where the
12 nodes are numbered 0-5 twice according to which bank of cache they
are tied to.  Which version of libvirt was this tested on, where libvirt
was only reporting 12 cores?  I thought we already patched that
with commit 80533ca in 0.10.0.  That is, I think proc1.png should result in:

Version of libvirt in case of proc1.png is 0.10.2 from fedora 17.


$ virsh nodeinfo
     CPU model:           x86_64
     CPU(s):              24
     CPU frequency:       2200 MHz
     CPU socket(s):       2
     Core(s) per socket:  12
     Thread(s) per core:  1
     NUMA cell(s):        1
     Memory size:         8047272 KiB

Yes, that's what we should report in this case. The problem here is that some of the cores have duplicate IDs (the hardware topology is in fact identical to the one in the second image), and we are not able to detect that correctly.

We would need a third level of hierarchy, where we would detect threads, to be able to detect duplicate IDs reliably.
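To illustrate the duplicate-ID problem described above, here is a minimal Python sketch. It does not reflect the actual code in nodeinfo.c. The `Cpu` record, `make_host()` and both counting helpers are hypothetical, and the host layout (2 sockets, core IDs 0-5 repeated twice per socket, 1 thread per core) is an assumption based on the description of the images in this thread.

```python
from collections import namedtuple

# Hypothetical record: one logical CPU with the IDs sysfs would expose.
Cpu = namedtuple("Cpu", "cpu socket_id core_id siblings")


def make_host():
    """Build an assumed 24-CPU host: 2 sockets, core IDs 0-5 used twice each."""
    cpus = []
    n = 0
    for socket in range(2):
        for _bank in range(2):          # two banks per socket reuse the same IDs
            for core in range(6):
                # One thread per core, so each CPU is its own only sibling.
                cpus.append(Cpu(n, socket, core, frozenset([n])))
                n += 1
    return cpus


def count_cores_by_id(cpus):
    """Two-level count: distinct (socket, core) pairs. Duplicates collapse."""
    return len({(c.socket_id, c.core_id) for c in cpus})


def count_cores_by_siblings(cpus):
    """Three-level count: distinct thread-sibling sets, one per real core."""
    return len({c.siblings for c in cpus})


host = make_host()
print(count_cores_by_id(host), count_cores_by_siblings(host))
```

With this assumed layout, counting by (socket, core) pairs yields 12 cores, while grouping by thread siblings yields 24.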


and proc4.png would _ideally_ result in:

$ virsh nodeinfo
     CPU model:           x86_64
     CPU(s):              24
     CPU frequency:       2200 MHz
     CPU socket(s):       2
     Core(s) per socket:  12
     Thread(s) per core:  1
     NUMA cell(s):        4
     Memory size:         8047272 KiB


Unfortunately, the output fields are: number of NUMA nodes, number of sockets per NUMA node, number of cores per socket, and number of threads per core. So the correct output should be:

4 nodes, 1 socket, 6 cores, 1 thread

except that virNodeGetInfo() is constrained by backwards compatibility
to report 'nodes' == 1 on situations where sockets per node is not
integral (and here, half a socket per node is not integral), so it
_actually_ would give the same data as proc1.png.

or we can use this when somebody is providing inaccurate information.
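The arithmetic in the exchange above can be sketched in Python as follows. This is only an illustration of the rule as stated in this thread (fall back to 'nodes' == 1 when sockets per node is not integral), not the actual virNodeGetInfo() implementation; `legacy_nodeinfo()` and `total_cpus()` are hypothetical helper names.

```python
def legacy_nodeinfo(numa_nodes, total_sockets, cores_per_socket, threads_per_core):
    """Return (nodes, sockets_per_node, cores_per_socket, threads_per_core).

    Assumed rule, taken from the discussion: if the sockets do not divide
    evenly among the NUMA nodes, report a single node holding all sockets.
    """
    if numa_nodes > 0 and total_sockets % numa_nodes == 0:
        return (numa_nodes, total_sockets // numa_nodes,
                cores_per_socket, threads_per_core)
    return (1, total_sockets, cores_per_socket, threads_per_core)


def total_cpus(info):
    """Product of the four fields, which should equal the logical CPU count."""
    nodes, sockets, cores, threads = info
    return nodes * sockets * cores * threads


proc1 = legacy_nodeinfo(1, 2, 12, 1)   # 1 NUMA cell, 2 sockets
proc4 = legacy_nodeinfo(4, 2, 12, 1)   # 4 NUMA cells, 2 sockets: 0.5 socket per node
alternative = (4, 1, 6, 1)             # the "4 nodes, 1 socket, 6 cores, 1 thread" reading

print(proc1, proc4, total_cpus(alternative))
```

Under this rule both hosts produce the same tuple, and the alternative reading still multiplies out to 24 CPUs.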



------

I think we should take this patch as it resolves this case. The data
reported by kernel looks OK and the kernel probably trusts that
everything is OK.

At any rate, I'm looking again at the patch, and the proposed
linux-test7/node data indeed shows a single NUMA cell with 24 cores
(matching up to the proc1.png image).

I think the CPU _is_ reporting the complete NUMA topology through sysfs,
but that we are probably consolidating information from the wrong files
and therefore getting confused.

It is, but there is a problem at other levels, since the machine should be identical to the one with 4 nodes. The problem might be in the machine firmware or in any number of other places.


I guess I need to install the linux-test7 files, then step through the
code to see what is actually happening.

To do this, you just need to patch one line in nodeinfo.c that defines the default path. That's the way I'm doing it while testing.


Also, what does the 'virsh capabilities' report for the <topology>
section?  Whereas 'virsh nodeinfo' is constrained by back-compat to give
a lame answer for number of NUMA cells, at least 'virsh capabilities'
should be showing a reasonable representation of the machine's topology.


The capabilities output showed the topology as described in the picture, although it is not correct at that level either.
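For anyone who wants to compare the <topology> section across hosts, a small Python sketch like the following could extract the cell-to-CPU mapping from saved 'virsh capabilities' output. The `SAMPLE` document is a made-up 4-CPU example for illustration only, not output from the machines in this thread, and `cells_to_cpus()` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up sample for illustration; real output contains many more elements.
SAMPLE = """
<capabilities>
  <host>
    <topology>
      <cells num='2'>
        <cell id='0'>
          <cpus num='2'>
            <cpu id='0'/>
            <cpu id='1'/>
          </cpus>
        </cell>
        <cell id='1'>
          <cpus num='2'>
            <cpu id='2'/>
            <cpu id='3'/>
          </cpus>
        </cell>
      </cells>
    </topology>
  </host>
</capabilities>
"""


def cells_to_cpus(xml_text):
    """Map each NUMA cell ID to the list of CPU IDs listed under it."""
    root = ET.fromstring(xml_text)
    result = {}
    for cell in root.findall("./host/topology/cells/cell"):
        cpu_ids = [int(cpu.get("id")) for cpu in cell.findall("./cpus/cpu")]
        result[int(cell.get("id"))] = cpu_ids
    return result


mapping = cells_to_cpus(SAMPLE)
print(mapping)
```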


Peter


