[libvirt] [PATCH] nodeinfo: Add workaround if kernel reports bogous numa topology.

Peter Krempa pkrempa at redhat.com
Tue Oct 30 21:14:10 UTC 2012


On 10/30/12 21:08, Eric Blake wrote:
> On 10/30/2012 05:07 AM, Peter Krempa wrote:
>> Forwarding the response from George-Cristian Bîrzan who initially
>> reported this:
>>
>> George-Cristian has a bunch of identical machines where some report
>> having 4 NUMA cells and some just 1:
>>
>> [...]
>>
>> I did it on two hosts, one with 1 NUMA cell, one with 4 (as I said before,
>> they both only report 12 cores though):
>>
>> http://birzan.org/proc1.png
>> http://birzan.org/proc4.png
>
> Both bitmaps show all 24 cores, so hwloc is able to read sysfs and
> determine the existence of 2 sockets with 12 nodes each, and where the
> 12 nodes are numbered 0-5 twice according to which bank of cache they
> are tied to.  Which version of libvirt is this tested on where libvirt
> was only reporting 12 cores, because I thought we already patched that
> with commit 80533ca in 0.10.0.  That is, I think proc1.png should result in:

The libvirt version in the proc1.png case is 0.10.2 from Fedora 17.

>
> $ virsh nodeinfo
>      CPU model:           x86_64
>      CPU(s):              24
>      CPU frequency:       2200 MHz
>      CPU socket(s):       2
>      Core(s) per socket:  12
>      Thread(s) per core:  1
>      NUMA cell(s):        1
>      Memory size:         8047272 KiB

Yes, that's what we should report in this case. The problem here is that,
because the hardware topology is in fact identical to the one in the second
image, some of the cores have duplicate IDs and we are not able to detect
that correctly.

We would need a third level of hierarchy, in which we would detect threads,
to be able to detect duplicate IDs reliably.
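
To illustrate why duplicate IDs hide cores, here is a minimal sketch. It is
not the code in nodeinfo.c; the function name and the input layout (one
(socket_id, core_id) pair per logical CPU) are assumptions made for the
example. The data mimics the layout described above: 2 sockets, each with
12 CPUs whose core IDs run 0-5 twice.

```python
def count_cores_naive(cpus):
    """Count cores as the number of distinct (socket_id, core_id) pairs.

    Hypothetical helper: CPUs that share both IDs are indistinguishable
    here from hyperthreads of a single core.
    """
    return len(set(cpus))


def example_cpus():
    """Build 24 logical CPUs: 2 sockets, core IDs 0-5 repeated twice each."""
    return [
        (socket, core)
        for socket in range(2)
        for _repeat in range(2)
        for core in range(6)
    ]


cpus = example_cpus()
print(len(cpus))                # logical CPUs present
print(count_cores_naive(cpus))  # cores seen when deduplicating by ID pair
```

With only two levels (socket, core) the 24 CPUs collapse into 12 distinct
pairs, which matches the 12 cores being reported. Telling real threads apart
from bogus duplicates needs the extra level mentioned above.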

>
> and proc4.png would _ideally_ result in:
>
> $ virsh nodeinfo
>      CPU model:           x86_64
>      CPU(s):              24
>      CPU frequency:       2200 MHz
>      CPU socket(s):       2
>      Core(s) per socket:  12
>      Thread(s) per core:  1
>      NUMA cell(s):        4
>      Memory size:         8047272 KiB
>

Unfortunately, the fields in the output are: number of NUMA nodes, number of
sockets per NUMA node, number of cores per socket and number of threads per
core. So the correct output should be:

4 nodes, 1 socket, 6 cores, 1 thread
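
As a quick check of that reading of the fields, the four values should
multiply out to the number of CPUs (the VIR_NODEINFO_MAXCPUS macro in
libvirt.h computes the same product). The sketch below is only an
illustration; the function name is made up for the example.

```python
def max_cpus(nodes, sockets_per_node, cores_per_socket, threads_per_core):
    """Product of the four topology fields, i.e. the implied CPU count."""
    return nodes * sockets_per_node * cores_per_socket * threads_per_core


# 4 nodes, 1 socket per node, 6 cores per socket, 1 thread per core
print(max_cpus(4, 1, 6, 1))

# 4 nodes combined with 2 sockets and 12 cores per socket
print(max_cpus(4, 2, 12, 1))
```

The first call gives 24, which matches the machine. The second gives 96,
which is why 4 cells cannot simply be combined with 2 sockets of 12 cores.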

> except that virNodeGetInfo() is constrained by backwards compatibility
> to report 'nodes' == 1 on situations where sockets per node is not
> integral (and here, half a socket per node is not integral), so it
> _actually_ would give the same data as proc1.png.

Or we can use this when somebody is providing inaccurate information.
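
The fallback Eric describes can be sketched roughly like this. It is an
illustration under stated assumptions, not the actual virNodeGetInfo()
implementation: the function name, argument order and return tuple are
hypothetical.

```python
def compat_topology(numa_nodes, total_sockets, cores_per_socket, threads_per_core):
    """Return (nodes, sockets_per_node, cores, threads) for reporting.

    Assumption for this sketch: if the sockets do not divide evenly among
    the NUMA nodes, fall back to a single node holding all sockets.
    """
    if numa_nodes > 0 and total_sockets % numa_nodes == 0:
        return (numa_nodes, total_sockets // numa_nodes,
                cores_per_socket, threads_per_core)
    return (1, total_sockets, cores_per_socket, threads_per_core)


# 2 sockets over 4 NUMA nodes: half a socket per node, so fall back
print(compat_topology(4, 2, 12, 1))

# 2 sockets over 2 NUMA nodes: one socket per node, reported as-is
print(compat_topology(2, 2, 12, 1))
```

In the 4-node case this yields the same 1 node, 2 sockets, 12 cores, 1 thread
as the proc1.png machine.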
>
>
>>
>> ------
>>
>> I think we should take this patch as it resolves this case. The data
>> reported by kernel looks OK and the kernel probably trusts that
>> everything is OK.
>
> At any rate, I'm looking again at the patch, and the proposed
> linux-test7/node data indeed shows a single NUMA cell with 24 cores
> (matching up to the proc1.png image).
>
> I think the CPU _is_ reporting the complete NUMA topology through sysfs,
> but that we are probably consolidating information from the wrong files
> and therefore getting confused.

It is, but there's a problem at another level, since this machine should be
identical to the one with 4 nodes. The cause might be in the machine
firmware or in a ton of other places.

>
> I guess I need to install the linux-test7 files, then step through the
> code to see what is actually happening.

To do this, you just need to patch one line in nodeinfo.c that defines 
the default path. That's the way I'm doing it while testing.

>
> Also, what does the 'virsh capabilities' report for the <topology>
> section?  Whereas 'virsh nodeinfo' is constrained by back-compat to give
> a lame answer for number of NUMA cells, at least 'virsh capabilities'
> should be showing a reasonable representation of the machine's topology.


'virsh capabilities' was showing the topology as depicted in the picture,
although it's not correct at that level either.


Peter




