[vfio-users] vga passthrough on server boards

Alex Williamson alex.williamson at redhat.com
Wed Oct 12 18:08:10 UTC 2016


On Wed, Oct 12, 2016 at 11:50 AM, Bronek Kozicki <brok at spamcop.net> wrote:

> On 12/10/2016 18:04, Alex Williamson wrote:
> > On Wed, Oct 12, 2016 at 10:51 AM, Ethan Thomas <thomas.ethan at gmail.com> wrote:
> >
> > I've been using a SuperMicro X8DTH-iF for quite some time with no
> > problems. However, it's worth noting that with some generations of
> > multi-CPU boards the PCI-E lanes and RAM may be associated with a
> > specific CPU, so you may need to adjust which slots and cores you
> > associate with a particular VM for best performance.
> >
> >
> > I would go so far as to say this is true of any modern multi-socket
> > system; it's called NUMA, Non-Uniform Memory Access. You can use tools
> > like lstopo to identify the locality of memory, devices, and
> > processors. Using memory and CPU from the correct node is important for
> > a VM, and an assigned device should be an extra pull towards the node
> > where the I/O is local.
> Hi Alex
>
> Memory locality is one thing I totally forgot about when setting up my
> VMs. Do you have any example how to reserve huge pages on a specific
> node via sysctl, and refer to it later in libvirt configuration?
> Currently I only use this:
>
> ~ # cat /etc/sysctl.d/80-hugepages.conf
> # Reserve this many 2MB pages for virtual machines
> vm.nr_hugepages = 28000
>
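
For locating things with lstopo, as mentioned above: it ships in the hwloc
package, and numactl --hardware gives a similar text summary.  A quick
sketch, assuming both packages are installed:

~ # lstopo-no-graphics | less    # per-node CPUs, memory, and PCI devices
~ # numactl --hardware           # nodes with their CPUs and memory sizes

The PCI tree in lstopo's output shows which node an assigned GPU sits under.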

It's been a while since I've actively played with this, but I believe part
of the key is to configure hugepages via sysfs rather than /proc/sys (which
is what sysctl writes to).  IIRC, /proc/sys/vm/nr_hugepages will by default
round-robin the allocation across nodes.  Instead you want to
use /sys/devices/system/node/node#/hugepages/ (replace # with the node
number).  This lets you allocate on a specific node and also choose which
hugepage size to allocate.
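
As a sketch (hypothetical count; 8192 2MB pages matches the 16GB VM below,
and I'm assuming it should live on node 1):

~ # echo 8192 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
~ # cat /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
8192

The cat may come back lower than requested if the node didn't have enough
free contiguous memory, and these writes don't survive a reboot, so you'd
re-run them from an init script or similar.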


> ~ # grep hugepages /etc/mtab
> hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
>
> ~ # grep hugepages /etc/libvirt/qemu.conf | grep -vE "^#"
> hugetlbfs_mount = "/dev/hugepages"
>
> ~ # virsh dumpxml lublin-vfio1 | head -11
> <domain type='kvm' id='3'>
> <name>lublin-vfio1</name>
> <uuid>bc578734-6a43-4fda-9b19-e43225007a83</uuid>
> <memory unit='KiB'>16777216</memory>
> <currentMemory unit='KiB'>16777216</currentMemory>
> <memoryBacking>
> <hugepages>
> <page size='2048' unit='KiB'/>
> </hugepages>
> <nosharepages/>
> </memoryBacking>
>
> ... which entirely ignores memory locality. TIA!


Libvirt's domain XML documentation has a section on NUMA tuning:
http://libvirt.org/formatdomain.html#elementsNUMATuning

For a simple configuration where the VM fits within a single host NUMA
node, you'd want to add something like:

  <numatune>
    <memory mode="strict" nodeset="1"/>
  </numatune>

That should allocate all VM memory from NUMA node 1.  Use the sysfs paths
above to verify that the free hugepage count goes down on the intended node
when the VM starts.  If your VM spans NUMA nodes, things get a lot more
complicated and you'll need to fiddle with <memnode> elements here and with
nodesets in the <hugepages> setup.
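
For example, to watch the per-node counters while the VM starts (2MB pages
again):

~ # grep . /sys/devices/system/node/node*/hugepages/hugepages-2048kB/free_hugepages

And purely as an illustrative sketch of the spanning case (the cellids must
match <cell> elements defined under <cpu><numa> for the guest, and the right
nodesets depend on your host):

  <numatune>
    <memory mode="strict" nodeset="0-1"/>
    <memnode cellid="0" mode="strict" nodeset="0"/>
    <memnode cellid="1" mode="strict" nodeset="1"/>
  </numatune>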