[libvirt] [PATCH v2 3/3] qemu: Enable NUMA node tag in pci-root for PPC64

Andrea Bolognani abologna at redhat.com
Thu Jul 20 15:46:58 UTC 2017


On Mon, 2017-07-17 at 21:29 +0530, Shivaprasad G Bhat wrote:
> @@ -3786,6 +3786,11 @@
>          part of the specified NUMA node (it is up to the user of the
>          libvirt API to attach host devices to the correct
>          pci-expander-bus when assigning them to the domain).
> +        On PPC64, the PCI devices can be specified to be part of a NUMA
> +        node using only the pci-root controller with an optional
> +        <code><node></code> subelement within the
> +        <code><target></code> subelement. The PCI devices on the
> +        given pci-root controller will be part of the specified NUMA node.

Instead of adding an entire new sentence here, it would make
more sense to rephrase what's already present, something like:

  Some PCI controllers (pci-expander-bus for the pc machine
  type, pcie-expander-bus for the q35 machine type and
  pci-root for the pseries machine type) can have an optional
  <node> subelement [...]

> @@ -9457,6 +9457,12 @@ virDomainControllerDefParseXML(xmlNodePtr node,
>                  goto error;
>              }
>          }
> +        if (def->idx == 0 && numaNode >= 0) {
> +            virReportError(VIR_ERR_XML_ERROR, "%s",
> +                           _("Only the PCI controller with index != 0 can "
> +                             "have NUMA node property specified"));
> +                goto error;
> +        }
>          if (numaNode >= 0)
>              def->opts.pciopts.numaNode = numaNode;

The check you're adding can be merged with the one below it.

The error message should also be reworded, something like:

  The PCI controller with index=0 can't be associated with
  a NUMA node.

> @@ -3458,9 +3458,14 @@ qemuDomainControllerDefPostParse(virDomainControllerDefPtr cont,
>           * that NUMA node is configured in the guest <cpu><numa>
>           * array. NUMA cell id's in this array are numbered
>           * from 0 .. size-1.
> +         *
> +         * On PSeries, the NUMA node is set at the PHB.

As above, reworking the existing comment would work better
than merely appending to it.

>           */
> -        if ((cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_EXPANDER_BUS ||
> -             cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_EXPANDER_BUS) &&
> +        if (((qemuDomainIsPSeries(def) &&
> +              cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI &&
> +              cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) ||
> +             (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_EXPANDER_BUS ||
> +              cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_EXPANDER_BUS)) &&
>              (int) virDomainNumaGetNodeCount(def->numa)
>              <= cont->opts.pciopts.numaNode) {

I actually don't see any point in the condition being this
complex: it could just be

  if (cont->opts.pciopts.numaNode >= 0 &&
      cont->opts.pciopts.numaNode >=
      (int) virDomainNumaGetNodeCount(def->numa))

because we've already made sure, while parsing, that numaNode
is only set for controllers that allow it.

> +++ b/tests/qemuxml2argvdata/qemuxml2argv-spapr-pci-host-bridge-numa-node.xml
> @@ -0,0 +1,54 @@
> +<domain type='qemu'>
> +  <name>QEMUGuest1</name>
> +  <uuid>87eedafe-eedc-4336-8130-ed9fe5dc90c8</uuid>
> +  <memory unit='KiB'>2097152</memory>
> +  <currentMemory unit='MiB'>2048</currentMemory>

You don't need <currentMemory>

> +  <vcpu placement='static'>8</vcpu>
> +  <numatune>
> +    <memory mode='strict' nodeset='1'/>
> +  </numatune>
> +  <cpu>
> +    <topology sockets='3' cores='1' threads='8'/>
> +    <numa>
> +      <cell id='0' cpus='0-3' memory='1048576' unit='KiB'/>
> +      <cell id='1' cpus='4-7' memory='1048576' unit='KiB'/>
> +    </numa>
> +  </cpu>
> +  <os>
> +    <type arch='ppc64' machine='pseries'>hvm</type>
> +    <boot dev='hd'/>

<boot> is unnecessary.

> +  </os>
> +  <clock offset='utc'/>
> +  <on_poweroff>destroy</on_poweroff>
> +  <on_reboot>restart</on_reboot>
> +  <on_crash>destroy</on_crash>

<clock> and <on_*> can be removed.

> +  <devices>
> +    <emulator>/usr/bin/qemu-system-ppc64</emulator>
> +    <disk type='block' device='disk'>
> +      <driver name='qemu' type='raw'/>
> +      <source dev='/dev/HostVG/QEMUGuest1'/>
> +      <target dev='hda' bus='scsi'/>
> +      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
> +    </disk>

<disk> is irrelevant for the test case at hand.

> +    <controller type='usb' index='0'/>

The model should be 'none'.

> +    <controller type='scsi' index='0'/>

The SCSI controller is also not useful here.

> +    <controller type='pci' index='0' model='pci-root'>
> +      <target index='0'/>
> +    </controller>
> +    <controller type='pci' index='1' model='pci-root'>
> +      <target index='1'>
> +        <node>1</node>
> +      </target>
> +    </controller>
> +    <controller type='pci' index='2' model='pci-root'>
> +      <target index='2'/>
> +    </controller>
> +    <controller type='pci' index='3' model='pci-root'>
> +      <target index='3'>
> +        <node>0</node>
> +      </target>
> +    </controller>
> +    <memballoon model='none'/>
> +    <panic model='pseries'/>

You can omit <panic>.

> +++ b/tests/qemuxml2argvtest.c
> @@ -2739,6 +2739,9 @@ mymain(void)
>      DO_TEST_PARSE_ERROR("cpu-cache-emulate-l2", QEMU_CAPS_KVM);
>      DO_TEST_PARSE_ERROR("cpu-cache-passthrough3", QEMU_CAPS_KVM);
>      DO_TEST_PARSE_ERROR("cpu-cache-passthrough-l3", QEMU_CAPS_KVM);
> +    DO_TEST("spapr-pci-host-bridge-numa-node", QEMU_CAPS_NUMA,
> +            QEMU_CAPS_DEVICE_SPAPR_PCI_HOST_BRIDGE,
> +            QEMU_CAPS_SPAPR_PCI_HOST_BRIDGE_NUMA_NODE);

This should be moved up to be close to the pserie-phb* test
cases, and renamed to something like 'pseries-phb-numa-node'.

There should be one capability per line.

You also need to have an identical test case in qemuxml2xml,
and at least one negative test to show that trying to
assign the default PHB to a NUMA node will result in failure.

-- 
Andrea Bolognani / Red Hat / Virtualization




More information about the libvir-list mailing list