[libvirt] [PATCH v4 2/5] numa: describe siblings distances within cells

Wim Ten Have wim.ten.have at oracle.com
Fri Sep 8 14:47:15 UTC 2017


From: Wim ten Have <wim.ten.have at oracle.com>

Add libvirtd NUMA cell domain administration functionality to
describe underlying cell id sibling distances in full fashion
when configuring HVM guests.

Schema updates are made to docs/schemas/cputypes.rng enforcing domain
administration to follow the syntax below the numa cell id and
docs/schemas/basictypes.rng to add "numaDistanceValue".

A minimum value of 10 representing the LOCAL_DISTANCE as 0-9 are
reserved values and can not be used as System Locality Distance Information.
A value of 20 represents the default setting of REMOTE_DISTANCE
where a maximum value of 255 represents UNREACHABLE.

Effectively any cell sibling can be assigned a distance value where
practically 'LOCAL_DISTANCE <= value <= UNREACHABLE'.

[below is an example of a 4 node setup]

  <cpu>
    <numa>
      <cell id='0' cpus='0' memory='2097152' unit='KiB'>
        <distances>
          <sibling id='0' value='10'/>
          <sibling id='1' value='21'/>
          <sibling id='2' value='31'/>
          <sibling id='3' value='41'/>
        </distances>
      </cell>
      <cell id='1' cpus='1' memory='2097152' unit='KiB'>
        <distances>
          <sibling id='0' value='21'/>
          <sibling id='1' value='10'/>
          <sibling id='2' value='31'/>
          <sibling id='3' value='41'/>
        </distances>
      </cell>
      <cell id='2' cpus='2' memory='2097152' unit='KiB'>
        <distances>
          <sibling id='0' value='31'/>
          <sibling id='1' value='21'/>
          <sibling id='2' value='10'/>
          <sibling id='3' value='21'/>
        </distances>
      <cell id='3' cpus='3' memory='2097152' unit='KiB'>
        <distances>
          <sibling id='0' value='41'/>
          <sibling id='1' value='31'/>
          <sibling id='2' value='21'/>
          <sibling id='3' value='10'/>
        </distances>
      </cell>
    </numa>
  </cpu>

Whenever a sibling id the cell LOCAL_DISTANCE does apply and for any
sibling id not being covered a default of REMOTE_DISTANCE is used
for internal computations.

Signed-off-by: Wim ten Have <wim.ten.have at oracle.com>
---
Changes on v1:
- Add changes to docs/formatdomain.html.in describing schema update.
Changes on v2:
- Automatically apply distance symmetry maintaining cell <-> sibling.
- Check for maximum '255' on numaDistanceValue.
- Automatically complete empty distance ranges.
- Check that sibling_id's are in range with cell identifiers.
- Allow non-contiguous ranges, starting from any node id.
- Respect parameters as ATTRIBUTE_NONNULL fix functions and callers.
- Add and apply topology for LOCAL_DISTANCE=10 and REMOTE_DISTANCE=20.
Changes on v3
- Add UNREACHABLE if one locality is unreachable from another.
- Add code cleanup aligning function naming in a separated patch.
- Add numa related driver code in a separated patch.
- Remove <choice> from numaDistanceValue schema/basictypes.rng
- Correct doc changes.
---
 docs/formatdomain.html.in   |  63 +++++++++++++-
 docs/schemas/basictypes.rng |   7 ++
 docs/schemas/cputypes.rng   |  18 ++++
 src/conf/numa_conf.c        | 200 +++++++++++++++++++++++++++++++++++++++++++-
 4 files changed, 284 insertions(+), 4 deletions(-)

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8ca7637..c453d44 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -1529,7 +1529,68 @@
     </p>
 
     <p>
-      This guest NUMA specification is currently available only for QEMU/KVM.
+      This guest NUMA specification is currently available only for
+      QEMU/KVM and Xen.  Whereas Xen driver also allows for a distinct
+      description of NUMA arranged <code>sibling</code> <code>cell</code>
+      <code>distances</code> <span class="since">Since 3.6.0</span>.
+    </p>
+
+    <p>
+      Under NUMA h/w architecture, distinct resources such as memory
+      create a designated distance between <code>cell</code> and
+      <code>siblings</code> that now can be described with the help of
+      <code>distances</code>. A detailed description can be found within
+      the ACPI (Advanced Configuration and Power Interface Specification)
+      within the chapter explaining the system's SLIT (System Locality
+      Distance Information Table).
+    </p>
+
+<pre>
+...
+<cpu>
+  ...
+  <numa>
+    <cell id='0' cpus='0,4-7' memory='512000' unit='KiB'>
+      <distances>
+        <sibling id='0' value='10'/>
+        <sibling id='1' value='21'/>
+        <sibling id='2' value='31'/>
+        <sibling id='3' value='41'/>
+      </distances>
+    </cell>
+    <cell id='1' cpus='1,8-10,12-15' memory='512000' unit='KiB' memAccess='shared'>
+      <distances>
+        <sibling id='0' value='21'/>
+        <sibling id='1' value='10'/>
+        <sibling id='2' value='21'/>
+        <sibling id='3' value='31'/>
+      </distances>
+    </cell>
+    <cell id='2' cpus='2,11' memory='512000' unit='KiB' memAccess='shared'>
+      <distances>
+        <sibling id='0' value='31'/>
+        <sibling id='1' value='21'/>
+        <sibling id='2' value='10'/>
+        <sibling id='3' value='21'/>
+      </distances>
+    </cell>
+    <cell id='3' cpus='3' memory='512000' unit='KiB'>
+      <distances>
+        <sibling id='0' value='41'/>
+        <sibling id='1' value='31'/>
+        <sibling id='2' value='21'/>
+        <sibling id='3' value='10'/>
+      </distances>
+    </cell>
+  </numa>
+  ...
+</cpu>
+...</pre>
+
+    <p>
+      Under Xen driver, if no <code>distances</code> are given to describe
+      the SLIT data between different cells, it will default to a scheme
+      using 10 for local and 20 for remote distances.
     </p>
 
     <h3><a id="elementsEvents">Events configuration</a></h3>
diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng
index 1ea667c..1a18cd3 100644
--- a/docs/schemas/basictypes.rng
+++ b/docs/schemas/basictypes.rng
@@ -77,6 +77,13 @@
     </choice>
   </define>
 
+  <define name="numaDistanceValue">
+    <data type="unsignedInt">
+      <param name="minInclusive">10</param>
+      <param name="maxInclusive">255</param>
+    </data>
+  </define>
+
   <define name="pciaddress">
     <optional>
       <attribute name="domain">
diff --git a/docs/schemas/cputypes.rng b/docs/schemas/cputypes.rng
index 3eef16a..c45b6df 100644
--- a/docs/schemas/cputypes.rng
+++ b/docs/schemas/cputypes.rng
@@ -129,6 +129,24 @@
           </choice>
         </attribute>
       </optional>
+      <optional>
+        <element name="distances">
+          <oneOrMore>
+            <ref name="numaDistance"/>
+          </oneOrMore>
+        </element>
+      </optional>
+    </element>
+  </define>
+
+  <define name="numaDistance">
+    <element name="sibling">
+      <attribute name="id">
+        <ref name="unsignedInt"/>
+      </attribute>
+      <attribute name="value">
+        <ref name="numaDistanceValue"/>
+      </attribute>
     </element>
   </define>
 
diff --git a/src/conf/numa_conf.c b/src/conf/numa_conf.c
index b71dc01..5db4311 100644
--- a/src/conf/numa_conf.c
+++ b/src/conf/numa_conf.c
@@ -29,6 +29,15 @@
 #include "virnuma.h"
 #include "virstring.h"
 
+/*
+ * Distance definitions defined Conform ACPI 2.0 SLIT.
+ * See include/linux/topology.h
+ */
+#define LOCAL_DISTANCE          10
+#define REMOTE_DISTANCE         20
+/* SLIT entry value is a one-byte unsigned integer. */
+#define UNREACHABLE            255
+
 #define VIR_FROM_THIS VIR_FROM_DOMAIN
 
 VIR_ENUM_IMPL(virDomainNumatuneMemMode,
@@ -48,6 +57,8 @@ VIR_ENUM_IMPL(virDomainMemoryAccess, VIR_DOMAIN_MEMORY_ACCESS_LAST,
               "shared",
               "private")
 
+typedef struct _virDomainNumaDistance virDomainNumaDistance;
+typedef virDomainNumaDistance *virDomainNumaDistancePtr;
 
 typedef struct _virDomainNumaNode virDomainNumaNode;
 typedef virDomainNumaNode *virDomainNumaNodePtr;
@@ -66,6 +77,12 @@ struct _virDomainNuma {
         virBitmapPtr nodeset;   /* host memory nodes where this guest node resides */
         virDomainNumatuneMemMode mode;  /* memory mode selection */
         virDomainMemoryAccess memAccess; /* shared memory access configuration */
+
+        struct _virDomainNumaDistance {
+            unsigned int value; /* locality value for node i->j or j->i */
+            unsigned int cellid;
+        } *distances;           /* remote node distances */
+        size_t ndistances;
     } *mem_nodes;           /* guest node configuration */
     size_t nmem_nodes;
 
@@ -686,6 +703,153 @@ virDomainNumatuneNodesetIsAvailable(virDomainNumaPtr numatune,
 }
 
 
+static int
+virDomainNumaDefNodeDistanceParseXML(virDomainNumaPtr def,
+                                     xmlXPathContextPtr ctxt,
+                                     unsigned int cur_cell)
+{
+    int ret = -1;
+    int sibling;
+    char *tmp = NULL;
+    xmlNodePtr *nodes = NULL;
+    size_t i, ndistances = def->nmem_nodes;
+
+    if (!ndistances)
+        return 0;
+
+    /* check if NUMA distances definition is present */
+    if (!virXPathNode("./distances[1]", ctxt))
+        return 0;
+
+    if ((sibling = virXPathNodeSet("./distances[1]/sibling", ctxt, &nodes)) <= 0) {
+        virReportError(VIR_ERR_XML_ERROR, "%s",
+                       _("NUMA distances defined without siblings"));
+        goto cleanup;
+    }
+
+    for (i = 0; i < sibling; i++) {
+        virDomainNumaDistancePtr ldist, rdist;
+        unsigned int sibling_id, sibling_value;
+
+        /* siblings are in order of parsing or explicitly numbered */
+        if (!(tmp = virXMLPropString(nodes[i], "id"))) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("Missing 'id' attribute in NUMA "
+                             "distances under 'cell id %d'"),
+                           cur_cell);
+            goto cleanup;
+        }
+
+        /* The "id" needs to be applicable */
+        if (virStrToLong_uip(tmp, NULL, 10, &sibling_id) < 0) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("Invalid 'id' attribute in NUMA "
+                             "distances for sibling: '%s'"),
+                           tmp);
+            goto cleanup;
+        }
+        VIR_FREE(tmp);
+
+        /* The "id" needs to be within numa/cell range */
+        if (sibling_id >= ndistances) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("There is no cell administrated matching "
+                             "'sibling_id %d' under NUMA 'cell id %d' "),
+                           sibling_id, cur_cell);
+            goto cleanup;
+        }
+
+        /* We need a locality value. Check and correct
+         * distance to local and distance to remote node.
+         */
+        if (!(tmp = virXMLPropString(nodes[i], "value"))) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("Missing 'value' attribute in NUMA distances "
+                             "under 'cell id %d' for 'sibling id %d'"),
+                           cur_cell, sibling_id);
+            goto cleanup;
+        }
+
+        /* The "value" needs to be applicable */
+        if (virStrToLong_uip(tmp, NULL, 10, &sibling_value) < 0) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("Invalid 'value' attribute in NUMA "
+                             "distances for value: '%s'"),
+                           tmp);
+            goto cleanup;
+        }
+        VIR_FREE(tmp);
+
+        /* LOCAL_DISTANCE <= "value" <= UNREACHABLE */
+        if (sibling_value < LOCAL_DISTANCE ||
+            sibling_value > UNREACHABLE) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("Out of range value '%d' set for "
+                             "'sibling id %d' under NUMA 'cell id %d' "),
+                           sibling_value, sibling_id, cur_cell);
+            goto cleanup;
+        }
+
+        ldist = def->mem_nodes[cur_cell].distances;
+        if (!ldist) {
+            if (def->mem_nodes[cur_cell].ndistances) {
+                virReportError(VIR_ERR_XML_ERROR,
+                               _("Invalid 'ndistances' set in NUMA "
+                                 "distances for sibling id: '%d'"),
+                               cur_cell);
+                goto cleanup;
+            }
+
+            if (VIR_ALLOC_N(ldist, ndistances) < 0)
+                goto cleanup;
+
+            if (!ldist[cur_cell].value)
+                ldist[cur_cell].value = LOCAL_DISTANCE;
+            ldist[cur_cell].cellid = cur_cell;
+            def->mem_nodes[cur_cell].ndistances = ndistances;
+        }
+
+        ldist[sibling_id].cellid = sibling_id;
+        ldist[sibling_id].value = sibling_value;
+        def->mem_nodes[cur_cell].distances = ldist;
+
+        rdist = def->mem_nodes[sibling_id].distances;
+        if (!rdist) {
+            if (def->mem_nodes[sibling_id].ndistances) {
+                virReportError(VIR_ERR_XML_ERROR,
+                               _("Invalid 'ndistances' set in NUMA "
+                                 "distances for sibling id: '%d'"),
+                               sibling_id);
+                goto cleanup;
+            }
+
+            if (VIR_ALLOC_N(rdist, ndistances) < 0)
+                goto cleanup;
+
+            if (!rdist[sibling_id].value)
+                rdist[sibling_id].value = LOCAL_DISTANCE;
+            rdist[sibling_id].cellid = sibling_id;
+            def->mem_nodes[sibling_id].ndistances = ndistances;
+        }
+
+        rdist[cur_cell].cellid = cur_cell;
+        rdist[cur_cell].value = sibling_value;
+        def->mem_nodes[sibling_id].distances = rdist;
+    }
+
+    ret = 0;
+
+ cleanup:
+    if (ret) {
+        for (i = 0; i < ndistances; i++)
+            VIR_FREE(def->mem_nodes[i].distances);
+    }
+    VIR_FREE(nodes);
+    VIR_FREE(tmp);
+
+    return ret;
+}
+
 int
 virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
                             xmlXPathContextPtr ctxt)
@@ -694,7 +858,7 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
     xmlNodePtr oldNode = ctxt->node;
     char *tmp = NULL;
     int n;
-    size_t i;
+    size_t i, j;
     int ret = -1;
 
     /* check if NUMA definition is present */
@@ -712,7 +876,6 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
     def->nmem_nodes = n;
 
     for (i = 0; i < n; i++) {
-        size_t j;
         int rc;
         unsigned int cur_cell = i;
 
@@ -788,6 +951,10 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
             def->mem_nodes[cur_cell].memAccess = rc;
             VIR_FREE(tmp);
         }
+
+        /* Parse NUMA distances info */
+        if (virDomainNumaDefNodeDistanceParseXML(def, ctxt, cur_cell) < 0)
+                goto cleanup;
     }
 
     ret = 0;
@@ -815,6 +982,8 @@ virDomainNumaDefCPUFormatXML(virBufferPtr buf,
     virBufferAddLit(buf, "<numa>\n");
     virBufferAdjustIndent(buf, 2);
     for (i = 0; i < ncells; i++) {
+        int ndistances;
+
         memAccess = virDomainNumaGetNodeMemoryAccessMode(def, i);
 
         if (!(cpustr = virBitmapFormat(virDomainNumaGetNodeCpumask(def, i))))
@@ -829,7 +998,32 @@ virDomainNumaDefCPUFormatXML(virBufferPtr buf,
         if (memAccess)
             virBufferAsprintf(buf, " memAccess='%s'",
                               virDomainMemoryAccessTypeToString(memAccess));
-        virBufferAddLit(buf, "/>\n");
+
+        ndistances = def->mem_nodes[i].ndistances;
+        if (!ndistances) {
+            virBufferAddLit(buf, "/>\n");
+        } else {
+            size_t j;
+            virDomainNumaDistancePtr distances = def->mem_nodes[i].distances;
+
+            virBufferAddLit(buf, ">\n");
+            virBufferAdjustIndent(buf, 2);
+            virBufferAddLit(buf, "<distances>\n");
+            virBufferAdjustIndent(buf, 2);
+            for (j = 0; j < ndistances; j++) {
+                if (distances[j].value) {
+                    virBufferAddLit(buf, "<sibling");
+                    virBufferAsprintf(buf, " id='%d'", distances[j].cellid);
+                    virBufferAsprintf(buf, " value='%d'", distances[j].value);
+                    virBufferAddLit(buf, "/>\n");
+                }
+            }
+            virBufferAdjustIndent(buf, -2);
+            virBufferAddLit(buf, "</distances>\n");
+            virBufferAdjustIndent(buf, -2);
+            virBufferAddLit(buf, "</cell>\n");
+        }
+
         VIR_FREE(cpustr);
     }
     virBufferAdjustIndent(buf, -2);
-- 
2.9.5




More information about the libvir-list mailing list