[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCHv2 1/3] nodeinfo: Add check and workaround to guarantee valid cpu topologies



On 11/12/12 23:31, Eric Blake wrote:
On 11/09/2012 02:58 AM, Peter Krempa wrote:
Lately there were a few reports of the output of the virsh nodeinfo
command being inaccurate. This patch tries to avoid that by checking if
the topology actually makes sense. If it doesn't we then report a
synthetic topology that indicates to the user that the host capabilities
should be checked for the actual topology.
---
Diff to v1:
- Re-formatted entries in the nodeinfo structure definition
   (model, memory, cpus, mhz, nodes)
- Changed description to entries in nodeinfo structure
   (sockets, cores, threads)
- fixed topology report for offline cores
- changed the output of faked data into nodeinfo->cores and tweaked tests:
tests/nodeinfodata/linux-x86-test7.expected:
CPUs: 24/24, MHz: 2200, Nodes: 1, Sockets: 1, Cores: 24, Threads: 1
tests/nodeinfodata/linux-x86-test8.expected:
CPUs: 64/64, MHz: 2593, Nodes: 1, Sockets: 1, Cores: 64, Threads: 1
(I'm not going to repost those)

Yeah, I agree with that decision on not reposting the tests :)

---
  include/libvirt/libvirt.h.in | 24 +++++++++++++-----------
  src/nodeinfo.c               | 37 +++++++++++++++++++++++++++++++------
  2 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index bf584a0..128fc25 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -531,17 +531,19 @@ typedef virTypedParameter *virTypedParameterPtr;
  typedef struct _virNodeInfo virNodeInfo;

  struct _virNodeInfo {
-    char model[32];     /* string indicating the CPU model */
-    unsigned long memory;/* memory size in kilobytes */
-    unsigned int cpus;  /* the number of active CPUs */
-    unsigned int mhz;   /* expected CPU frequency */
-    unsigned int nodes; /* the number of NUMA cell, 1 for unusual NUMA
-                           topologies or uniform memory access; check
-                           capabilities XML for the actual NUMA topology */
-    unsigned int sockets;/* number of CPU sockets per node if nodes > 1,
-                            total number of CPU sockets otherwise */
-    unsigned int cores; /* number of cores per socket */
-    unsigned int threads;/* number of threads per core */
+    char model[32];       /* string indicating the CPU model */
+    unsigned long memory; /* memory size in kilobytes */
+    unsigned int cpus;    /* the number of active CPUs */
+    unsigned int mhz;     /* expected CPU frequency */
+    unsigned int nodes;   /* the number of NUMA cell, 1 for unusual NUMA
+                             topologies or uniform memory access; check
+                             capabilities XML for the actual NUMA topology */
+    unsigned int sockets; /* number of CPU sockets per node if nodes > 1,
+                             1 in case of unusual NUMA topology */
+    unsigned int cores;   /* number of cores per socker, total number of

s/socker/socket/

+                             processors in case of unusual NUMA topology*/
+    unsigned int threads; /* number of threads per core, 1 in case of
+                             unusual numa topology */

Makes sense, and more importantly, preserves the API of computing max cpus.

      }

+    /* Now check if the topology makes sense. There are machines that don't
+     * expose their real number of nodes or for example the AMD Bulldozer
+     * architecture that exposes their Clustered integer core modules as both
+     * threads and cores. This approach throws off our detection. Unfortunately
+     * the nodeinfo structure isn't designed to carry the full topology so
+     * we're going to lie about the detected topology to notify the user
+     * to check the host capabilities for the actual topology. */
+    if ((nodeinfo->nodes *
+         nodeinfo->sockets *
+         nodeinfo->cores *
+         nodeinfo->threads) != (nodeinfo->cpus + offline)) {
+        nodeinfo->nodes = 1;
+        nodeinfo->sockets = 1;
+        nodeinfo->cores = nodeinfo->cpus + offline;
+        nodeinfo->threads = 1;
+    }

And the comment here is definitely helpful.

ACK.  [Finally]


Thanks for the review. I fixed the typo (and removed trailing whitespace in the cpuinfo file in 3/3) and pushed the series.

Peter


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]