Re: [libvirt] [PATCH] qemu: Don't use -mem-prealloc among with .prealloc=yes



On Wed, Nov 07, 2018 at 10:47:01AM +0100, Michal Privoznik wrote:
On 11/07/2018 12:43 AM, John Ferlan wrote:


On 11/5/18 9:49 AM, Michal Privoznik wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1624223

There are two ways to request memory preallocation on cmd line:
-mem-prealloc and .prealloc attribute to memory-backend-file.

s/to/for a/ ?

However, as it turns out it's not safe to use both at the same
time. Prefer -mem-prealloc as it is more backward compatible
compared to switching to "-numa node,memdev=  + -object
memory-backend-file".
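
For illustration only, the rule from the commit message can be sketched outside of libvirt. This is a minimal sketch, not the patch itself: formatBackendObject() and its parameters are hypothetical names, and the object string is simplified to the id, mem-path and prealloc properties.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a libvirt function): formats a simplified
 * memory-backend-file object string. prealloc=yes is emitted only when
 * the backend wants preallocation AND the global -mem-prealloc switch
 * is not already on the command line.
 * Returns 0 on success, -1 if the buffer is too small. */
static int
formatBackendObject(char *buf, size_t buflen,
                    const char *id, const char *memPath,
                    bool wantPrealloc, bool globalPrealloc)
{
    bool prealloc = wantPrealloc && !globalPrealloc;
    int n;

    assert(buf && id && memPath);

    n = snprintf(buf, buflen,
                 "memory-backend-file,id=%s,mem-path=%s%s",
                 id, memPath, prealloc ? ",prealloc=yes" : "");
    if (n < 0 || (size_t)n >= buflen)
        return -1;

    return 0;
}
```

With globalPrealloc set, the object string carries no prealloc property at all, so the two mechanisms are never requested together.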


FWIW: Issue introduced by commit 1c4f3b56..

While I understand the reasoning, it's really too bad we couldn't "move"
the determination over which conflicting qualifier is used to earlier.
By the time we call the -numa backend we would already have had to make
the choice if I'm reading the ordering right.

Correct, you're reading it right.


But if it doesn't matter for the -numa object to use the -mem-prealloc,
then who am I to complain.  Of course, the "future thinking" me that is
living in the present issues surrounding machine and pc wonders whether
choosing this as the default will bite us down the road, in a future
where someone could deprecate -mem-prealloc because -numa will be so
prevalent.

If -mem-prealloc is deprecated then we would have to construct -object
memory-backend-file. I'm not against this, but IIRC this fails during
migration. I mean, if you have a guest that uses -mem-path you can't
migrate it to -object memory-backend-file because qemu would fail to
load the migration stream. That is why we have @needBackend in
qemuBuildNumaArgStr(), so that the new cmd line is built iff really needed.

This is the reason I went this way even though BZ suggests otherwise.
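
A rough sketch of the @needBackend idea, for readers who don't have qemuBuildNumaArgStr() in front of them. It relies on the return convention quoted in the patch's doc comment (0: backend object built, 1: no memory-backend-* needed, -1: error). The function name decideNeedBackend() and the assumption that a single cell returning 0 is enough to switch the command line are mine, not taken from the libvirt source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of the per-cell loop: @cellRc holds one result
 * per NUMA cell, using the convention from the doc comment:
 *    0  success, a memory-backend-* object is used
 *    1  success, no memory-backend-* object is needed
 *   -1  error
 * Assumption: one cell that needs a backend is enough to require the
 * new-style command line.
 * Returns 0 on success, -1 if any cell reported an error. */
static int
decideNeedBackend(const int *cellRc, size_t ncells, bool *needBackend)
{
    size_t i;

    assert(needBackend != NULL);
    *needBackend = false;

    for (i = 0; i < ncells; i++) {
        if (cellRc[i] < 0)
            return -1;
        if (cellRc[i] == 0)
            *needBackend = true;
    }

    return 0;
}
```

As long as every cell reports 1, the old-style command line stays in place and an already running guest remains migratable.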


Curious how others feel - I'm not against this choice, just trying to
supply an opposing/differing viewpoint. We really have to start coding
for the future and consider what deprecation could mean especially for
arguments that essentially mean the same thing.

Signed-off-by: Michal Privoznik <mprivozn redhat com>
---
 src/qemu/qemu_command.c                       | 37 +++++++++++++------
 src/qemu/qemu_command.h                       |  1 +
 src/qemu/qemu_domain.c                        |  2 +
 src/qemu/qemu_domain.h                        |  3 ++
 src/qemu/qemu_hotplug.c                       |  3 +-
 .../hugepages-numa-default-dimm.args          |  2 +-
 6 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index e338d3172e..0294030f0e 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -3123,6 +3123,7 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
  * @def: domain definition object
  * @mem: memory definition object
  * @autoNodeset: fallback nodeset in case of automatic NUMA placement
+ * @forbidPrealloc: don't set prealloc attribute

Slight bikeshed, but this changes the priv->memPrealloc to @forbidPrealloc
which is IMO a bit odd.

Okay, what name do you suggest? My reasoning for the name was that it
should make sense from the function POV. That's why calling the variable
'memPrealloc' did not make sense to me.


Beyond that, this becomes the 3rd @priv field to be passed along...
Maybe @priv should just be passed to access qemuCaps, autoNodeset, and
memPrealloc.

Ah sure.


  * @force: forcibly use one of the backends
  *
  * Creates a configuration object that represents memory backend of given guest
@@ -3136,6 +3137,9 @@ qemuBuildControllerDevCommandLine(virCommandPtr cmd,
  * Then, if one of the two memory-backend-* should be used, the @qemuCaps is
  * consulted to check if qemu does support it.
  *
+ * If @forbidPrealloc is true then 'prealloc' attribute of the backend is not
+ * set. This may come handy when global -mem-prealloc is already specified.
+ *
  * Returns: 0 on success,
  *          1 on success and if there's no need to use memory-backend-*
  *         -1 on error.
@@ -3148,6 +3152,7 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
                             virDomainDefPtr def,
                             virDomainMemoryDefPtr mem,
                             virBitmapPtr autoNodeset,
+                            bool forbidPrealloc,
                             bool force)
 {
     const char *backendType = "memory-backend-file";
@@ -3265,11 +3270,13 @@ qemuBuildMemoryBackendProps(virJSONValuePtr *backendProps,
         if (mem->nvdimmPath) {
             if (VIR_STRDUP(memPath, mem->nvdimmPath) < 0)
                 goto cleanup;
-            prealloc = true;
+            if (!forbidPrealloc)
+                prealloc = true;
         } else if (useHugepage) {
             if (qemuGetDomainHupageMemPath(def, cfg, pagesize, &memPath) < 0)
                 goto cleanup;
-            prealloc = true;
+            if (!forbidPrealloc)
+                prealloc = true;
         } else {
             /* We can have both pagesize and mem source. If that's the case,
              * prefer hugepages as those are more specific. */
@@ -3398,7 +3405,8 @@ qemuBuildMemoryCellBackendStr(virDomainDefPtr def,
     mem.info.alias = alias;

     if ((rc = qemuBuildMemoryBackendProps(&props, alias, cfg, priv->qemuCaps,
-                                          def, &mem, priv->autoNodeset, false)) < 0)
+                                          def, &mem, priv->autoNodeset,
+                                          priv->memPrealloc, false)) < 0)
         goto cleanup;

     if (virQEMUBuildObjectCommandlineFromJSON(buf, props) < 0)
@@ -3435,7 +3443,8 @@ qemuBuildMemoryDimmBackendStr(virBufferPtr buf,
         goto cleanup;

     if (qemuBuildMemoryBackendProps(&props, alias, cfg, priv->qemuCaps,
-                                    def, mem, priv->autoNodeset, true) < 0)
+                                    def, mem, priv->autoNodeset,
+                                    priv->memPrealloc, true) < 0)
         goto cleanup;

     if (virQEMUBuildObjectCommandlineFromJSON(buf, props) < 0)
@@ -7443,7 +7452,8 @@ qemuBuildSmpCommandLine(virCommandPtr cmd,
 static int
 qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
                     const virDomainDef *def,
-                    virCommandPtr cmd)
+                    virCommandPtr cmd,
+                    qemuDomainObjPrivatePtr priv)
 {
     const long system_page_size = virGetSystemPageSizeKB();
     char *mem_path = NULL;
@@ -7465,8 +7475,10 @@ qemuBuildMemPathStr(virQEMUDriverConfigPtr cfg,
         return 0;
     }

-    if (def->mem.allocation != VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE)
+    if (def->mem.allocation != VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) {
         virCommandAddArgList(cmd, "-mem-prealloc", NULL);
+        priv->memPrealloc = true;
+    }

     virCommandAddArgList(cmd, "-mem-path", mem_path, NULL);
     VIR_FREE(mem_path);
@@ -7479,7 +7491,8 @@ static int
 qemuBuildMemCommandLine(virCommandPtr cmd,
                         virQEMUDriverConfigPtr cfg,
                         const virDomainDef *def,
-                        virQEMUCapsPtr qemuCaps)
+                        virQEMUCapsPtr qemuCaps,
+                        qemuDomainObjPrivatePtr priv)
 {
     if (qemuDomainDefValidateMemoryHotplug(def, qemuCaps, NULL) < 0)
         return -1;
@@ -7498,15 +7511,17 @@ qemuBuildMemCommandLine(virCommandPtr cmd,
                               virDomainDefGetMemoryInitial(def) / 1024);
     }

-    if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE)
+    if (def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE) {
         virCommandAddArgList(cmd, "-mem-prealloc", NULL);
+        priv->memPrealloc = true;
+    }

I find it "confusing" that memPrealloc is set to true here when
"def->mem.allocation == VIR_DOMAIN_MEMORY_ALLOCATION_IMMEDIATE",
whereas in qemuBuildMemPathStr it's a != comparison.

I know it's existing, but strange.

This is so that -mem-prealloc is not added twice onto the cmd line. The
first addition is done here and the second is done possibly in
qemuBuildMemPathStr ..


Yeah, the memory code is... I have no words for that.  The problem is that
there are not many refactorings that would work in all the freaking corner
cases that we (unfortunately) decided to cover.

It's as Michal says, on this occasion the mempath should be added here
unconditionally, but the condition is there to make it not appear twice.  Also
it needs to be added there before the memory objects so setting a variable and
then acting upon that is not the right refactor (I hope I remember that
correctly).

If you put it in the caller, then you get to the point it's not the only one...
I don't remember how the second caller handles the prealloc, but I remember we
tried a bunch of refactors and there's not much to do until we deprecate some
QEMU versions.
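
To make the "==" versus "!=" point concrete, here is a toy model of the two call sites. It is only a sketch: ToyCmd, AllocMode, buildMem() and buildMemPath() are hypothetical stand-ins for virCommand, def->mem.allocation, qemuBuildMemCommandLine() and qemuBuildMemPathStr(), and the other conditions of the real functions are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 8

/* Hypothetical stand-ins for the def->mem.allocation values. */
typedef enum {
    ALLOC_DEFAULT,
    ALLOC_IMMEDIATE,
    ALLOC_ONDEMAND
} AllocMode;

/* Toy command line plus the flag the patch stores in @priv. */
typedef struct {
    const char *argv[MAX_ARGS];
    size_t argc;
    bool memPrealloc;
} ToyCmd;

static void
addArg(ToyCmd *cmd, const char *arg)
{
    assert(cmd->argc < MAX_ARGS);
    cmd->argv[cmd->argc++] = arg;
}

/* Shape of qemuBuildMemPathStr(): the "!=" branch. */
static void
buildMemPath(ToyCmd *cmd, AllocMode mode, const char *path)
{
    if (mode != ALLOC_IMMEDIATE) {
        addArg(cmd, "-mem-prealloc");
        cmd->memPrealloc = true;
    }
    addArg(cmd, "-mem-path");
    addArg(cmd, path);
}

/* Shape of qemuBuildMemCommandLine(): the "==" branch runs first, then
 * the mem-path helper is called for hugepages without NUMA nodes. */
static void
buildMem(ToyCmd *cmd, AllocMode mode, bool hugepagesNoNuma, const char *path)
{
    if (mode == ALLOC_IMMEDIATE) {
        addArg(cmd, "-mem-prealloc");
        cmd->memPrealloc = true;
    }
    if (hugepagesNoNuma)
        buildMemPath(cmd, mode, path);
}

/* Counts how many times @arg appears on the toy command line. */
static size_t
countArg(const ToyCmd *cmd, const char *arg)
{
    size_t i;
    size_t n = 0;

    for (i = 0; i < cmd->argc; i++) {
        if (strcmp(cmd->argv[i], arg) == 0)
            n++;
    }
    return n;
}
```

Because the two conditions are complementary, at most one of the two branches fires for any allocation mode, so -mem-prealloc can never show up twice.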


Again, I'm not against this, but would like to see if someone with more
numa experience chimes in (Martin?) and whether we need to think more in
terms of what deprecation could mean.

It would mean inability to migrate to newer libvirt.


Well, since we deprecated some QEMU versions (finally), we can just add a flag
in the migration cookie that will tell us whether the current libvirt version
prefers the .prealloc and just start using that for newly started VMs.
Migrating to an older version won't work, but that's not supported.  There may
be exceptions, of course, but anyone can handle those by backporting the
support for the flag if that's your (or your distro vendor's) need.

TL;DR: The pre-existing condition is actually fine, unfortunately.


John


     /*
      * Add '-mem-path' (and '-mem-prealloc') parameter here if
      * the hugepages and no numa node is specified.
      */
     if (!virDomainNumaGetNodeCount(def->numa) &&
-        qemuBuildMemPathStr(cfg, def, cmd) < 0)
+        qemuBuildMemPathStr(cfg, def, cmd, priv) < 0)

.. called here.

Michal

--
libvir-list mailing list
libvir-list redhat com
https://www.redhat.com/mailman/listinfo/libvir-list


