[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [BUG] When PLUG a bridge interface to an active VM, the generated LIVE and CONFIG mac address are different



On Fri, Aug 30, 2019 at 10:14:17AM +0200, Michal Privoznik wrote:
On 8/30/19 8:51 AM, wujing (O) wrote:
On 8/28/19 11:42 AM, wujing (O) wrote:
On 8/22/19 10:43 AM, Xu Yandong (Yandong Xu) wrote:
That possibly might be out of scope, but autofilling the mac
address as early as virDomainNetDefParseXML also is not ideal.

We have pushed a patch bellow that can restore the situation to an
older
state.

Subject: [PATCH] qemu: use the same def when attaching device live
and config

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index
617d7d5..eca54d0 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -8615,6 +8615,22 @@
qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm,
        if (!(caps = virQEMUDriverGetCapabilities(driver, false)))
            goto cleanup;

+    if (flags & VIR_DOMAIN_AFFECT_CONFIG &&
+        flags & VIR_DOMAIN_AFFECT_LIVE) {
+        /* If we are affecting both CONFIG and LIVE
+         * use the same xml of device preferentially
+         * to make the configuration consistent.
+         */
+        devLive = virDomainDeviceDefParse(xml, vm->def,

This use of vm->def prefers live XML. This means that for instance
PCI address are assigned based on current PCI layout in live XML
which in general is different to inactive XML.

And it's not only PCI addresses, we autogenerate some out aspects of
<interface/> (e.g. model) and even more for other devices. Just take
a look at qemuDomainDeviceDefPostParse(). All functions there which
take domain def as an argument do so because they are basing
autogenerated value on domain definition. Plus there's more in parser
code.

While your patch might work for your use case, it can break others.

BTW: have you read the original commit that caused this? I'm failing
to see how we would not re-introduce the problem with your patch. If
you're using live XML to validate/generate device addresses, how can
we generate valid address for inactive XML?


This patch is a little different with the original commit, since we
only prefer live XML when using LIVE AND CONFIG flags at the same time.
Still parse separate device def if attach with only one flag.
So this patch is kinda a combination of 1e0534a7702 and 55ce6564634, just
to narrow down the problem influence.

And I verified as the steps in
https://bugzilla.redhat.com/show_bug.cgi?id=1559867, work as expected.

Well, then try to coldplug <hostdev mode='subsystem' type='scsi'/> (that is
only to inactive XML), and then try hotplug different <hostdev type'scsi'/> to
both live and inactive XMLs and the bug will reproduce.
Problem is, that inactive XML contains a device which is not contained in live
XML and thus when generating device address based solely on live XML a
conflicting address is generated.

If we'd prefer inactive XML in your patch then the steps in my reproducer
need to be swapped. At any rate, the issue is still there.

You are right, but I found a new issue on current solution.
Test as steps below.
1. Having a running vm
## virsh list
  Id    Name   State
----------------------                                                  │~
  1    rhel   running

2. prepare 2 scsi hostdev xml using the same address
## cat h1.xml
<hostdev mode='subsystem' type='scsi' managed='no'>
   <source>
     <adapter name='scsi_host2'/>
     <address bus='0' target='0' unit='0'/>
   </source>
   <address type='drive' controller='0' bus='0' target='0' unit='2'/>
</hostdev>

## cat h2.xml
<hostdev mode='subsystem' type='scsi' managed='no'>
   <source>
     <adapter name='scsi_host3'/>
     <address bus='0' target='0' unit='0'/>
   </source>
   <address type='drive' controller='0' bus='0' target='0' unit='2'/>
</hostdev>

3. attach 2 hostdev to vm with 'attach-device --config'
## virsh attach-device rhel h1.xml --config
Device attached successfully

## virsh attach-device rhel h2.xml --config
Device attached successfully

4. Check the vm's xml with 'dumpxml --inactive'
## virsh dumpxml rhel --inactive
...
     <hostdev mode='subsystem' type='scsi' managed='no'>
       <source>
         <adapter name='scsi_host2'/>
         <address bus='0' target='0' unit='0'/>
       </source>
       <address type='drive' controller='0' bus='0' target='0' unit='2'/>
     </hostdev>
     <hostdev mode='subsystem' type='scsi' managed='no'>
       <source>
         <adapter name='scsi_host3'/>
         <address bus='0' target='0' unit='0'/>
       </source>
       <address type='drive' controller='0' bus='0' target='0' unit='2'/>
     </hostdev>
...

a) Looks like libvirt can not verify if user configured the same address for different scsi hostdevs.
Because virDomainHostdevDefPostParse only calls virDomainDriveAddressIsUsedByDisk to check
if address is used by other disks, but no check if used by other hostdevs.
And qemuDomainAttachDeviceConfig calls virDomainHostdevFind to check duplicate devices by
src addr and other stuff, not including addr in vm.

Should that be fixed?

Oh yeah, this is a long known issue:

https://bugzilla.redhat.com/show_bug.cgi?id=1729022

and I want to fix it, but I always get side tracked :-( So if somebody
beats me to it, I wouldn't mind ;-)


b) If we fixed a), then we can get a devConf that both fits to live and config xml by
passing devLive  and domainPersistentDef to virDomainDeviceDefCopy.

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 617d7d5..eca54d0 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -8615,6 +8615,22 @@ qemuDomainAttachDeviceLiveAndConfig(virDomainObjPtr vm,
      if (!(caps = virQEMUDriverGetCapabilities(driver, false)))
          goto cleanup;

+    if (flags & VIR_DOMAIN_AFFECT_CONFIG &&
+        flags & VIR_DOMAIN_AFFECT_LIVE) {
+        devLive = virDomainDeviceDefParse(xml, vm->def,
+                                          caps, driver->xmlopt,
+                                          parse_flags);
+        if (!devLive)
+            goto cleanup;
+        vmdef = virDomainObjCopyPersistentDef(vm, caps, driver->xmlopt);
+        if (!vmdef)
+            goto cleanup;
+        /* Use persistent vm def to get a device def copy
+         */
+        devConf = virDomainDeviceDefCopy(devLive, vmdef, caps, driver->xmlopt);
+        if (!devConf)
+            goto cleanup;
+    }

If devLive fits to vmdef, then we get a compatible devConf, or just fail and return.

Yeah, this is exactly the approach I thought of yesterday, but did not
get to write it yet. Although, since this prefers live XML then we may
get into a corner case where @devLive doesn't fit into inactive XML
(e.g. for @devLive an address was generated that's taken in inactive
XML). If we'd preffer inactive XML then we might get into the same
trouble but from the opposite direction. So what we should do, is to:

1) parse device XML using live XML, and create copy using inactive XML
2) if that fails, parse device XML using inactive XML and create copy
   using live XML
3) if that fails too, then probably error out because we can't guarantee
   that device is the same in both XMLs.

Does this sound reasonable to you?

@Martin: When we spoke about this in person you were not a big fan of
3), you suggested to not error out but keep current behaviour, i.e.
parse device using live XML and plug it to live XML and parse device
second time using inactive XML and plut it into inactive XML (which is
guaranteed to plug two different devices, but also poses smaller problem
onto user).

Question is, whether in that case we shouldn't force user to call the
API two times (once with flags = CONFIG and the second time with flags =
LIVE) as it gives the same result and it's more obvious.


I don't mind, I was just thinking with my "back-compat corner case cap" on and
didn't want to cause yet another https://xkcd.com/1172/ (although not really,
because someone might want that behaviour), but I'm fine with both approaches.

Michal

Attachment: signature.asc
Description: PGP signature


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]