[PATCH] network: Fix a race condition when shutdown & start vm at the same time

Laine Stump laine at redhat.com
Tue Jun 16 23:46:48 UTC 2020


(BTW, this other patch is also trying to solve the same problem:


https://www.redhat.com/archives/libvir-list/2020-June/msg00525.html


I made comments there earlier, and have learned a bit more since then:


https://www.redhat.com/archives/libvir-list/2020-June/msg00634.html


On 6/15/20 2:36 PM, Daniel Henrique Barboza wrote:
>
>
> On 6/11/20 6:58 AM, Bingsong Si wrote:
>> when shutdown vm, the qemuProcessStop cleanup virtual interface in 
>> two steps:
>
> s/when/When
>
>> 1. qemuProcessKill kill qemu process, and vif disappeared
>> 2. ovs-vsctl del-port from the brige
>>
>> if start a vm in the middle of the two steps, the new vm will reused 
>> the vif,
>
> s/if/If
>
>> but removed from bridge by step 2
>>
>> Signed-off-by: Bingsong Si <owen.si at ucloud.cn>
>> ---
>>   src/qemu/qemu_process.c | 8 +++++---
>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>> index d36088ba98..706248815a 100644
>> --- a/src/qemu/qemu_process.c
>> +++ b/src/qemu/qemu_process.c
>> @@ -7483,9 +7483,11 @@ void qemuProcessStop(virQEMUDriverPtr driver,
>>               if (vport->virtPortType == 
>> VIR_NETDEV_VPORT_PROFILE_MIDONET) {
>> ignore_value(virNetDevMidonetUnbindPort(vport));
>>               } else if (vport->virtPortType == 
>> VIR_NETDEV_VPORT_PROFILE_OPENVSWITCH) {
>> -                ignore_value(virNetDevOpenvswitchRemovePort(
>> - virDomainNetGetActualBridgeName(net),
>> -                                 net->ifname));
>> +                virMacAddr mac;
>> +                if (virNetDevGetMAC(net->ifname, &mac) < 0 ||  
>> !virMacAddrCmp(&mac, &net->mac))


(Before anything else - virNetDevGetMAC() will actually *log* an error 
in libvirt's logs if the device isn't found (which will be in nearly 
100% of all cases). That will lead to people reporting it as a bug, 
which gets very time consuming and expensive for anyone providing 
commercial support for a product that uses libvirt. If it is really 
necessary to check the MAC address of a device that legitimately may or 
may not exist, then there will need to be a "Quiet" version of the 
function that doesn't log any errors.)(Update after thinking about it - 
I don't think we should be checking the MAC address anyway, as it 
doesn't reliably differentiate "new" tap from "old" tap).


>
> Extra space between "||" and "!virMacAddrCmp(.."
>
>
> With these nits fixed:
>
>
> Reviewed-by: Daniel Henrique Barboza <danielhb413 at gmail.com>



This patch is attempting to solve a race condition between one guest 
starting at the same time another is stopping, and the mess that results 
if the new guest uses the same name for its tap device as the old guest 
used. For example, lets say libvirt thread A / guest A is doing this:


A1) the QEMU process is terminated (and tap device, e.g. "vnet0", is 
implicitly deleted) either by libvirtd or by some external force 
(including possibly the QEMU process itself

A2) the tap's associated port (also "vnet0") is removed from the OVS 
switch by libvirt.


While libvirt thread B is doing this:

B1) a new tap device is created for a new QEMU process. If A1 has 
already happened, then the kernel will likely give the new tap the same 
name - "vnet0".

B2) the new tap device is attached to an OVS switch (or possibly a Linux 
host bridge).


The problem occurs when if B2 happens before A2, which could result in 
B2 attaching the new tap to the OVS switch, and then A2 disconnecting it 
from the switch. So libvirt thinks the new QEMU guest tap is attached to 
the switch, but it isn't.


This patch attempts to eliminate the race by checking, prior to removing 
"old tap"s port on the switch, that 1) the tap device doesn't exist, or 
that if it does 2) that the MAC address of the tap device is unchanged.


Assuming that the two guests would not use the same MAC address for 
their tap devices (which is probably the case, but isn't *required* to 
be true), this does significantly narrow the potential time  for a race 
condition, and in particular makes sure that we never remove a port that 
hasn't just been "re-added" by the new QEMU.


However, this just creates a smaller window for the race, and different 
problem for the remainder of the time.

1) smaller window - it would still be possible for the following to happen:


    a) old qemu terminates, tap device "vnet0" is deleted

    b) libvirt checks MAC address and learns that there is no device 
"vnet0",

       so it calls virNetDevOpenvswitchRemovePort(), but before ovs-vsctl

       can be called...

    c) libvirt creates a new tap device for new QEMU, kernel names it 
"vnet0"

    d) libvirt calls virNetDevOpenvswitchAddPort() and the new tap

       device "vnet0" to the switch

    e) the ovs-vsctl from (b) is finally able to run, and removes "vnet0"

       from the switch.


Granted, this is *highly* unlikely, since there is nothing extra between 
checking MAC address and removing the port, but there is nothing 
enforcing it.

2) New Problem - I think the testing here was done with two guests who 
both attached their tap to the same (or another) OVS switch. It's 
relying on the ability to attach the tap device to a new switch/bridge 
even if it is already attached to some other switch bridge. Normally 
ovs-vsctl would refuse to add a port to a switch if a port by that same 
name was already on any OVS switch. I just looked it up though, and in 
this case libvirt is able to make this work by including "--if-exists 
del-port $ifname" in the ovs-vsctl command that *adds* the new port. 
However, if you have the same situation but the new switch device is 
instead a Linux bridge, the attempt to attach to the bridge fails.


So, if thread "B" has already created the new tap device by the time 
thread "A" is deciding whether or not to remove the old port, the port 
won't be removed, and if the new guest "B" is using a Linux host bridge, 
libvirt will fail to attach the new tap to the bridge.


So, a summary of the problems with this patch:


1) The race window is reduced (and may be gone in practical terms), but 
not guaranteed to be eliminated.


2) For the time during the "previous race window start" and "new race 
window start" a new problem has been created - if the new guest uses a 
Linux host bridge, the connection will fail


3) virNetDevGetMAC() will put an error in the system logs if the device 
doesn't exist (and it almost always will *not* exist, so this will be 
significant


4) Although it is almost always the case that two guests will not use 
the same MAC address for their network interfaces, there is nothing 
preventing it - we shouldn't assume that MAC addresses are unique. I 
think that check is actually superfluous, since the qemu process has 
always been terminated by the time we get to that place in the code, so 
the tap device should have been auto-deleted.


What do I think should be done? Good question. Possibly we could:


A) Call virNetDevTapReattachBridge() rather than 
virNetDevTapAttachBridge() in virNetDevTapCreateInBridgePort(). This 
would eliminate problem (2).


B) Instead of checking if the tap device MAC address matches, just call 
virNetDevExists() - if it exists, then skip the RemovePort() - this 
eliminates problems (3) and (4). (NB - this would fail if it turns out 
that tap device deletion isn't completed synchronously with qemu process 
termination!)


C) If we want to make it 100% sound, we need to make "check for 
interface existence + removeport" an atomic operation, and mutually 
exclusive with virNetDevTapCreate(). This would eliminate problem (1)



>
>> + ignore_value(virNetDevOpenvswitchRemovePort(
>> + virDomainNetGetActualBridgeName(net),
>> +                                     net->ifname));
>>               }
>>           }
>>
>




More information about the libvir-list mailing list