[libvirt] [PATCH v2] qemu: Do not reattach PCI device used by other domain when shutdown

Osier Yang jyang at redhat.com
Mon Oct 17 10:44:26 UTC 2011


于 2011年10月17日 09:40, Osier Yang 写道:
> 于 2011年10月15日 02:53, Eric Blake 写道:
>> On 10/12/2011 10:05 PM, Osier Yang wrote:
>>> When failing on starting a domain, it tries to reattach all the PCI
>>> devices defined in the domain conf, regardless of whether the devices
>>> are still used by other domain. This will cause the devices to be 
>>> deleted
>>> from the list qemu_driver->activePciHostdevs, thus the devices will be
>>> thought as usable even if it's not true. And following commands
>>> nodedev-{reattach,reset} will be successful.
>>>
>>> How to reproduce:
>>>    1) Define two domains with same PCI device defined in the confs.
>>>    2) # virsh start domain1
>>>    3) # virsh start domain2
>>>    4) # virsh nodedev-reattach $pci_device
>>
>> This time around, I actually spent time reproducing the bug scenario 
>> on hardware rather than just analyzing by inspection (it took me a 
>> while to figure out why the same machine that used to let me do pci 
>> passthrough was no longer working, until I realized that my hardware 
>> is old enough to be insecure, and I've upgraded software since my 
>> last passthrough test, so the CVE fix from 
>> https://bugzilla.redhat.com/show_bug.cgi?id=715555 has to be 
>> intentionally bypassed for me to get passthrough again).  With that 
>> testing, I proved that this patch prevents that problem.  But, as 
>> written, your patch caused the dreaded "An error occurred, but the 
>> cause is unknown".
>>
>>>
>>> You will see the device will be reattached to host successfully.
>>> As pciDeviceReattach just check if the device is still used by
>>> other domain via checking if the device is in list 
>>> driver->activePciHostdevs,
>>> however, the device is deleted from the list by step 2).
>>>
>>> This patch is to prohibit the bug by:
>>>    1) Prohibit a domain starting or device attachment right at
>>>       preparation period (qemuPrepareHostdevPCIDevices) if the
>>>       device is in list driver->activePciHostdevs, which means
>>>       it's used by other domain.
>>>
>>>    2) Introduces a new field for struct _pciDevice, (const char 
>>> *used_by),
>>>       it will be set as the domain name at preparation period,
>>>       (qemuPrepareHostdevPCIDevices). Thus we can prohibit deleting
>>>       the device from driver->activePciHostdevs if it's still used by
>>>       other domain when stopping the domain process.
>>>
>>
>>> @@ -111,7 +112,7 @@ int qemuPrepareHostdevPCIDevices(struct 
>>> qemud_driver *driver,
>>>       if (!(pcidevs = qemuGetPciHostDeviceList(hostdevs, nhostdevs)))
>>>           return -1;
>>>
>>> -    /* We have to use 4 loops here. *All* devices must
>>> +    /* We have to use 6 loops here. *All* devices must
>>>        * be detached before we reset any of them, because
>>>        * in some cases you have to reset the whole PCI,
>>>        * which impacts all devices on it. Also, all devices
>>
>> I added further loop labels.
>>
>>> +        /* The device is in use by other active domain if
>>> +         * the dev is in list driver->activePciHostdevs.
>>> +         */
>>> +        if (!pciDeviceIsAssignable(dev, !driver->relaxedACS) ||
>>> +            pciDeviceListFind(driver->activePciHostdevs, dev))
>>> +            goto cleanup;
>>> +    }
>>
>> Here's where the bad message comes in - you jump to cleanup without 
>> ever emitting an error message.  Not to mention now that we have the 
>> new used_by field, the message could actually be informative :)  With 
>> my changes below, I get the much nicer:
>>
>> Error starting domain: Requested operation is not valid: PCI device 
>> 0000:0a:0a.0 is in use by domain dom
>
> Oh, right, thanks for the fixing, :)
>
>>
>>> @@ -277,6 +298,18 @@ void qemuDomainReAttachHostdevDevices(struct 
>>> qemud_driver *driver,
>>>
>>>       for (i = 0; i<  pciDeviceListCount(pcidevs); i++) {
>>>           pciDevice *dev = pciDeviceListGet(pcidevs, i);
>>> +        pciDevice *activeDev = NULL;
>>> +        const char *used_by = NULL;
>>> +
>>> +        /* Never delete the dev from list driver->activePciHostdevs
>>> +         *  if it's used by other domain.
>>> +         */
>>> +        activeDev = pciDeviceListFind(driver->activePciHostdevs, dev);
>>> +        if (activeDev&&
>>> +            (used_by = pciDeviceGetUsedBy(activeDev))&&
>>> +            STRNEQ(name, used_by))
>>> +            continue;
>>
>>>
>>> used_by is kept as it might be NULL.
>>>
>>
>> In that case, use STRNEQ_NULLABLE.  But you are right - across 
>> libvirtd restarts, we lose track of which domain owns the device - so 
>> I did indeed reproduce a case of NULL.  Perhaps food for a later 
>> patch (when reloading domains, cycle through all hostdevs in use by 
>> active domains to repopulate the used_by fields).  But not a 
>> show-stopper to getting this bug fix in now.
>
> Yes, I didn't realize this, we need further patch.
>

After thinking a while, even we set the used_by during domains reloading,
the other attrs (dev->unbind_from_stub, dev->reprobe, dev->remove_slot)
of the PCI dev still will be lost. We can't known what values should be
for these attrs as generally they are initialized when do preparation,
however the preparations are already passed, and the devices are using
by the running domains. And thus the device won't be bound to the
original driver (if it had) correctly, or mistakenly bound to some driver
but it didn't have one.

Perhaps we need some new XMLs introduced for hostdevs.

Osier




More information about the libvir-list mailing list