[libvirt] [PATCH 4/6] hostdev: Mark PCI devices as inactive as they're detached

Andrea Bolognani abologna at redhat.com
Fri Dec 18 15:54:27 UTC 2015


Very quick reply as I have a bus to Italy to catch in a few hours :)

On Fri, 2015-12-18 at 10:37 -0500, Laine Stump wrote:
> On 12/17/2015 10:23 AM, Andrea Bolognani wrote:
> > We want to eventually factor out the code dealing with device detaching
> > and reattaching, so that we can share it and make sure it's called eg.
> > when 'virsh nodedev-detach' is used.
>> > For that to happen, it's important that the lists of active and inactive
> > PCI devices are updated every time a device changes its state.
> 
> We need to know which devices from an iommu group were unbound from the 
> host driver by us, so that we'll only re-bind those that we had unbound 
> in the first place. So I can see the reasoning for wanting the inactive 
> list to always hold the complete list of devices that libvirt has 
> detached from the host driver, but that aren't at the moment in use by 
> any domain.

In general, we're doing an okay job at keeping track of active / inactive
devices, but the handling is all over the place which makes it kinda
difficult to split off chunks of code for reuse.

> In this case that you're fixing, though, it's only a temporary 
> inconsistency, since "Loop 6" of that same function will add the 
> detached devices to the inactive list.

I'm aware of that, but once the "detach devices" and "reattach devices"
parts have been split off for reuse we can no longer count on it :)

> However, when I trace down into the one use of the inactive list between 
> Loop 2 (where you're adding the devices in this patch) and Loop 6 (where 
> they were previously added), I've found that your patch may actually be 
> fixing a latent bug, but only in the case where someone is using the 
> legacy KVM device assignment - if virPCIDeviceReset() (which returns 
> with no action when vfio-pci is used) is unsuccessful at resetting the 
> device individually, it will try resetting the entire bus the device is 
> on using virPCIDeviceTrySecondaryBusReset(), but it can't do this unless 
> all the other devices on the bus are unused. This is determined by 
> calling virPCIDeviceBusContainsActiveDevices(), which iterates through 
> every PCI device on the host calling virPCIDeviceSharesBusWithActive(). 
> That function will return true if it finds a device on the same bus that 
> *isn't* on the inactive list. So if a device is on a bus with multiple 
> devices, those devices have all been assigned to the guest using legacy 
> KVM assignment, and the normal device reset functions don't work for 
> them, the current code would fail, while yours would succeed. *Highly* 
> unlikely (and we don't really care, since legacy KVM device assignment 
> is, well, legacy; deprecated; soon to go away; "pining for the fjords" 
> as it were :-)

I see why that would be a problem. I believe that, in general, the more
explicit we are about marking devices as active or inactive, the less
likely it will be to end up in such a situation.

> (sorry for the digression. I spent so much time investigating that it 
> didn't feel right to just conclude the search by saying "nothing here. 
> Move on!)

Doesn't look like something you need to apologize for :)

> Anyway I see no problem caused by this patch, so ACK.
> 
> I guess you also intend to begin storing the inactive device list 
> somewhere so that it can be reread if libvirtd is restarted?

I haven't really tackled the problem yet, but the more I think about
it the more I believe we need to.

We could of course figure out at least a good part of the information
we need by looking at the system state, but we wouldn't be able to eg.
tell whether a device that's bound to vfio-pci was bound to a host
driver before being detached.

Cheers.

-- 
Andrea Bolognani
Software Engineer - Virtualization Team




More information about the libvir-list mailing list