[libvirt] [PATCH v4 05/10] Split reprobe action from the virPCIUnbindFromStub into a new function

Alex Williamson alex.williamson at redhat.com
Fri Nov 20 18:00:27 UTC 2015


On Fri, 2015-11-20 at 12:24 -0500, Laine Stump wrote:
> On 11/20/2015 11:58 AM, Andrea Bolognani wrote:
> > On Fri, 2015-11-20 at 11:33 -0500, Laine Stump wrote:
> >> Seems safe, but is this really what we want to do? I haven't
> >> read/understood the remaining patches yet, but this makes it sound like
> >> what is going to happen is that all of the devices will be unbound from
> >> vfio-pci immediately, so they are "in limbo", and will then be reprobed
> >> once all devices are unused (and therefore unbound from vfio-pci).
> >>
> >> I think that may be a bit dangerous. Instead, we should leave the
> >> devices bound to vfio-pci until all of them are unused, and at that
> >> time, we should unbind them all from vfio-pci, then reprobe them all.
> >> (again, I may have misunderstood the direction, if so ignore this).
> > I agree, we should not unbind any device from vfio-pci until
> > all the devices in the IOMMU group have been detached from
> > the guest.
> 
> ... and I've just looked back at my original comment about this in the 
> BZ, and see that at that time I only suggested delaying the reprobe, but 
> said nothing about delaying the unbind. And I'm not as sure about the 
> necessity of waiting as I was 1/2 an hour ago. I suppose the issue is 
> that it brings all those unbound devices one step closer to getting 
> bound to the host driver. However, that will happen only if those 
> devices' PCI addresses are written to "drivers_reprobe" in sysfs (right? 
> is there any other way a more "global" reprobe could happen and snatch 
> up everything that's currently unbound?)

Any load of a module will snatch up any unclaimed devices that match it,
so if you unbind and leave the devices orphaned, a random module load
could cause much badness.  Adding a new_id will also cause a device
scan, so if that happened to match the device: random badness.
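
[Editor's sketch, not part of the original message.] One way to see which
devices are exposed to this is to look for PCI devices that currently have no
driver bound. The following is a minimal sketch, assuming the usual sysfs
layout where a bound device has a "driver" symlink under
/sys/bus/pci/devices/<address>; the function name and the sysfs_root parameter
are hypothetical, introduced only for illustration.

```python
import os


def unbound_pci_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return the sorted PCI addresses under sysfs_root with no driver bound.

    Assumption: a device counts as bound when its directory contains a
    'driver' entry (normally a symlink to the driver's sysfs directory).
    """
    if not os.path.isdir(sysfs_root):
        return []
    return sorted(
        addr
        for addr in os.listdir(sysfs_root)
        # lexists() also reports a symlink whose target cannot be resolved
        if not os.path.lexists(os.path.join(sysfs_root, addr, "driver"))
    )
```

Any address this returns is a candidate for being claimed by the next module
load or new_id write that happens to match it.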

> So maybe I'd better ask someone who knows more about this than me - 
> Alex, is there an issue with unbinding some devices in an iommu group 
> from vfio-pci at an earlier time, and leaving them unbound to any driver 
> at all while some other devices in the group are still in use by the 
> guest? Is there an advantage to keeping them all bound to vfio-pci until 
> none of them are used, and then unbinding/reprobing them all at the same 
> time? Or should we unbind each from vfio-pci immediately when they are 
> detached from the guest, and reprobe them all once they're all unbound?

Unbinding them from vfio-pci leaves them susceptible to random bad
things happening, as outlined above, and potentially limits vfio's ability
to do things like bus resets.  For instance, imagine a 2-port NIC where
each port is a PCI function, the functions are grouped together and the
devices don't support any sort of internal reset.  If both devices are
bound to vfio-pci, then the user owns them both and we can do a bus
reset.  If one of those devices gets released from the user, as soon as
it's unbound from vfio-pci it's no longer in our control and the bus
reset option is gone.
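
[Editor's sketch, not part of the original message.] The group constraint
discussed in this thread can be expressed as a small policy check: a device is
only released from vfio-pci once no device in its IOMMU group is still in use
by a guest. This is a minimal sketch under stated assumptions, not libvirt's
implementation; the function name and both arguments are hypothetical, and the
caller is assumed to already know the group membership and the in-use set.

```python
def devices_safe_to_unbind(groups, in_use):
    """Return sorted PCI addresses whose whole IOMMU group is idle.

    groups: dict mapping a group id to an iterable of PCI addresses
    in_use: set of PCI addresses still assigned to a guest
    """
    releasable = []
    for members in groups.values():
        members = list(members)
        # Keep the entire group bound while any member is still in use.
        if not any(dev in in_use for dev in members):
            releasable.extend(members)
    return sorted(releasable)
```

With the 2-port NIC example, both functions stay bound as long as either port
is assigned, and both become releasable together once neither is.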

The best course of action would be to leave any managed devices bound to
vfio-pci until all of the devices within the group are no longer in use.
Thanks,

Alex