[libvirt] [RFC] handling hostdev save/load net config for non SR-IOV devices

Daniel Henrique Barboza danielhb413 at gmail.com
Thu Jul 18 15:56:47 UTC 2019



On 7/18/19 12:29 PM, Laine Stump wrote:
> On 7/18/19 10:29 AM, Daniel Henrique Barboza wrote:
>> Hi,
>>
>> I have a PoC that enables partial coldplug assignment of multifunction
>> PCI devices with managed mode. At this moment, Libvirt can't handle
>> this scenario - the code will detach only the hostdevs from the XML,
>> when in fact the whole IOMMU group needs to be detached. This can be
>> verified by the fact that Libvirt handles the unmanaged scenario
>> well, as long as the user detaches the whole IOMMU group beforehand.
>>
>> I have played with 2 approaches. The one I am planning to contribute
>> back is a change inside virHostdevGetPCIHostDeviceList() that
>> adds the extra PCI devices for detach/re-attach in case a PCI
>> multifunction device in managed mode is present in the XML.
>
>
> If you're thinking of doing that automatically, then I should warn you 
> that we had discussed that a long time ago, and decided that it was a 
> bad idea to do it because it was likely someone would, e.g. try to 
> assign an audio device to their guest that happened to be one function 
> on a multifunction device that also contained a disk controller (or 
> some other device) that the host needed for proper operation.
>
>

Let's say that I have a multifunction PCI card with 4 functions, and I want a
guest to use only function 0 of that card. At this moment, I'm only able to do
that if I manually execute nodedev-detach on all 4 functions beforehand and use
function 0 as a hostdev with managed=false.
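
To make the manual workflow above concrete, here is a minimal sketch that only
builds the `virsh nodedev-detach` command lines for every function of a card; it
does not run them. The PCI address 0000:01:00.x and the helper names
`nodedev_name` and `detach_commands` are assumptions made for illustration. The
`pci_DDDD_BB_SS_F` form is the node device naming that `virsh nodedev-list`
reports for PCI devices.

```python
import re

# Matches a full PCI address such as "0000:01:00.0" (domain:bus:slot.function).
PCI_ADDR_RE = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def nodedev_name(pci_addr):
    """Convert '0000:01:00.0' into the node device name 'pci_0000_01_00_0'."""
    m = PCI_ADDR_RE.match(pci_addr)
    if m is None:
        raise ValueError(f"not a PCI address: {pci_addr!r}")
    return "pci_{}_{}_{}_{}".format(*(g.lower() for g in m.groups()))


def detach_commands(domain, bus, slot, functions):
    """Build one 'virsh nodedev-detach' argv list per function of the card."""
    cmds = []
    for fn in functions:
        addr = f"{domain:04x}:{bus:02x}:{slot:02x}.{fn}"
        cmds.append(["virsh", "nodedev-detach", nodedev_name(addr)])
    return cmds


if __name__ == "__main__":
    # Hypothetical 4-function card at 0000:01:00; print the commands only.
    for cmd in detach_commands(0x0000, 0x01, 0x00, range(4)):
        print(" ".join(cmd))
```

Each printed line could then be run by hand (or passed to `subprocess.run`)
before starting the guest with function 0 configured as an unmanaged hostdev.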

What I've implemented is a way of doing the detach/re-attach of the whole IOMMU
group for the user, if the hostdev is set with managed=true (and perhaps I
should also consider verifying the 'multifunction=yes' attribute as well, for
more clarity). I am not trying to assign all the devices in the IOMMU group to
the guest - not sure if that's what you were talking about up there, but I'm
happy to emphasize that's not the case.
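
As an illustration of what "the whole IOMMU group" means in practice, the sketch
below lists the devices that share a group with a given hostdev by reading the
kernel's sysfs layout (`/sys/bus/pci/devices/<addr>/iommu_group/devices/`). This
is not the PoC code, which changes virHostdevGetPCIHostDeviceList() in C; the
function names and the `sysfs_root` parameter are assumptions added here so the
logic can be exercised against a fake directory tree.

```python
import os


def iommu_group_members(pci_addr, sysfs_root="/sys"):
    """Return the sorted PCI addresses sharing an IOMMU group with pci_addr."""
    group_devices = os.path.join(
        sysfs_root, "bus", "pci", "devices", pci_addr, "iommu_group", "devices"
    )
    return sorted(os.listdir(group_devices))


def extra_devices_to_detach(hostdev_addrs, sysfs_root="/sys"):
    """Group members that are NOT listed as hostdevs but share their group.

    These are the devices that would have to be detached from the host
    (and re-attached later) without ever being assigned to the guest.
    """
    listed = set(hostdev_addrs)
    members = set()
    for addr in hostdev_addrs:
        members.update(iommu_group_members(addr, sysfs_root))
    return sorted(members - listed)
```

For a guest that lists only function 0 of a 4-function card whose functions all
sit in one group, `extra_devices_to_detach` would return functions 1 to 3, which
is exactly the set the administrator may not expect to lose on the host.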

Now, yes, if the user is unaware of the consequences of detaching all devices
of the IOMMU group from the host, bad things can happen. If that's what you're
saying, fair enough. I can make an argument about how we can't shield the user
from his/her own 'unawareness' forever, but in the end it's better to be on the
safe side.


> It may be that in *your* particular case, you understand that the 
> functions you don't want to assign to the guest are not otherwise 
> used, and it's not dangerous to suddenly detach them from their host 
> driver. But you can't assume that will always be the case.
>
>
> If you *really* can't accept just assigning all the devices in that 
> IOMMU group to the guest (thus making them all explicitly listed in 
> the config, and obvious to the administrator that they won't be 
> available on the host) and simply not using them, then you either need 
> to separately detach those particular functions from the host, or come 
> up with a way of having the domain config explicitly list them as 
> "detached from the host but not actually attached to the guest".
>

I can live with that - it will automate the detach/re-attach process, which is
my goal here, and it forces the user to know exactly what is going to be
detached from the host, minimizing errors. If no one is against adding an extra
parameter 'unassigned=true' to the hostdev in these cases, I can make this
happen.
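
A rough sketch of how such a flag could be consumed, purely for discussion: the
`unassigned` attribute does not exist in libvirt at the time of this message,
and both its name and its placement on the `<hostdev>` element are assumptions
taken from the proposal above. The rest of the XML follows the usual PCI
hostdev layout. The helper names are hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain fragment: function 0 goes to the guest, function 1 is
# only detached from the host. 'unassigned' is the PROPOSED attribute.
DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes' unassigned='true'>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""


def pci_address(hostdev):
    """Format a hostdev's <source><address/> as 'dddd:bb:ss.f'."""
    a = hostdev.find("./source/address")
    return "{:04x}:{:02x}:{:02x}.{:x}".format(
        int(a.get("domain"), 16), int(a.get("bus"), 16),
        int(a.get("slot"), 16), int(a.get("function"), 16),
    )


def split_hostdevs(domain_xml):
    """Return (addresses given to the guest, addresses only detached)."""
    root = ET.fromstring(domain_xml)
    to_guest, detach_only = [], []
    for hd in root.findall("./devices/hostdev[@type='pci']"):
        target = detach_only if hd.get("unassigned") == "true" else to_guest
        target.append(pci_address(hd))
    return to_guest, detach_only
```

The point of the split is that every device taken away from the host appears
explicitly in the domain config, while only the first list is actually handed
to the guest.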


Thanks,


DHB



>
>
>> Now, there's a catch. Inside both virHostdevPreparePCIDevices()
>> and virHostdevReAttachPCIDevices() there is code to save/restore
>> the network configuration for SR-IOV devices. These functions iterate
>> over the hostdevs list, instead of the pcidevs list I'm changing. The final
>> result, given that the current conditions used for SR-IOV match the
>> conditions for multifunction PCI devices as well, is that not all virtual
>> functions will get their network configuration saved/restored.
>
>
> If you're not going to use a device (which is implied by the fact that 
> it's not in the hostdevs list) then nothing about its network config 
> will change, so there is no reason to save/restore it.
>
>
>>
>> For example, a guest that uses 3 of 4 functions of a PCI multifunction
>> card, let's say functions 0, 1 and 3. The code will handle the detach
>> of the whole IOMMU group, including function 2, which isn't declared in
>> the XML.
>
>
> Again, the above sentence implies that you're wanting to make this 
> completely automatic, which we previously decided was something we 
> didn't want to do.
>
>
>> However, since function 2 isn't a hostdev, its network config
>> will not be restored after the VM shutdown.
>
>
> You're talking about something that will never occur - on every SRIOV 
> network card I've ever seen each VF is in its own IOMMU group, and can 
> be assigned to a guest independent of what's done with any other VF. 
> I've never seen a case (except maybe once with a newly released 
> motherboard that had broken IOMMU firmware(?)) where a VF was in the 
> same IOMMU group as any other device.
>
>
>>
>> Now comes the question: how much effort should be spent on making
>> the network config of all the functions be restored? Is this a blocker
>> for the whole code to be accepted or, given it is properly documented
>> somewhere, can it be done later on?
>
