[vfio-users] Unbind Vfio Passthrough = general protection fault

Wed Sep 23 11:51:39 UTC 2015

I have found the culprit: it was the snd-hda-intel module. It shows up at the end of the module list in the kernel panic:
... usbhid hid drm video [last unloaded: snd_hda_intel]
With the module blacklisted, the VM boot and shutdown cycle succeeds 100% of the time. But I don't want to blacklist it because I need it (I alternate the NVIDIA card between a QEMU Windows VM and an LXC container with full VGA passthrough ;-) ). I use fuser /dev/snd/* to find the processes holding the sound devices, and I have trouble running rmmod because pulseaudio and mate-settings-daemon get in the way; even with autospawn disabled I still end up with a kernel panic. I have already configured PulseAudio to stop autodetecting all hardware and to take control of only the Intel graphics HDMI audio (load-module module-alsa-source device=hw:1,0), but there is still a conflict because the NVIDIA card uses the same module as the Intel graphics HDMI audio on my primary card.
Is there a way to split the snd-hda-intel module, or to force the use of another module, so that the NVIDIA HDMI audio is easy to bind and unbind? Perhaps with udev rules?

[  139.723267] snd_hda_intel 0000:00:03.0: enabling device (0000 -> 0002)
[  139.723400] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[  139.723427] snd_hda_intel 0000:00:1b.0: enabling device (0000 -> 0002)
[  139.723588] snd_hda_intel 0000:01:00.1: Disabling MSI
[  139.723596] snd_hda_intel 0000:01:00.1: Handle VGA-switcheroo audio client
[  578.212315] snd_hda_intel 0000:00:03.0: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
[  676.372114] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[  680.520332] snd_hda_intel 0000:00:03.0: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
[ 1074.094060] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[ 1078.221733] snd_hda_intel 0000:00:03.0: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
[ 1457.109006] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[ 1461.209577] snd_hda_intel 0000:00:03.0: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
[ 2125.074735] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
[ 2129.176143] snd_hda_intel 0000:00:03.0: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj
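
One per-device alternative to unloading the whole module is the generic sysfs unbind interface, which detaches a single PCI function from its driver while the module stays loaded for the other cards. The sketch below is an assumption, not something tested in this thread: it takes 0000:01:00.1 from the log above to be the NVIDIA HDMI audio function, the helper names and the SYSFS_ROOT variable are hypothetical, and whether this avoids the fault seen after rmmod is unverified.

```shell
#!/bin/sh
# Hypothetical helpers. SYSFS_ROOT can be pointed at a fake tree for a dry run.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the name of the driver currently bound to a PCI function, or nothing.
current_driver() {
    link="$SYSFS_ROOT/bus/pci/devices/$1/driver"
    [ -e "$link" ] && basename "$(readlink -f "$link")"
}

# Detach one PCI function from its driver, leaving the module loaded.
unbind_device() {
    drv=$(current_driver "$1")
    [ -n "$drv" ] || return 0   # nothing bound, nothing to do
    echo "$1" > "$SYSFS_ROOT/bus/pci/drivers/$drv/unbind"
}

# Example (as root), assuming 0000:01:00.1 is the NVIDIA audio function:
#   unbind_device 0000:01:00.1
```

Any process that still holds the card open (fuser /dev/snd/* shows them) would probably need to release it first, as with rmmod.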

On Tue, Sep 22, 2015 at 11:34 AM, Alex Williamson <alex.l.williamson at gmail.com> wrote:

On Tue, Sep 22, 2015 at 9:26 AM, <nahoxou at netcourrier.com> wrote:

Thanks for Withopf DevEject, a good tool that I now run as a scheduled shutdown task. I now have some improvement on the X server side:
On the first host boot I get 4 out of 5 successful VM cycles.
On the second host boot, the same 4 out of 5 result!
The 5th attempt fails in both cases.
Any other ideas, please?
Well, we're almost there...

What does the xserver have to do with anything?  If I just blacklist nouveau and let libvirt bind and unbind devices to vfio-pci around starting the VM, I don't have any problems.  Even if you were to attempt to unbind the devices from vfio-pci while they're in use by QEMU, the unbind should be blocked until QEMU releases the devices. 

Ok, I think I've seen it once, I'll see if I can make it happen again.  This shouldn't be happening, but... if your goal is only to avoid the oops, why not just leave the device bound to vfio-pci?  Generally we try to avoid host graphics drivers since they behave poorly when trying to unbind in-use devices.
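
For readers who want to try leaving the device bound to vfio-pci without libvirt doing it, a minimal sketch using the kernel's driver_override attribute follows. It is an assumption rather than the method used in this thread: the helper name and SYSFS_ROOT are hypothetical, 0000:01:00.0 is only assumed to be the GPU's video function (the log shows just 0000:01:00.1), and the vfio-pci module must already be loaded.

```shell
#!/bin/sh
# Hypothetical helper. SYSFS_ROOT can be pointed at a fake tree for a dry run.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Hand one PCI function to vfio-pci via driver_override.
bind_to_vfio() {
    dev_dir="$SYSFS_ROOT/bus/pci/devices/$1"
    # Tell the PCI core that only vfio-pci may claim this device.
    echo vfio-pci > "$dev_dir/driver_override"
    # Release the device from whatever driver holds it now, if any.
    if [ -e "$dev_dir/driver" ]; then
        echo "$1" > "$dev_dir/driver/unbind"
    fi
    # Trigger a re-probe so vfio-pci picks the device up.
    echo "$1" > "$SYSFS_ROOT/bus/pci/drivers_probe"
}

# Example (as root), with the addresses assumed as described above:
#   bind_to_vfio 0000:01:00.0
#   bind_to_vfio 0000:01:00.1
```

Run once at boot for both functions, this would keep the host audio and graphics drivers from touching the card between VM runs.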
