[vfio-users] GPU crashes after a while with dmesg spam

Alex Williamson alex.williamson at redhat.com
Wed Jun 21 16:47:45 UTC 2017


On Wed, 21 Jun 2017 23:58:58 +0800
Aria <aria at ar1as.space> wrote:

> After a few minutes of gaming, once a significant event happens
> (someone dies in-game), the screen shuts off and claims there's no
> signal. My dmesg log is spammed with:
> [ 2806.613203] vfio_bar_restore: 0000:01:00.0 reset recovery - restoring bars
> [ 2808.169346] vfio_bar_restore: 0000:01:00.1 reset recovery - restoring bars
> 
> Running Arch Linux, kernel 4.11.6-1-ARCH, NVIDIA GTX 970.
> 
> Curiously, once this happens the output of lspci is:
> 
> 01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev ff) (prog-if ff)
> 	!!! Unknown header type 7f
> 	Kernel driver in use: vfio-pci
> 	Kernel modules: nouveau, nvidia_drm, nvidia
> 
> 01:00.1 Audio device: NVIDIA Corporation GM204 High Definition Audio Controller (rev ff) (prog-if ff)
> 	!!! Unknown header type 7f
> 	Kernel driver in use: vfio-pci
> 	Kernel modules: snd_hda_intel
> 

This means that the card no longer shows up in PCI config space (all
reads return -1, i.e. every bit set, which is why lspci reports rev ff,
prog-if ff and header type 7f).  That's potentially also why vfio
thinks the device was reset: suddenly the BARs don't contain what we
think they should, because reading them returns -1.  Seems like a
hardware issue.
Thanks,

Alex
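
The all-ones reads described above can be checked directly from sysfs,
without going through lspci.  What follows is only a minimal sketch,
not anything vfio itself does: the function names are made up for
illustration, and it assumes the standard Linux sysfs file
/sys/bus/pci/devices/<domain:bus:dev.fn>/config, whose first 64 bytes
are the standard config header.

```python
from pathlib import Path


def config_space_all_ones(config: bytes) -> bool:
    """True if the 64-byte standard header reads back as all 0xff.

    Hypothetical helper: a device that has dropped off the bus returns
    all ones (-1) for every config read.
    """
    header = config[:64]
    return len(header) == 64 and all(b == 0xFF for b in header)


def header_type(config: bytes) -> int:
    """Header type at offset 0x0e with the multi-function bit (0x80)
    masked off, so an all-ones read yields 0x7f."""
    return config[0x0E] & 0x7F


def read_config(bdf: str) -> bytes:
    """Read raw config space for a device such as '0000:01:00.0'."""
    return Path(f"/sys/bus/pci/devices/{bdf}/config").read_bytes()


def device_dropped_off_bus(bdf: str) -> bool:
    """Combine the helpers above for one device address."""
    return config_space_all_ones(read_config(bdf))
```

Calling device_dropped_off_bus("0000:01:00.0") and
device_dropped_off_bus("0000:01:00.1") once the screen has gone blank
would be expected to return True for both functions if the card has
really disappeared from the bus.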



