I think I may have just figured out the issue. My modprobe option for
vfio-pci had the wrong device ID for the GPU's audio function: it should
be 1002:aac8, not 1002:aac0. Because of that, the audio device was not
bound to vfio-pci, so libvirt bound and unbound it automatically each
time the VM started or stopped. Apparently unbinding the GPU's audio
function after use does not work well. After fixing the option and
rebooting, the VM shut down cleanly after I started it and ran a
benchmark for a few minutes. So far so good; hopefully this whole
problem was user error.
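
For anyone who wants to check the same thing, here is a rough sketch of how one could see which driver a device ID is actually bound to by reading sysfs. The function name `bound_driver` and its optional second argument are made up for illustration; only the ID 1002:aac8 comes from my setup.

```shell
#!/bin/sh
# Hypothetical helper: print "<pci-address> <driver>" for every PCI device
# whose vendor:device ID matches the first argument.
# The optional second argument overrides the sysfs directory (handy for testing).
bound_driver() {
    id=$1
    root=${2:-/sys/bus/pci/devices}
    vendor=${id%%:*}    # part before the colon, e.g. 1002
    device=${id##*:}    # part after the colon, e.g. aac8
    for dev in "$root"/*; do
        [ -r "$dev/vendor" ] || continue
        # sysfs stores the IDs as 0x-prefixed hex strings
        [ "$(cat "$dev/vendor")" = "0x$vendor" ] || continue
        [ "$(cat "$dev/device")" = "0x$device" ] || continue
        if [ -L "$dev/driver" ]; then
            # "driver" is a symlink whose last path component is the driver name
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"
        fi
        echo "${dev##*/} $drv"
    done
}

# Example: bound_driver 1002:aac8
```

If the second column is not vfio-pci before the VM is started, the ID in the modprobe option is probably not matching the device. `lspci -nnk -d 1002:aac8` should show the same information under "Kernel driver in use".
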
Good to hear that. Once everything is working, can you compare the performance between 1 GB and 2 MB hugepages? If 1 GB is better (even slightly), I’d like to try again using 1 GB hugepages.