[libvirt] Virtqueue size exceeded error when resuming VM
Moshe Levi
moshele at mellanox.com
Mon Aug 8 11:49:57 UTC 2016
Hi,
A new security fix (see [1], [2] and [3]) was merged into QEMU.
After updating the packages, we started to get "qemu-system-x86_64: Virtqueue size exceeded" when resuming the guest.
Our environment is OpenStack master, and we have a Mellanox CI that tests SR-IOV functionality.
The hosts run Ubuntu 14.04 with QEMU 2.0.0+dfsg-2ubuntu1.26, which contains the fixes (see [2]):
ii qemu-kvm 2.0.0+dfsg-2ubuntu1.26 amd64 QEMU Full virtualization
ii qemu-system-x86 2.0.0+dfsg-2ubuntu1.26 amd64 QEMU full system emulation binaries (x86)
ii qemu-utils 2.0.0+dfsg-2ubuntu1.26 amd64 QEMU utilities
Our CI started to fail last week, when these security packages were released.
The scenario is as follows (sorry for the OpenStack commands :)):
1. nova boot guest
2. nova suspend guest
3. nova resume guest
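
For anyone who wants to script the three steps above, here is a minimal sketch in Python. The function name `reproduce_steps` and the flavor and image arguments are hypothetical placeholders, not taken from our CI. The sketch does not wait for the guest to reach the ACTIVE or SUSPENDED state between steps, which a real script would need to do.

```python
import subprocess


def reproduce_steps(server, flavor, image, run=subprocess.check_call):
    """Run boot -> suspend -> resume for one guest and return the commands used.

    `run` is injectable so the sequence can be dry-run without an OpenStack
    cloud. NOTE: no waiting between steps; that is left out of this sketch.
    """
    steps = [
        ["nova", "boot", "--flavor", flavor, "--image", image, server],
        ["nova", "suspend", server],
        ["nova", "resume", server],
    ]
    for cmd in steps:
        run(cmd)
    return steps
```

Passing `run=print` prints the commands instead of executing them.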
The result is that the guest ends up in the powered-off state; when I power it on again, everything works fine.
I tested with both a direct port (SR-IOV) and a normal port (virtual port), and it happens in both cases.
According to [3], the fix prevents a malicious guest from submitting more requests than the virtqueue size permits.
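
As I understand the description in [3], the idea is a bound on in-flight requests per virtqueue. The following is only a toy model of that idea in Python, not QEMU's actual code; the class and method names are hypothetical, and how the real counter is maintained across suspend and resume is not modelled here.

```python
class VirtqueueSizeExceeded(RuntimeError):
    """Raised when more requests are in flight than the queue can hold."""


class ToyVirtqueue:
    def __init__(self, size):
        self.size = size   # number of descriptors the queue was set up with
        self.inuse = 0     # requests taken from the guest but not yet completed

    def pop(self):
        """Take one request from the guest; refuse if the queue is already full."""
        if self.inuse >= self.size:
            raise VirtqueueSizeExceeded("Virtqueue size exceeded")
        self.inuse += 1

    def push(self):
        """Complete one request and hand it back to the guest."""
        if self.inuse == 0:
            raise ValueError("no request in flight")
        self.inuse -= 1
```

In this model a well-behaved guest never trips the check, because it cannot have more requests outstanding than the queue has slots. If the counter were ever out of step with the real queue state, the check could also fire for a guest that did nothing wrong; whether that is what happens on resume is exactly my question.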
Our CI uses a proprietary Cirros image with the mlnx4_en driver (http://13.69.151.247/images/mellanox_eth.img).
I started testing with other images to see whether the problem is specific to our image.
I tested with an Ubuntu image (https://cloud-images.ubuntu.com/wily/current/wily-server-cloudimg-amd64-disk1.img)
and with the OpenStack Cirros image (http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img).
The Ubuntu image had the same failure, but the Cirros image worked.
Is the problem in the patch or in the images?
What in these images could make them look like a malicious guest?
[1] - https://access.redhat.com/security/cve/cve-2016-5403
[2] - http://www.ubuntu.com/usn/usn-3047-1/
[3] - https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg06257.html