[libvirt] [PATCH] qemu: Fix shutdown regression

Eric Blake eblake at redhat.com
Tue Sep 20 18:19:37 UTC 2011


On 09/20/2011 12:06 PM, Dave Allan wrote:
> On Tue, Sep 20, 2011 at 07:39:15PM +0200, Jiri Denemark wrote:
>> The commit that prevents disk corruption on domain shutdown
>> (96fc4784177ecb70357518fa863442455e45ad0e) causes a regression with QEMU
>> 0.14.* and 0.15.* because of a bug in QEMU that was fixed
>> only recently in QEMU git. With affected QEMU binaries, domains cannot
>> be shut down properly and stay in a paused state. This patch tries to
>> avoid this by sending SIGKILL to 0.1[45].* QEMU processes, though we
>> wait a bit longer between sending SIGTERM and SIGKILL to reduce the
>> possibility of virtual disk corruption.
>
> IMO, SIGKILL should only be sent at the explicit direction of the
> user, saying in effect, I'm ok with possible data corruption, I want
> the VM killed unconditionally.  I would rather leave VMs paused than
> risk corrupting data.  Let's get as much input as we can from the qemu
> folks before we go down this path.

That echoes my sentiment that qemu needs to tell us whether the bug is
fixed.  We know that if version < 0.14, the bug is not present, and if
version > 0.15, the bug is fixed; it is the 0.1[45] window where, unless
we get some help from qemu, we don't know whether the vendor has
back-ported the fix into the version of qemu that we are targeting.
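
To make the three cases concrete, here is a rough sketch of the version
check.  All names are hypothetical and this is not libvirt code; the
integer encoding (major * 1000000 + minor * 1000 + micro) is an
assumption made for the sketch, and "newer than 0.15" is approximated
as "0.16.0 or later":

```c
/* Hypothetical three-way answer to "does this qemu have the bug?" */
typedef enum {
    BUG_ABSENT,   /* older than 0.14: regression not yet introduced */
    BUG_UNKNOWN,  /* 0.14.x or 0.15.x: a vendor may have back-ported the fix */
    BUG_FIXED     /* newer than 0.15: fix is assumed to be present */
} bug_status;

/* Assumed encoding for this sketch: major * 1000000 + minor * 1000 + micro. */
static unsigned long encode_version(unsigned long major,
                                    unsigned long minor,
                                    unsigned long micro)
{
    return major * 1000000UL + minor * 1000UL + micro;
}

/* Classify a qemu version using only the version number.  The
 * BUG_UNKNOWN range is exactly the window where the version alone
 * cannot tell us whether the fix was back-ported. */
static bug_status classify_qemu(unsigned long version)
{
    if (version < encode_version(0, 14, 0))
        return BUG_ABSENT;
    if (version >= encode_version(0, 16, 0))
        return BUG_FIXED;
    return BUG_UNKNOWN;
}
```

The BUG_UNKNOWN result is the part that needs help from qemu: nothing
in the version number distinguishes a patched 0.15 from an unpatched
one.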

I also wonder if we should make it so that:

virDomainDestroy(dom) fails with a reasonable message, rather than
leaving the domain paused, if we think qemu has the bug, and the user
must call virDomainDestroyFlags(dom, VIR_DOMAIN_DESTROY_FORCE) to
explicitly request that we work around the qemu bug.
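
A minimal sketch of that policy, as a standalone decision function
rather than real driver code.  Every name below is hypothetical:
SKETCH_DESTROY_FORCE merely stands in for the VIR_DOMAIN_DESTROY_FORCE
flag proposed above, and the result codes are invented for
illustration:

```c
/* Hypothetical stand-in for the proposed force flag. */
enum { SKETCH_DESTROY_FORCE = 1 << 0 };

/* Hypothetical outcomes of a destroy request. */
typedef enum {
    DESTROY_GRACEFUL,  /* normal path: qemu is not suspected of the bug */
    DESTROY_FORCED,    /* user explicitly accepted the SIGKILL workaround */
    DESTROY_REFUSED    /* fail with an error instead of leaving it paused */
} destroy_result;

/* Decide how to handle a destroy request.
 * qemu_may_have_bug: nonzero if we think this qemu has the bug.
 * flags: bitwise OR of the caller's flags (0 for a plain destroy). */
static destroy_result sketch_destroy(int qemu_may_have_bug,
                                     unsigned int flags)
{
    if (!qemu_may_have_bug)
        return DESTROY_GRACEFUL;
    if (flags & SKETCH_DESTROY_FORCE)
        return DESTROY_FORCED;
    /* Suspected bug and no explicit consent: refuse rather than risk
     * data corruption or a silently paused domain. */
    return DESTROY_REFUSED;
}
```

The point of the split is that the risky path is only ever taken when
the caller passes the flag, so a plain destroy never trades a paused
domain for possible disk corruption behind the user's back.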

-- 
Eric Blake   eblake at redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org



