
Re: [libvirt] [PATCH] qemu: Fix shutdown regression



On 09/20/2011 12:06 PM, Dave Allan wrote:
> On Tue, Sep 20, 2011 at 07:39:15PM +0200, Jiri Denemark wrote:
>> The commit that prevents disk corruption on domain shutdown
>> (96fc4784177ecb70357518fa863442455e45ad0e) causes a regression with QEMU
>> 0.14.* and 0.15.* because of a regression bug in QEMU that was fixed
>> only recently in QEMU git. With affected QEMU binaries, domains cannot
>> be shut down properly and stay in a paused state. This patch tries to
>> avoid this by sending SIGKILL to 0.1[45].* QEMU processes. Though we
>> wait a bit more between sending SIGTERM and SIGKILL to reduce the
>> possibility of virtual disk corruption.
>
> IMO, SIGKILL should only be sent at the explicit direction of the
> user, saying in effect, I'm ok with possible data corruption, I want
> the VM killed unconditionally.  I would rather leave VMs paused than
> risk corrupting data.  Let's get as much input as we can from the qemu
> folks before we go down this path.

That echoes my sentiment that qemu needs to tell us whether the bug is fixed. We know that if version < 0.14, the bug is not present, and if version > 0.15, the bug is fixed. It is only in the 0.1[45] window that we can't tell whether the vendor has back-ported the fix into the version of qemu that we are targeting, unless we get some help from qemu.
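
To make the three version ranges concrete, here is a rough sketch in Python. It is purely illustrative and not libvirt code: the names BugStatus and classify_qemu are made up for this mail, and it looks only at the (major, minor) pair, which is exactly why the middle range cannot be resolved without help from qemu.

```python
from enum import Enum


class BugStatus(Enum):
    """Hypothetical classification of a qemu binary (not a libvirt type)."""
    ABSENT = "absent"    # older than 0.14: the regression is not present
    UNKNOWN = "unknown"  # 0.14.* or 0.15.*: a vendor may have back-ported the fix
    FIXED = "fixed"      # newer than 0.15: the fix is included


def classify_qemu(major: int, minor: int) -> BugStatus:
    """Classify by version number alone; micro/vendor releases are ignored."""
    if (major, minor) < (0, 14):
        return BugStatus.ABSENT
    if (major, minor) > (0, 15):
        return BugStatus.FIXED
    # The version number cannot tell us whether the fix was back-ported.
    return BugStatus.UNKNOWN
```

Anything that lands in UNKNOWN is where we would need a capability bit or similar hint from qemu itself.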

I also wonder if we should make it so that virDomainDestroy(dom) fails with a reasonable message, rather than leaving the domain paused, if we think qemu has the bug. The user would then have to call virDomainDestroyFlags(dom, VIR_DOMAIN_DESTROY_FORCE) to explicitly request that we work around the qemu bug.
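
A minimal sketch of the policy I have in mind, again in illustrative Python and not actual libvirt code. DESTROY_FORCE stands in for the proposed VIR_DOMAIN_DESTROY_FORCE flag, and DestroyRefused, choose_destroy_action and the returned strings are hypothetical names invented for this example:

```python
DESTROY_FORCE = 1 << 0  # stand-in for the proposed VIR_DOMAIN_DESTROY_FORCE


class DestroyRefused(Exception):
    """Raised instead of silently leaving the domain paused."""


def choose_destroy_action(maybe_buggy: bool, flags: int = 0) -> str:
    """Decide how to tear down a domain.

    maybe_buggy: True if we think this qemu may have the shutdown regression.
    flags: 0 models plain virDomainDestroy(); DESTROY_FORCE models the
           explicit user request to accept the risk of disk corruption.
    """
    if not maybe_buggy:
        return "graceful"  # normal shutdown path, no workaround needed
    if flags & DESTROY_FORCE:
        return "sigkill"   # user explicitly asked for the workaround
    raise DestroyRefused(
        "qemu may have the shutdown regression; "
        "retry with the force flag to kill the domain"
    )
```

The point of the design is that SIGKILL only ever happens when the caller passed the flag, which matches Dave's concern about explicit user direction.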

--
Eric Blake   eblake redhat com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org

