[libvirt] [PATCHv2 2/2] qemu: increase the timeout before sending SIGKILL to qemu process

Eric Blake eblake at redhat.com
Fri Feb 3 17:06:40 UTC 2012


On 02/03/2012 01:24 AM, Daniel Veillard wrote:
> On Thu, Feb 02, 2012 at 12:54:29PM -0500, Laine Stump wrote:
>> The current default method of terminating the qemu process is to send
>> a SIGTERM, wait for up to 1.6 seconds for it to shut down cleanly, then
>> send a SIGKILL and wait for up to 1.4 seconds more for the process to
>> terminate. This is problematic because occasionally 1.6 seconds is not
>> long enough for the qemu process to flush its disk buffers, so the
>> guest's disk ends up in an inconsistent state.
>>
> 
>   On the semantics of the patch, it does what it suggests; ACK to this.

Agreed.
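
For readers following the thread, the SIGTERM-then-SIGKILL escalation
described in the commit message can be sketched roughly as below. This is
a minimal Python illustration, not libvirt's C implementation: the names
terminate_with_grace and _wait_for_exit and the polling interval are
hypothetical, and the default timeouts are simply the 1.6 and 1.4 second
figures quoted above, not the values the patch introduces. It also assumes
the target is a direct child of the caller, which is not how libvirtd
tracks qemu.

```python
import signal
import subprocess
import time


def _wait_for_exit(proc, timeout, poll_interval):
    """Poll until the child exits or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if proc.poll() is not None:
            return True
        time.sleep(poll_interval)
    return proc.poll() is not None


def terminate_with_grace(proc, term_timeout=1.6, kill_timeout=1.4,
                         poll_interval=0.05):
    """Ask a child to exit with SIGTERM, then escalate to SIGKILL.

    Returns "already-exited", "sigterm", "sigkill", or "stuck".
    """
    if proc.poll() is not None:
        return "already-exited"

    # Polite request: gives the process a chance to flush its buffers.
    proc.send_signal(signal.SIGTERM)
    if _wait_for_exit(proc, term_timeout, poll_interval):
        return "sigterm"

    # Grace period expired: force termination; unflushed data may be lost.
    proc.send_signal(signal.SIGKILL)
    if _wait_for_exit(proc, kill_timeout, poll_interval):
        return "sigkill"

    return "stuck"
```

Raising term_timeout in a sketch like this only matters for a process
that is slow to exit; one that exits promptly on SIGTERM returns just as
quickly as before.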

> But that's unfortunately a pure heuristic: when the domain stops
> gracefully, there is no problem and this doesn't change anything.
> If the domain is doing intensive I/O (flushing buffers, for
> example), the extra grace period may help, but there is absolutely no
> guarantee. On Linux we could try to be a bit smart and detect completely
> stuck guests by looking at rchar and wchar in /proc/$pid/io: if those
> don't move at all across the iterations, we can probably consider the
> guest dead; if they do, well, we can be pretty sure that SIGKILL will
> lose data :-\

That would be a Linux-specific heuristic.  It might even be worth adding
more flags as we come up with more heuristics, to allow the user to
control which heuristic to attempt (and fail if a particular heuristic
is not supported for a particular host), but I think that those could be
later patches.
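
As a rough illustration of the /proc/$pid/io idea, here is a minimal
Python sketch (Linux only). Nothing like this is part of the patch under
review; the function names, the number of iterations, and the sampling
interval are all hypothetical. It samples the rchar and wchar counters a
few times and reports whether they moved. Reading another process's
/proc/$pid/io may require suitable privileges.

```python
import time


def parse_io_counters(text):
    """Extract (rchar, wchar) from the contents of /proc/<pid>/io."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in ("rchar", "wchar"):
            counters[key] = int(value)
    return counters["rchar"], counters["wchar"]


def read_io_counters(pid):
    """Read the current (rchar, wchar) pair for a process (Linux only)."""
    with open(f"/proc/{pid}/io") as f:
        return parse_io_counters(f.read())


def io_is_idle(samples):
    """True if the counters never changed across at least two samples."""
    return len(samples) >= 2 and all(s == samples[0] for s in samples[1:])


def looks_stuck(pid, iterations=5, interval=0.2):
    """Sample the counters several times; unchanged counters suggest
    (but do not prove) that the process is doing no I/O at all."""
    samples = []
    for i in range(iterations):
        samples.append(read_io_counters(pid))
        if i < iterations - 1:
            time.sleep(interval)
    return io_is_idle(samples)
```

If looks_stuck() returns False, the process is still reading or writing,
which is exactly the case where a SIGKILL risks losing data; if it returns
True, waiting longer is unlikely to help. Either way it remains a
heuristic.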

> 
>   ACK to this heuristic attempt, but maybe a smarter algorithm is
> in order; I'm sure others will comment :-)

I'm in favor of this patch going in now; as you argued, it is a no-op
change in the common success case, and a reliability fix (even if
slower) in the case where it would have been giving up too early
previously, all to benefit applications that haven't yet been adjusted
to take advantage of the new flags.

Speaking of applications that should take advantage of new flags,
where's patch 3/2 to teach virsh destroy how to use the new flag?

-- 
Eric Blake   eblake at redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
