
Re: [libvirt] [PATCH] Avoid a race when restoring a qemu domain.



On 04/07/2010 02:39 PM, Laine Stump wrote:
> This patch adds a 1 second sleep after telling qemu to start the
> restore operation and before telling qemu to start up the
> CPUs. Without this sleep, my hardware would end up with the CPUs
> started before the restore was started, leading to random (but never
> good) behavior. Apparently this is caused by slow hardware, as I
> haven't heard of anyone else experiencing this problem.
> 
> A sleep is a very inelegant way to eliminate the problem, but it's
> apparently the only way currently available to us.
> 
> Note that sleep durations as low as 250msec were successful in
> eliminating the bad behavior; I made it 1 sec. just for extra safety.
> ---
>  src/qemu/qemu_driver.c |    7 +++++++
>  1 files changed, 7 insertions(+), 0 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 60fa95a..1270c84 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -5965,6 +5965,13 @@ static int qemudDomainRestore(virConnectPtr conn,
>      /* If it was running before, resume it now. */
>      if (header.was_running) {
>          qemuDomainObjPrivatePtr priv = vm->privateData;
> +
> +        /* pause 1 second to allow qemu time to start the restore,
> +         * otherwise it may start the CPUs before the restore, and end
> +         * up in a "nondeterminate" state.
> +         */
> +        usleep(1000000);
> +
>          qemuDomainObjEnterMonitorWithDriver(driver, vm);
>          if (qemuMonitorStartCPUs(priv->mon, conn) < 0) {
>              if (virGetLastError() == NULL)

Hm, this really doesn't seem like the right way to fix this.  We should
investigate what is going on in qemu, and determine whether it's a bug in
qemu itself (in which case we should fix qemu) or a bug in the way we
communicate with qemu (in which case we should fix that).  A sleep just
hides the problem, which means it can still pop up on machines that are
slower, or busier, than yours!
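
To illustrate the general direction only (not a proposed patch): a minimal
sketch of waiting on an explicit "restore finished" condition with a
timeout, rather than sleeping for a fixed interval.  Everything here is an
assumption made for illustration: wait_until_ready(), ready_fn, and the
fake_restore stand-in are hypothetical names, not libvirt or qemu APIs, and
how qemu would actually report that the restore has completed is exactly
the part that still needs investigating.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical predicate: returns true once the restore has completed.
 * In real code this would have to ask qemu; here it is only a callback. */
typedef bool (*ready_fn)(void *opaque);

/* Poll ready() every interval_ms until it returns true or roughly
 * timeout_ms of sleeping has elapsed.
 * Returns 0 when the condition was observed, -1 on timeout. */
int wait_until_ready(ready_fn ready, void *opaque,
                     long interval_ms, long timeout_ms)
{
    long waited_ms = 0;

    assert(ready != NULL && interval_ms > 0);

    for (;;) {
        struct timespec ts;

        if (ready(opaque))
            return 0;
        if (waited_ms >= timeout_ms)
            return -1;

        ts.tv_sec = interval_ms / 1000;
        ts.tv_nsec = (interval_ms % 1000) * 1000000L;
        nanosleep(&ts, NULL);
        waited_ms += interval_ms;
    }
}

/* Hypothetical stand-in for a slow restore: reports "not done" for the
 * first polls_left calls, then "done". */
struct fake_restore {
    int polls_left;
};

bool fake_restore_done(void *opaque)
{
    struct fake_restore *r = opaque;

    if (r->polls_left > 0) {
        r->polls_left--;
        return false;
    }
    return true;
}
```

The caller would start the CPUs only when this returns 0 and report an
error on -1, so a slow or busy host waits longer instead of racing, and a
fast host does not pay a fixed one-second penalty.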

-- 
Chris Lalancette

