
Re: [libvirt] Silently ignored virDomainRestore failures



On Mon, Sep 28, 2009 at 6:43 AM, Daniel P. Berrange <berrange redhat com> wrote:
The flaw in QEMU is depressingly obvious

  static int stdio_pclose(void *opaque)
  {
     QEMUFileStdio *s = opaque;
     pclose(s->stdio_file);
     qemu_free(s);
     return 0;
  }

Notice how it completely discards the exit status returned by
pclose() and just pretends everything always worked :-(

If this were handling errors correctly, you'd at least see QEMU
exiting rather than hanging around broken.
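
A minimal sketch of what propagating the status could look like, outside the QEMU tree. The StdioFile struct, stdio_pclose_checked() and run_and_close() are hypothetical stand-ins (not QEMU's QEMUFileStdio or its actual fix), and plain free() replaces qemu_free() so the example builds on its own:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical stand-in for QEMU's QEMUFileStdio; only the field used here. */
typedef struct {
    FILE *stdio_file;
} StdioFile;

/*
 * Close the pipe and report what happened to the child:
 * returns 0 only if the command exited normally with status 0,
 * and -1 if pclose() itself failed, the child was killed by a
 * signal, or the child exited with a non-zero status.
 */
static int stdio_pclose_checked(void *opaque)
{
    StdioFile *s = opaque;
    int status = pclose(s->stdio_file);

    free(s);
    if (status == -1) {
        return -1;              /* pclose() could not reap the child */
    }
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        return -1;              /* child died or reported failure */
    }
    return 0;
}

/* Hypothetical helper: run a shell command via popen() and close it. */
static int run_and_close(const char *cmd)
{
    StdioFile *s = malloc(sizeof(*s));

    if (s == NULL) {
        return -1;
    }
    s->stdio_file = popen(cmd, "r");
    if (s->stdio_file == NULL) {
        free(s);
        return -1;
    }
    return stdio_pclose_checked(s);
}
```

With this shape, a caller of run_and_close("exit 3") gets -1 back instead of an unconditional 0, which is the information the original function throws away.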

Ugh, indeed. I'll submit a patch for that later today, if nobody beats me to it.
 
Hmm, this does look problematic - we need the monitor to be responsive
in order to do things like CPU pinning. We need the monitor to be
non-responsive to ensure 'cont' doesn't run until migration has finished.
We can't have it both ways, and the former wins since we need that to be
done before ever letting QEMU start allocating guest RAM pages. So relying
on 'cont' to block is not good.  Is the 'cont' even necessary - I remember
seeing somewhere that QEMU unconditionally started its CPUs after an
incoming migration finished ?

I've seen patches to change that behavior, so IMHO it's probably not safe to depend on it being one way or the other across the QEMU versions that libvirt supports.

What I'm tempted to do is append a command to the exec: migration lines specified by libvirt that writes a sigil to stderr, and then wait for either that sigil or an error to show up in that domain's log before issuing the cont. If my memory is at all correct, libvirt already has some helper functions useful for that purpose.
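
A rough sketch of the check this would need, assuming the domain log has already been read into a string. The MigrateState enum, check_migration_log(), and the sigil and error-marker strings used with it are all hypothetical and are not existing libvirt helpers:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result of inspecting the domain log. */
typedef enum {
    MIGRATE_PENDING,    /* neither the sigil nor an error seen yet */
    MIGRATE_DONE,       /* sigil seen: the exec: command finished */
    MIGRATE_FAILED      /* error marker seen: do not issue 'cont' */
} MigrateState;

/*
 * Inspect the log text collected so far. An error marker takes
 * priority over the sigil, so a failure is never mistaken for
 * a successful restore.
 */
static MigrateState check_migration_log(const char *log_text,
                                        const char *sigil,
                                        const char *error_marker)
{
    if (strstr(log_text, error_marker) != NULL) {
        return MIGRATE_FAILED;
    }
    if (strstr(log_text, sigil) != NULL) {
        return MIGRATE_DONE;
    }
    return MIGRATE_PENDING;
}
```

The caller would re-read the log and call this in a loop, with a timeout, and send the cont only once it returns MIGRATE_DONE.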

Does this sound like a reasonable approach?
