[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [Qemu-devel] [PATCH RFC 0/4] Allow hibernation on guests



On 01/30/2012 06:57 AM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 16:57:01 -0600
Anthony Liguori<anthony codemonkey ws>  wrote:

On 01/26/2012 01:35 PM, Luiz Capitulino wrote:
On Thu, 26 Jan 2012 08:18:03 -0700
Eric Blake<eblake redhat com>   wrote:

[adding qemu-devel]

On 01/26/2012 07:46 AM, Daniel P. Berrange wrote:
One thing, that you'll probably notice is this
'set-support-level' command. Basically, it tells GA what qemu version
is it running on. Ideally, this should be done as soon as
GA starts up. However, that cannot be determined from outside
world as GA doesn't emit any events yet.
Ideally^2 this command should be left out as it should be qemu
who tells its own agent this kind of information.
Anyway, I was going to call this command in qemuProcess{Startup,
Reconnect,Attach}, but it won't work. We need to un-pause guest CPUs
so guest can boot and start GA, but that implies returning from qemuProcess*.

So I am setting this just before 'guest-suspend' command, as
there is one more thing about GA. It is unable to remember anything
upon its restart (GA process). Which has BTW show flaw
in our current code with FS freeze&   thaw. If we freeze guest
FS, and somebody restart GA, the simple FS Thaw will not succeed as
GA thinks FS are not frozen. But that's a different cup of tea.

Because of what written above, we need to call set-level
on every suspend.


IMHO all this says that the 'set-level' command is a conceptually
unfixably broken design&   should be killed in QEMU before it turns
into an even bigger mess.

Can you elaborate on this? Michal and I talked on irc about making the
compatibility level persistent, would that help?

Once we're in a situation where we need to call 'set-level' prior
to every single invocation, you might as well just allow the QEMU
version number to be passed in directly as an arg to the command
you are running directly thus avoiding this horrificness.

Qemu folks, would you care to chime in on this?

Exactly how is the set-level command supposed to work?  As I understand
it, the goal is that if the guest has qemu-ga 1.1 installed, but is
being run by qemu 1.0, then we want to ensure that any guest agent
command supported by qemu-ga 1.1 but requiring features of qemu not
present in qemu 1.0 will be properly rejected.

Not exactly, the default support of qemu-ga is qemu 1.0. This means that by
default qemu-ga will only support qemu 1.0 even when running on qemu 2.0. This
way the set-support-level command allows you to specify that qemu 2.0 features
are supported.

Version numbers are meaningless.  What happens when a bunch of features get
backported by RHEL such that qemu-ga 1.0 ends up being a frankenstein version of
2.0?

The feature negotiation mechanism we have in QMP is the existence of a command.
   If we're in a position where we're trying to disable part of a command, it
simply means that we should have multiple commands such that we can just remove
the disabled part entirely.

You may have a point that we shouldn't be using the version number for that,
but just switching to multiple commands doesn't solve the fundamental problem.

Agreed, but the multiple commands isn't really the fix here, it's libvirt querying for the "wakeup" command that Gerd's patches add.

We implemented set-version-level as a way to let management tools obliviously report something simple, like the version information it parses from QEMU already, to let the guest agent Do The Right Thing without any deep insight into what it's host requirements are (which is also why we opted for versions over specific capabilities flags).

But we can't rely on version levels applying in all cases, a distro might backport s3 support for 1.0, modify the guest agent version level dependencies accordingly, and thus render the that agent incompatible with, say, RHEL.

So it was a naive approach on my part, and the issues the libvirt folks noted with not knowing if a reset occurred since the last set-support-level make it even less desirable (we could persist it as you suggested, and I am working on patches to persist fsfreeze state, but it's not worth trying to fix set-support-level).

And at least in this case we have an easy out: libvirt knows it can't resume a guest without the presence of a QMP command to do so, and that doesn't require any intimate knowledge of how the agent works, it's just common sense.

Breaking the suspend/hibernate stuff into multiple commands avoids the need for libvirt to special case based on specific parameters to the command. We can also imagine distro-specific implementations of the agent disabling s4 due to things like not having s4 patches for virtio in place, so it makes it easier (possible) to discover those situations as well. It may be that we opt for a single command for libvirt, but at least on the agent side it should be multiple discoverable ones.

Hopefully this doesn't set anyone back too much :( IMHO it's the right way to go though.


The fundamental problem is that, S3 in current (and old) qemu has two known bugs:

  1. The screen is left black after S3 (it's a bug in seabios)
  2. QEMU resumes the guest immediately (Gerd posted patches to address this)

We're going to address both issues in 1.1. However, if qemu-ga is installed in
an old qemu and S3 is used, the bugs will be triggered.

We need a way for qemu-ga to query qemu about the existence of a working S3
support. The set-support-level solves that.

Another option would be to disable (or enable) S3 by default in qemu-ga, and let
the admin enable (or disable it) according to S3 support being fixed in qemu.



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]