[libvirt-users] Re: managedsave results in unexpected shutdown from inside Windows

Nicolas Sebrecht nsebrecht at piing.fr
Fri Mar 15 12:17:45 UTC 2013


On 14/03/13, Eric Blake wrote:
> On 03/14/2013 06:29 AM, Nicolas Sebrecht wrote:
> > On 13/03/13, Eric Blake wrote:
> > 
> >> You might want to look into external snapshots as a more efficient way
> >> of taking guest snapshots.
> > 
> > I have guests with raw disks due to Windows performance issues. It would
> > be very welcome to have minimal downtime, as some disks are quite large
> > (terabytes) and the allowed downtime window very short. Let's try external
> > snapshots for guest "VM" while running:
> 
> Do be aware that an external snapshot means you are no longer using a
> raw image - it forces you to use a qcow2 file that wraps your raw image.

Yes, that's what I understood from the man pages. This would not be a
problem as long as it would be a temporary case while doing the backups.

To summarize the context for future readers of the archives, the idea is
to use external snapshots in order to have minimal downtime, instead of
using managedsave (aka hibernate). This is possible even if not all
features are implemented in libvirt yet (it depends on the original disk
format and the development state).

Here are the basic steps; a rough command sketch follows each of the two
lists. This is still not that simple and there are tricky parts along
the way.

Usual workflow (use case 2)
===========================

Step 1: create an external snapshot for all VM disks (includes the VM memory state).
Step 2: do the backups manually while the VM is still running (original disks and memory state).
Step 3: save the VM state and halt it once the backups are finished.
Step 4: merge the snapshots (qcow2 disk wrappers) back to their backing file.
Step 5: start the VM.
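
A rough sketch of the commands behind these steps, using the same names
as in the rest of this thread (snap1, vda, /VM.raw, /VM-snap1.img,
/VM.save); /backups/ is just a placeholder destination and the exact
flags depend on the libvirt/qemu versions:

  Step 1:
  # virsh snapshot-create-as VM snap1 --atomic \
          --memspec file=/VM.save \
          --diskspec vda,file=/VM-snap1.img

  Step 2 (guest keeps running, new writes go to the qcow2 wrapper):
  # cp /VM.raw /VM.save /backups/

  Step 3:
  # virsh managedsave VM

  Step 4 (guest is halted, so an offline commit into the backing file):
  # qemu-img commit /VM-snap1.img
  # rm /VM-snap1.img

  Step 5:
  # virsh start VM

As the rest of this thread shows, getting step 5 to run from the raw
disk again (rather than from the qcow2 wrapper still referenced by the
managedsave image) is the tricky part.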

Restarting from the backup (use case 1)
=======================================

Step A: shutdown the running VM and move it out of the way.
Step B: restore the backing files and state file from the archives of step 2.
Step C: restore the VM. (still not sure on that one, see below)
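
Roughly, with the same names as above and combining Eric's advice
further down (a sketch only; /backups/ and the .old suffix are
placeholders, and the restore assumes the saved memory state points at
/VM.raw rather than the qcow2 wrapper):

  Step A:
  # virsh destroy VM
  # mv /VM.raw /VM.raw.old

  Step B:
  # cp /backups/VM.raw /backups/VM.save /

  Step C:
  # virsh snapshot-delete --metadata VM snap1
  # virsh restore /VM.save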

I wish to provide a more detailed procedure in the future.


> With new enough libvirt and qemu, it is also possible to use 'virsh
> blockcopy' instead of snapshots as a backup mechanism, and THAT works
> with raw images without forcing your VM to use qcow2.  But right now, it
> only works with transient guests (getting it to work for persistent
> guests requires a persistent bitmap feature that has been proposed for
> qemu 1.5, along with more libvirt work to take advantage of persistent
> bitmaps).

Fine. Sadly, my guests are not transient.
It appears I'm in the worst case for all options.  :-)
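
For readers who do have transient guests (or can temporarily make a
running guest transient), a blockcopy-based backup might look roughly
like this; untested here, the destination path is a placeholder and the
flags depend on the libvirt version:

  # virsh dumpxml VM > /tmp/VM.xml
  # virsh undefine VM                  (running guest becomes transient)
  # virsh blockcopy VM vda /backups/VM-copy.raw --raw --wait --finish
  # virsh define /tmp/VM.xml           (make the guest persistent again)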

> There's also a proposal on the qemu lists to add a block-backup job,
> which I would need to expose in libvirt, which has even nicer backup
> semantics than blockcopy, and does not need a persistent bitmap.

Ok.

> > Surprising! I would have expected files to be stored in virtuals/images. This is
> > not the point for now, let's continue.
> 
> Actually, it would probably be better if libvirtd errored out on
> relative path names (relative to what? libvirtd runs in '/', and has no
> idea what directory virsh was running in), and therefore virsh should be
> nice and convert names to absolute before handing them to libvirtd.

Ok. I guess an error for relative paths would be fine to avoid
unexpected paths. All embedded consoles I know of support relative paths
(e.g. python, irb, rails console, etc.).

> > USE CASE 1: restoring from backing file
> > =======================================

<...>

> Correct - we still don't have 'snapshot-revert' wired up in libvirt to
> revert to an external snapshot - we have ideas on what needs to happen,
> but it will take time to get that code into the code base.  So for now,
> you have to do that manually.

Fine.

> >   # virsh restore /VM.save
> >   Domain restored from /VM.save
> 
> Hmm.  This restored the memory state from the point at which the
> snapshot was taken, but unless you were careful to check that the saved
> state referred to the base file name and not the just-created qcow2
> wrapper from when you took the snapshot, then your disks might be in an
> inconsistent state with the memory you are loading.  Not good.  Also,
> restoring from the base image means that you are invalidating the
> contents of the qcow2 file for everything that took place after the
> snapshot was taken.

Wait, wait, wait. Here, I want to restore my backup.

So, I use the memory state that was first created in use case 2 with the
snapshot-create-as command. Sorry, I did a poor job of numbering the use
cases in chronological order; I'll keep the numbering as-is rather than
add more confusion.

From what I've checked with save-image-edit, this memory state points to
the VM.raw disk (which is what I would expect).

Here is where we are in the workflow (step C) for what we are talking about:

Step 1: create an external snapshot for all VM disks (includes the VM memory state).
Step 2: do the backups manually while the VM is still running (original disks and memory state).
Step 3: save the VM state and halt it once the backups are finished.
Step 4: merge the snapshots (qcow2 disk wrappers) back to their backing file.
Step 5: start the VM.
<For whatever reason, I have to restore the backup from step 2>
Step A: shutdown the running VM and move it out of the way.
Step B: restore the backing files and state file from the archives of step 2.
Step C: restore the VM.

So, yes: this is the memory state from the point at which the snapshot
was taken, but I clearly expect it to point to the backing file only.

> Yeah, again a known limitation.  Once you change state behind libvirt's
> back (because libvirt doesn't yet have snapshot-revert wired up to do
> things properly), you generally have to 'virsh snapshot-delete
> --metadata VM snap1' to tell libvirt to forget the snapshot existed, but
> without trying to delete any files, since you did the file deletion
> manually.

Good, this is what I was missing.

> You have to delete /VM.save and /VM-snap1.img yourself, but you should
> have used 'virsh snapshot-delete --metadata' instead of mucking around
> in /var/lib (that directory should not be managed manually).

Ok.
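
So the full cleanup after such a manual revert is roughly (file names as
above):

  # virsh snapshot-delete --metadata VM snap1
  # rm /VM.save /VM-snap1.img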

> > USE CASE 2: the files are saved in another place, let's merge back the changes
> > ==============================================================================
> > 
> > The idea is to merge VM-snap1.img back to VM.raw with minimal downtime. I can't
> > find a command for that, let's try manually.
> 
> Here, qemu is at fault.  They have not yet given us a command to do that
> with minimal downtime.  They HAVE given us 'virsh blockcommit', but it
> is currently limited to reducing chains of length 3 or longer to chains
> of at least 2.  It IS possible to merge back into a single file while
> the guest remains live, by using 'virsh blockpull', but that single file
> will end up being qcow2; and it takes the time proportional to the size
> of the entire disk, rather than to the size of the changes since the
> snapshot was taken.  Again, here's hoping that qemu 1.5 gives us live
> commit support, for taking a chain of 2 down to a single raw image.

Ok, good to know.
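
For reference, the blockpull variant Eric mentions (which leaves the
guest on a single qcow2 image instead of raw) would be something like
this, flags depending on the libvirt version:

  # virsh blockpull VM vda --wait

After that, /VM-snap1.img no longer depends on /VM.raw, but the guest
keeps running on the qcow2 file.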

> You went behind libvirt's back and removed /VM-snap1.img, but failed to
> update the managedsave image to record the location of the new filename.

<...>

> Yes, but you still have the managedsave image in the way.

Right.

> >   # virsh start VM
> >   error: Failed to start domain VM
> >   error: cannot open file 'VM-snap1.img': No such file or directory
> 
> Try 'virsh managedsave-remove VM' to get the broken managedsave image
> out of the way.

Well, no. I would expect to come back to the exact same environment as
after the backup. To do so, I expect to be able to do steps 3, 4 and 5
cleanly.

Step 3: save the VM state and halt it once the backups are finished.
Step 4: merge the snapshots (qcow2 disk wrappers) back to their backing file.
Step 5: start the VM.

>                  Or, if you are brave, and insist on rebooting from the
> memory state at which the managedsave image was taken but are sure you
> have tweaked the disks correctly to match the same point in time, then
> you can use 'virsh save-image-edit /path/to/managedsave' (assuming you
> know where to look in /etc to find where the managedsave file was stored
> internally by libvirt).  Since modifying files in /etc is not typically
> recommended, I will assume that if you can find the right file to edit,
> you are already brave enough to take on the consequences of going behind
> libvirt's back.  At any rate, editing the managed save image to point
> back to the correct raw file name, followed by 'virsh start', will let
> you resume with memory restored to the point of your managed save (and
> hopefully you pointed the disks to the same point in time).

Exactly.

> > Looks like the complaint comes from the xml state header.
> > 
> >   # virsh save-image-edit /var/lib/libvirt/qemu/save/VM.save
> >   <virsh edit VM to come back to vda -> VM.raw>
> >   error: operation failed : new xml too large to fit in file
> >   #
> 
> Aha - so you ARE brave, and DID try to edit the managedsave file.  I'm
> surprised that you really hit a case where your edits pushed the XML
> over a 4096-byte boundary.  Can you come up with a way to (temporarily)
> use shorter names, such as having /VM-snap1.img be a symlink to the real
> file, just long enough for you to get the domain booted again?

Excellent. I don't know why I didn't think of trying that. I tested it
and the symlink trick works fine. I had to change the disk format in the
memory state header, of course.

BTW, I guess I can prevent that in the future by giving the snapshot an
absolute path that is longer than the original disk path.
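
For the record, here is roughly what worked, with the same paths as
earlier in the thread (the save-image-edit step only changes the driver
type from qcow2 to raw, so it cannot grow the XML past the limit):

  # ln -s /VM.raw /VM-snap1.img
  # virsh save-image-edit /var/lib/libvirt/qemu/save/VM.save
    (change the disk's <driver type='qcow2'/> to type='raw' for vda)
  # virsh start VM
  # rm /VM-snap1.img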

>                                                                 Also, I
> hope that you did your experimentation on a throwaway VM, and not on a
> production one, in case you did manage to fubar things to the point of
> data corruption by mismatching disk state vs. memory state.

I did everything in a testing environment where breaking guests or the
hypervisor does not matter.

> Yes, I know that reverting to snapshots is still very much a work in
> progress in libvirt, and that you are not the first to ask these sorts
> of questions (reading the list archives will show that this topic comes
> up quite frequently).

While this is a work in progress and users regularly ask about it, I
wonder whether it would be worth writing a quick and dirty Python script
for others. Whatever I provide would require each user to adapt the
script to their own environment.

Thanks a lot for your patience and the interesting information.

-- 
Nicolas Sebrecht



