[libvirt] Notes: Non-shared storage live migration w/ active blockcommit

Eric Blake eblake at redhat.com
Tue Oct 7 23:35:00 UTC 2014


On 09/25/2014 08:26 AM, Kashyap Chamarthy wrote:
> These notes are based on an IRC conversation with Eric Blake about
> efficient non-shared storage live migration. Thought I'd post my notes
> here before I forget. Please review and point out any inaccuracies.
> 
> Procedure
> ---------
> 
> (1) Starting from disk A, create a snapshot A <- A':
>         
>     $ virsh snapshot-create-as \
>         --domain vm1 snap1 snap1-desc \
>         --diskspec vda,file=/export/vmimages/A'.qcow2 \
>         --disk-only --atomic

If you are using this snapshot only for the side-effect of growing the
chain, you can add --no-metadata here instead of deleting the snapshot
later when it gets invalidated [1].  Of course, if you pass
--no-metadata, the snapshot name (snap1) and description (snap1-desc)
are no longer important.
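
As a rough sketch of that variant, the helper below only prints the
step 1 command with --no-metadata added and the now-unneeded name and
description dropped (virsh generates a name when none is given). The
names vm1, vda and A-overlay.qcow2 are placeholders of mine, since an
unquoted A'.qcow2 would open a quoted string in a shell, and the
function name is hypothetical.

```shell
#!/bin/sh
# Sketch: build the step-1 command with --no-metadata.
# vm1, vda and A-overlay.qcow2 are placeholder names, not from the thread.
snapshot_no_metadata() {
    # $1 = domain, $2 = disk target, $3 = new overlay file
    echo virsh snapshot-create-as --domain "$1" \
        --diskspec "$2,file=$3" \
        --disk-only --atomic --no-metadata
}

# Prints the command; pipe the output to sh to actually run it.
snapshot_no_metadata vm1 vda /export/vmimages/A-overlay.qcow2
```

With no metadata recorded, there is nothing for 'virsh snapshot-list'
to show and nothing to clean up once the chain is shortened again.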

> 
> (2) Background copy of A to B:
> 
>     $ virsh blockcopy \
>         --domain vm1 vda /export/vmimages/B.qcow2 \
>         --wait --verbose --shallow \
>         --finish

This step is not quite right.  You are asking for a shallow copy of the
current file for disk 'vda' (that is, A'.qcow2).  But that is NOT the
same as the base A image.  For this step, libvirt does not yet have an
easy way to access the contents of a backing chain of a live domain; you
CAN use 'virsh vol-*' commands to do a background copy from storage
pools, but it may be easier to just resort to normal file system tools:

    $ cp /export/vmimages/A.qcow2 /export/vmimages/B.qcow2

or even rely on storage-array-specific commands to set up a trivial
clone with almost no time overhead (for example, some iSCSI storage
arrays allow efficient copy-on-write cloning of storage volumes by
creating a new name that shares the original contents of A.qcow2 as its
starting point; and since we are about to delete A.qcow2 later on, we
never need any actual data copying).
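
A plain copy is safe here because, once step 1 has completed, the guest
writes only to A'.qcow2 and A.qcow2 is a read-only backing file. A
minimal sketch of the file-system route, with a byte-for-byte check
afterwards; the function name is hypothetical and the paths are
supplied by the caller:

```shell
#!/bin/sh
# copy_base SRC DST: plain file copy of the (now read-only) base image,
# followed by a byte-for-byte comparison. Returns non-zero if either
# the copy or the comparison fails.
copy_base() {
    cp "$1" "$2" && cmp -s "$1" "$2"
}

# Example, using the paths from the procedure:
# copy_base /export/vmimages/A.qcow2 /export/vmimages/B.qcow2
```

On a file system or array with cheap clones, the cp would be replaced
by whatever clone command that storage provides.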

> 
> (3) Create an empty B' with backing file B:
> 
>     $ qemu-img create -f qcow2 -b B.qcow2 \
>         -o backing_fmt=qcow2 B'.qcow2
> 
>     [or]
> 
>     $ virsh vol-create-as default B'.qcow2 1G \
>         --format qcow2 \
>         --backing-vol B.qcow2 --backing-vol-format qcow2 

[side note - we should really teach libvirt to not REQUIRE a size when
creating an empty wrapper around an existing image]

> 
> (4) Do a shallow blockcopy of A' to B':
> 
>     $ virsh blockcopy \
>         --domain vm1 vda /export/vmimages/B'.qcow2 \
>         --wait --verbose --shallow \
>         --finish

For this to work, you need to also use the --reuse-external flag to take
advantage of the backing chain already recorded in B'.qcow2 (without the
flag, the command will complain that B'.qcow2 already exists if it is a
regular file; if it is a block device, it will just silently ignore the
contents of the block device and treat B'.qcow2 as though an absolute
path to A.qcow2 were its backing file).
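
A sketch of the corrected step 4, again only printing the command. Two
assumptions of mine are baked in: the destination is called
B-overlay.qcow2 rather than B'.qcow2 (to avoid shell quoting trouble),
and the job ends with --pivot rather than the quoted --finish, on the
understanding that --finish leaves the guest on the source while
--pivot switches it to the copy, which is what step 5 needs.

```shell
#!/bin/sh
# Sketch: step 4 with --reuse-external, so libvirt keeps the backing
# file already recorded in the pre-created destination.
# Assumption: --pivot is used so the guest switches over to the copy.
shallow_copy_cmd() {
    # $1 = domain, $2 = disk target, $3 = pre-created destination overlay
    echo virsh blockcopy --domain "$1" "$2" "$3" \
        --wait --verbose --shallow --reuse-external --pivot
}

# Prints the command; pipe the output to sh to actually run it.
shallow_copy_cmd vm1 vda /export/vmimages/B-overlay.qcow2
```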

> 
> (5) Then live shallow commit of B' into B:
> 
>     $ virsh blockcommit \
>         --domain vm1 vda \
>         --wait --verbose --shallow \
>         --pivot --active --finish
>     Block Commit: [100 %]
>     Successfully pivoted

With steps 2 and 4 corrected, this indeed shortens the chain back down
to just B.qcow2.  And once this happens, you no longer need the path to
A.qcow2 or A'.qcow2; you can also delete B'.qcow2.  But back to the
point I made earlier at [1]: if this is all you do, then 'virsh
snapshot-list' will still show 'snap1' as a snapshot that tries to refer
to A'.qcow2; since you just invalidated that with the copy, you'd need
to 'virsh snapshot-delete --metadata vm1 snap1' to get rid of the stale
snapshot (if you don't tweak step 1 to avoid creating that snapshot
metadata in the first place).

The NICE part about this whole sequence is that the backing file does
NOT have to be qcow2, and it is VERY efficient timewise, if you happen
to have an efficient way to do step 2.  That is, I can go from a
multi-gigabyte raw file A.img to raw file B.img in less than a second,
assuming the guest isn't doing much I/O in the meantime, when scripting
all these steps together, and without any guest downtime.
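
Scripted together, the corrected sequence might look roughly like the
following. This is a sketch under stated assumptions, not a tested
recipe: it is dry-run by default (it only prints the commands), the
overlay names A-overlay.qcow2 and B-overlay.qcow2 stand in for A' and
B', step 4 ends with --pivot so the guest moves to the copy, and step 5
keeps only the blockcommit flags --active and --pivot for concluding
the job. The function names are hypothetical.

```shell
#!/bin/sh
# Dry-run by default: prints each command. Set DRY_RUN=0 to execute.
# Domain, disk and file names are illustrative assumptions.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# migrate_disk DOMAIN DISK DIR
migrate_disk() {
    dom=$1; disk=$2; dir=$3

    # 1. Grow the chain (A <- A-overlay) without snapshot metadata.
    run virsh snapshot-create-as --domain "$dom" \
        --diskspec "$disk,file=$dir/A-overlay.qcow2" \
        --disk-only --atomic --no-metadata || return 1

    # 2. A is now a read-only backing file, so a plain copy is safe.
    run cp "$dir/A.qcow2" "$dir/B.qcow2" || return 1

    # 3. Empty wrapper B-overlay on top of B.
    run qemu-img create -f qcow2 -b "$dir/B.qcow2" \
        -o backing_fmt=qcow2 "$dir/B-overlay.qcow2" || return 1

    # 4. Shallow copy of the active layer into B-overlay, then pivot.
    run virsh blockcopy --domain "$dom" "$disk" "$dir/B-overlay.qcow2" \
        --wait --verbose --shallow --reuse-external --pivot || return 1

    # 5. Commit the active layer down into B and pivot to B.
    run virsh blockcommit --domain "$dom" "$disk" \
        --wait --verbose --shallow --active --pivot || return 1
}

# Example: migrate_disk vm1 vda /export/vmimages
```

After a real run, A.qcow2, A-overlay.qcow2 and B-overlay.qcow2 are no
longer part of the chain and can be removed.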

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
