[libvirt] [PATCH v4 0/8] bulk snapshot list/redefine (incremental backup saga)

Nir Soffer nsoffer at redhat.com
Tue Mar 12 21:35:23 UTC 2019


On Tue, Mar 12, 2019 at 11:02 PM Eric Blake <eblake at redhat.com> wrote:

> On 3/12/19 3:29 PM, Nir Soffer wrote:
>
> >>>   snapshots = srcdom.ListAllSnapshots(
> >>>       libvirt.VIR_DOMAIN_SNAPSHOT_LIST_TOPOLOGICAL)
> >>>   for snapshot in snapshots:
> >>>       xml = snapshot.XMLDesc()
> >>>       dstdom.SnapshotCreateXML(xml)
> >>>
> >>> I don't really see a reason why this is so hard that it justifies
> >>> adding these bulk snapshot list/define APIs.
> >>
> >> Nir, do you want to chime in here?
> >>
> >
> > We don't have a need to list or define snapshots since we manage
> > snapshots on the oVirt side.
> > We want an API to list and redefine checkpoints.
>
> But the proposed <domaincheckpoint> also has a <domain> subelement, so
> it has the same problem (the XML for a bulk operation can become
> prohibitively large).
>

Why do we need <domain> in a checkpoint?

You can see here what we store in a checkpoint (based on v2 or v3 of your
patches):
https://ovirt.org/develop/release-management/features/storage/incremental-backup.html#engine-database-changes

Which is:

vm_checkpoints table

- checkpoint_id: UUID
- parent: UUID
- vm_id: UUID
- creation_date: TIMESTAMP

vm_checkpoint_disk_map table

- disk_id: UUID
- checkpoint_id: UUID

>
> > Our use case is:
> >
> > 1. Start VM on some host
> > 2. When oVirt starts a backup, it adds a checkpoint in the oVirt database
>
> And does this database preserve topological ordering? If not, why not?
>

Using the parent UUID we can create the correct chain in the right order,
and this is what we send to vdsm. Based on this, vdsm will generate the XML
for libvirt.
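
For example, a minimal sketch (illustrative Python, not the actual vdsm
code) of rebuilding the chain from the vm_checkpoints rows above:

    # Sketch only: assumes each row looks like
    # {"checkpoint_id": ..., "parent": ...}, with parent=None for
    # the first checkpoint in the chain.
    def order_checkpoints(rows):
        by_parent = {row["parent"]: row for row in rows}
        chain = []
        parent = None
        while parent in by_parent:
            row = by_parent[parent]
            chain.append(row)
            parent = row["checkpoint_id"]
        return chain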

There is no domain info related to these checkpoints, and I hope we are not
adding such requirements at this stage.


> > 3. When libvirt starts a backup, libvirt adds the checkpoint oVirt created
> > 4. Steps 2 and 3 are repeated while the VM is running...
> > 5. Stop VM - at this point we undefine the VM, so libvirt loses all
> > info about checkpoints
> > 6. Start VM on some host
> > 7. oVirt uses the bulk checkpoint API to redefine all checkpoints in the
> > oVirt database
> > 8. If there is an unknown bitmap on a disk, or a bitmap is missing, it
> > means that someone modified the disk behind oVirt's back. oVirt will
> > delete the affected checkpoints and use the libvirt checkpoint API to
> > delete the checkpoints and bitmaps. This forces a full backup for the
> > next backup of the affected disks.
> >
> > We would like libvirt to fail to redefine the checkpoints if the info
> > in the checkpoints does not match the bitmaps on the disks mentioned
> > in the checkpoints.
>
> That's a different feature request - you're asking for
> virDomainCheckpointCreateXML(REDEFINE) to have some additional flag
> where the metadata re-creation fails if the underlying bitmap is
> corrupt/missing. Which may still be a good flag to add, but is something
> you can do with one-at-a-time checkpoint redefinitions, and doesn't, in
> itself, require bulk redefinition.
>

Checking for an unknown bitmap on a disk is not possible unless you have
the complete list of checkpoints - unless you add something like:

    virDomainCheckpointValidate()

which would fail if the checkpoints known to libvirt do not match the
bitmaps in the underlying qcow2 images.
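
As a rough illustration, here is the kind of check such an API would do,
sketched from the client side (this assumes a qemu-img new enough to
report persistent bitmaps in its JSON output; all names are illustrative,
and this is not a proposed libvirt implementation):

    import json
    import subprocess

    def disk_bitmaps(path):
        out = subprocess.check_output(
            ["qemu-img", "info", "--output=json", path])
        data = json.loads(out).get("format-specific", {}).get("data", {})
        return {b["name"] for b in data.get("bitmaps", [])}

    def checkpoints_match(path, expected_checkpoint_ids):
        # Fail on both unknown and missing bitmaps.
        return disk_bitmaps(path) == set(expected_checkpoint_ids)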

But even if we add this, the user will have to do:

    try:
        for checkpoint in ordered_checkpoints:
            xml = generate_xml(checkpoint)
            redefine_checkpoint(xml)
    except Exception:
        remove_all_checkpoints()
        raise

    validate_checkpoints()

Unless the XML for a single checkpoint is big, I don't see why all
libvirt users need to redefine checkpoints one by one and handle all the
possible errors (e.g. some checkpoints redefined, some not).

Note that vdsm may be killed in the middle of the redefine loop. In this
case we leave libvirt with partial info about the checkpoints, and we
need to redefine the checkpoints again, handling this partial state.

"We" here means all libvirt clients that need to manage more than one
host.
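
For what it's worth, a sketch of what that retry could look like,
assuming the proposed checkpoint API gains a REDEFINE flag analogous to
the snapshot one (the helper is hypothetical):

    import libvirt

    def redefine_from_database(dom, ordered_checkpoints):
        # Re-running this loop from the start is safe if REDEFINE
        # overwrites metadata for checkpoints libvirt already knows
        # about from an earlier, interrupted run.
        for cp in ordered_checkpoints:
            xml = generate_checkpoint_xml(cp)  # hypothetical helper
            dom.checkpointCreateXML(
                xml, libvirt.VIR_DOMAIN_CHECKPOINT_CREATE_REDEFINE)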

> > To do this, libvirt must have the complete checkpoint state. It
> > cannot if we define checkpoints one by one.
>
> As long as you redefine checkpoints in topological order, I don't see
> how defining one at a time or bulk changes whether libvirt can validate
> that the checkpoints are correct.  And even then, redefining in
> topological order is only necessary so that libvirt can ensure that any
> <parent/> checkpoint exists before attempting to define the metadata for
> a child checkpoint.
>
> >
> > If libvirt cannot fail, we need a way to list all checkpoints and
> > bitmaps, so we can sync the oVirt database with reality.
> >
> > If we don't validate the bitmaps against the expected bitmaps, the
> > backup may fail when starting (good), or succeed while including
> > wrong data in the backup.
> >
> > It also does not make sense to redefine checkpoint state in an
> > incremental way; this is like building a VM XML with many devices
> > using one call per device.
>
> The argument here is that it will be the same amount of XML whether it
> is one call per checkpoint or one call for all checkpoints (note - there
> are multiple devices in one checkpoint, so it is NOT one call per
> 'device * checkpoint').
>
> >
> > The same checkpoint management will have to be duplicated in any
> > libvirt client that manages more than one host (e.g. KubeVirt,
> > OpenStack), since libvirt does not handle more than one host.
>
> But again, if you are moving checkpoint metadata from one host to
> another, doing so one-at-a-time in topological order shouldn't be much
> harder than doing so in one single call, but without the drawbacks of
> exceeding XML size over RPC call limits.
>
> >
> > For more info see
> > - https://ovirt.org/develop/release-management/features/storage/incremental-backup.html
> > - https://github.com/oVirt/vdsm/blob/master/lib/vdsm/api/vdsm-api.yml#L11525
> > - https://github.com/oVirt/vdsm/blob/0874b89a19af2d7726ed2f0b9d674cf0bb48a67e/lib/vdsm/api/vdsm-api.yml#L8129
> >
>
> >> Unless Nir speaks up with strong reasons in favor of new bulk APIs
> >> (rather than living with the new topological sort flags), I'm okay
> >> dropping this series.
> >
> >
> > I'm ok with dropping bulk snapshots, but I think libvirt should provide
> > bulk APIs for checkpoints.
> >
> > Nir
> >
>
> Given that the same amount of XML is involved across RPC, I'm
> hard-pressed to include bulk operations on one object without providing
> it on both.
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
>
>