[libvirt RFC 20/24] qemu_snapshot: prepare data for external snapshot deletion

Peter Krempa pkrempa at redhat.com
Tue Sep 6 10:19:38 UTC 2022


On Tue, Aug 23, 2022 at 18:32:23 +0200, Pavel Hrdina wrote:
> In order to save some CPU cycles we will collect all the necessary data
> to delete external snapshot before we even start. They will be later
> used by code that deletes the snapshots and updates metadata when
> needed.
> 
> With external snapshots we need data that libvirt gets from running QEMU
> process so if the VM is not running we need to start paused QEMU process
> for the snapshot deletion and kill at afterwards.
> 
> Signed-off-by: Pavel Hrdina <phrdina at redhat.com>
> ---
>  src/qemu/qemu_snapshot.c | 144 ++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 142 insertions(+), 2 deletions(-)

First a few general notes for all upcoming patches.

1) All of the non-trivial functions dealing with snapshots should have
documentation explaing what they do. The snapshot code, as you certainly
know, is very complex and in many cases un-obvious. Having them will
save us (or anybody else wanting to maintain the code) some headaches in
the future.

2) I'd also suggest that you add a knowledge base internals article
outlining how snapshot deletion is supposed to work. In case there are
some user-visible caveats, we also have an user-focused article on
snapshots ( kbase/snapshots.rst ).

3) I'll try to assess the code also from the point of view of changes
needed to implement reversion of snapshots and the intricacies that
might bring for deletion.

> diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
> index da9b4c30f0..dade5dcea4 100644
> --- a/src/qemu/qemu_snapshot.c
> +++ b/src/qemu/qemu_snapshot.c
> @@ -2232,6 +2232,120 @@ qemuSnapshotRevert(virDomainObj *vm,
>  }
>  
>  
> +typedef struct {
> +    virDomainMomentObj *parentSnap;
> +    virDomainSnapshotDiskDef *snapDisk;
> +    virDomainDiskDef *domDisk;
> +    virDomainDiskDef *parentDomDisk;
> +    virStorageSource *diskSrc;
> +    virStorageSource *parentDiskSrc;
> +    virStorageSource *prevDiskSrc;
> +    qemuBlockJobData *job;
> +} qemuSnapshotDeleteExternalData;
> +
> +
> +static virDomainMomentObj*
> +qemuSnapshotFindParentSnapForDisk(virDomainMomentObj *snap,
> +                                  virDomainSnapshotDiskDef *snapDisk)
> +{
> +    virDomainMomentObj *parentSnap = snap->parent;
> +
> +    while (parentSnap) {
> +        ssize_t i;
> +        virDomainSnapshotDef *parentSnapdef = virDomainSnapshotObjGetDef(parentSnap);
> +
> +        if (!parentSnapdef)
> +            break;
> +
> +        for (i = 0; i < parentSnapdef->ndisks; i++) {
> +            virDomainSnapshotDiskDef *parentSnapDisk = &(parentSnapdef->disks[i]);
> +
> +            if (parentSnapDisk->snapshot != VIR_DOMAIN_SNAPSHOT_LOCATION_NO &&
> +                STREQ(snapDisk->name, parentSnapDisk->name)) {
> +                return parentSnap;
> +            }
> +        }
> +
> +        parentSnap = parentSnap->parent;
> +    }
> +
> +    return NULL;
> +}
> +
> +
> +static GPtrArray*
> +qemuSnapshotDeleteExternalPrepare(virDomainObj *vm,
> +                                  virDomainMomentObj *snap)
> +{
> +    ssize_t i;
> +    virDomainSnapshotDef *snapdef = virDomainSnapshotObjGetDef(snap);
> +    g_autoptr(GPtrArray) ret = g_ptr_array_new_full(0, g_free);

Is there any specific requirement to access these randomly?
(specifically why GPtrArray is used instead of e.g. a G(S)List?

> +    for (i = 0; i < snapdef->ndisks; i++) {
> +        g_autofree qemuSnapshotDeleteExternalData *data = NULL;
> +        virDomainSnapshotDiskDef *snapDisk = &(snapdef->disks[i]);
> +
> +        if (snapDisk->snapshot == VIR_DOMAIN_SNAPSHOT_LOCATION_NO)
> +            continue;

[...]

>  typedef struct _virQEMUMomentReparent virQEMUMomentReparent;
>  struct _virQEMUMomentReparent {
>      const char *dir;
> @@ -2504,9 +2618,9 @@ qemuSnapshotDeleteValidate(virDomainMomentObj *snap,
>      }
>  
>      if (virDomainSnapshotIsExternal(snap) &&
> -        !(flags & VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY)) {
> +        (flags & VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN)) {
>          virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> -                       _("deletion of external disk snapshots not supported"));
> +                       _("deletion of external disk snapshot with internal children disk snapshots not supported"));

This seems a bit unrelated to this patch.

>          return -1;
>      }
>  
> @@ -2523,6 +2637,8 @@ qemuSnapshotDelete(virDomainObj *vm,
>      int ret = -1;
>      virDomainMomentObj *snap = NULL;
>      bool metadata_only = !!(flags & VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY);
> +    bool stop_qemu = false;
> +    g_autoptr(GPtrArray) externalData = NULL;
>  
>      virCheckFlags(VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN |
>                    VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY |
> @@ -2537,6 +2653,25 @@ qemuSnapshotDelete(virDomainObj *vm,
>      if (!metadata_only) {
>          if (qemuSnapshotDeleteValidate(snap, flags) < 0)
>              goto endjob;
> +
> +        if (virDomainSnapshotIsExternal(snap) &&
> +            !(flags & (VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN |
> +                       VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY))) {
> +            if (!virDomainObjIsActive(vm)) {
> +                if (qemuProcessStart(NULL, driver, vm, NULL, VIR_ASYNC_JOB_NONE,
> +                                     NULL, -1, NULL, NULL,
> +                                     VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
> +                                     VIR_QEMU_PROCESS_START_PAUSED) < 0) {

Note that a user might attempt to call a 'virsh resume' here. This might
be a problem especially if you went the route of asynchronously handling
the deletion.

> +                    goto endjob;
> +                }
> +
> +                stop_qemu = true;
> +            }
> +
> +            externalData = qemuSnapshotDeleteExternalPrepare(vm, snap);
> +            if (!externalData)
> +                goto endjob;

Preferrably this operation will be done before starting qemu to do the
actual deletion. I understand that it has to be done after, to have
correct pointers after the copy of the definition needed for the
startup.

In this instance it probably is still better to do the checks first,
throw them away if qemu startup is needed and re-do them after, as the
startup of qemu is a waaay more expensive operation.

> +        }
>      }
>  
>      if (flags & (VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN |
> @@ -2547,6 +2682,11 @@ qemuSnapshotDelete(virDomainObj *vm,
>      }
>  
>   endjob:
> +    if (stop_qemu) {
> +        qemuProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_SHUTDOWN,
> +                        VIR_ASYNC_JOB_NONE, 0);

So definitely user interaction of the VM needs to be forbidden during
this API because it would be bad if the VM gets terminated here.

> +    }
> +
>      qemuDomainObjEndJob(vm);
>  
>      return ret;
> -- 
> 2.37.2
> 


More information about the libvir-list mailing list