[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH v2.1 10/11] qemu_migration: Check size prerequisites



On Fri, Jan 11, 2013 at 17:52:22 +0100, Michal Privoznik wrote:
> With new NBD storage migration approach there are several
> requirements that need to be meet for successful use of
> the feature. One of them is - the file representing a disk,
> needs to have at least same size as on the source. Hence,
> we must transfer a list of pairs [disk source, size] and
> check on destination that this requirement is met and/or
> take actions to meet it.

The commit message does not correspond to what the code is doing. You're
not sending (disk source, size) pairs.

> ---
>  src/qemu/qemu_migration.c | 164 +++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 161 insertions(+), 3 deletions(-)
> 
> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> index 1f9a880..86eb4c8 100644
> --- a/src/qemu/qemu_migration.c
> +++ b/src/qemu/qemu_migration.c
> @@ -29,6 +29,9 @@
>  #endif
>  #include <fcntl.h>
>  #include <poll.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
>  
>  #include "qemu_migration.h"
>  #include "qemu_monitor.h"
> @@ -137,6 +140,18 @@ struct _qemuMigrationCookieNBD {
>                   Negative one is meant to be sent when translating from
>                   perform to finish phase to let destination know it's
>                   safe to stop NBD server.*/
> +
> +    /* The list of pairs [disk-size] (in Bytes).  This is needed

What a strange pair which contains just disk size :-P

> +     * because the same disk size is one of prerequisites for NBD
> +     * storage migration. Unfortunately, we can't rely on
> +     * anything but disk order, since 'src' can be overwritten by
> +     * migration hook script, device aliases are not assigned on
> +     * dst yet (as the source files need to be created before
> +     * qemuProcessStart). */

Hmm, can't we use disk targets (hda, vdb, ...)?

> +    size_t ndisks;
> +    struct {
> +        size_t bytes;
> +    } *disk;
>  };
>  
>  typedef struct _qemuMigrationCookie qemuMigrationCookie;
...
> @@ -534,6 +565,33 @@ qemuMigrationCookieAddNBD(qemuMigrationCookiePtr mig,
>          return -1;
>      }
>  
> +    /* in Begin phase add info about disks */
> +    if (priv->job.phase == QEMU_MIGRATION_PHASE_BEGIN3 &&

I think this check is better replaced with nbdPort == 0.

> +        vm->def->ndisks) {
> +        if (VIR_ALLOC_N(mig->nbd->disk, vm->def->ndisks) < 0) {
> +            virReportOOMError();
> +            return -1;
> +        }

Yay, this is what I suggested you to do in
qemuMigration{Start,Stop}NBDServer :-)

> +
> +        for (i = 0; i < vm->def->ndisks; i++) {
> +            virDomainDiskDefPtr disk = vm->def->disks[i];
> +            struct stat sb;
> +
> +            /* Add only non-shared disks with source */
> +            if (!disk->src || disk->shared)
> +                continue;
> +
> +            if (stat(disk->src, &sb) < 0) {
> +                virReportSystemError(errno,
> +                                     _("Unable to stat '%s'"),
> +                                     disk->src);
> +                return -1;
> +            }
> +
> +            mig->nbd->disk[mig->nbd->ndisks++].bytes = sb.st_size;

Unfortunately, this is wrong. As confirmed with Paolo on IRC today, the
disk image on destination needs to be fully allocated. That is, we need
to transfer the logical size of the image rather than its physical size.
For a 10GB QCOW image that occupies 1GB of disk space, we need to create
10GB file (+ the size of QCOW metadata) on the destination.

> +        }
> +    }
> +
>      mig->nbd->port = nbdPort;
>      mig->flags |= QEMU_MIGRATION_COOKIE_NBD;
>  
...
> @@ -1347,6 +1439,68 @@ error:
>      goto cleanup;
>  }
>  
> +static int
> +qemuMigrationPreCreateStorage(virDomainObjPtr vm,
> +                              qemuMigrationCookiePtr mig)
> +{
> +    int ret = -1;
> +    size_t i, mig_i = 0;
> +    struct stat sb;
> +    int fd = -1;
> +
> +    if (!mig->nbd || !mig->nbd->ndisks) {
> +        /* nothing to do here */
> +        return 0;
> +    }
> +
> +    for (i = 0; i < vm->def->ndisks; i++) {
> +        virDomainDiskDefPtr disk = vm->def->disks[i];
> +        size_t bytes = mig->nbd->disk[mig_i].bytes;
> +
> +        /* skip shared and source-free disks */
> +        if (!disk->src || disk->shared)
> +            continue;
> +
> +        mig_i++;
> +        if (mig_i > mig->nbd->ndisks) {
> +            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                           _("disk count doesn't match"));
> +            goto cleanup;
> +        }
> +
> +        VIR_DEBUG("Checking '%s' for its size (requested %zuB)", disk->src, bytes);
> +
> +        if ((fd = virFileOpenAs(disk->src, O_RDWR | O_CREAT, 0660,
> +                                -1, -1, VIR_FILE_OPEN_NOFORK)) < 0) {
> +            virReportSystemError(errno, _("Unable to create '%s'"), disk->src);
> +            goto cleanup;
> +        }
> +
> +        if (fstat(fd, &sb) < 0) {
> +            virReportSystemError(errno, _("Unable to stat '%s'"), disk->src);
> +            goto cleanup;
> +        }
> +
> +        VIR_DEBUG("File '%s' is %zuB big", disk->src, sb.st_size);
> +        if (sb.st_size < bytes &&
> +            ftruncate(fd, bytes) < 0) {

posix_fallocate would be a better choice; or just use existing functions
used in storage driver.

> +            virReportSystemError(errno, _("Unable to ftruncate '%s'"), disk->src);
> +            goto cleanup;
> +        }
> +
> +        VIR_FORCE_CLOSE(fd);
> +    }
> +
> +    ret = 0;
> +cleanup:
> +    VIR_FORCE_CLOSE(fd);
> +    /* free from migration data to prevent
> +     * infinite sending from src to dst and back */
> +    VIR_FREE(mig->nbd->disk);
> +    mig->nbd->ndisks = 0;
> +    return ret;
> +}
> +
>  /* Validate whether the domain is safe to migrate.  If vm is NULL,
>   * then this is being run in the v2 Prepare stage on the destination
>   * (where we only have the target xml); if vm is provided, then this
...

Jirka


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]