[libvirt] [PATCH 04/10] qemu: Recover from interrupted migration

Eric Blake eblake at redhat.com
Fri Jul 22 21:06:01 UTC 2011


On 07/18/2011 06:27 PM, Jiri Denemark wrote:
> ---
>   src/qemu/qemu_process.c |  110 ++++++++++++++++++++++++++++++++++++++++++++++-
>   1 files changed, 109 insertions(+), 1 deletions(-)

>
>   static int
> +qemuProcessRecoverMigration(struct qemud_driver *driver,
> +                            virDomainObjPtr vm,
> +                            virConnectPtr conn,
> +                            enum qemuDomainAsyncJob job,
> +                            enum qemuMigrationJobPhase phase,
> +                            virDomainState state,
> +                            int reason)
> +{
> +    if (job == QEMU_ASYNC_JOB_MIGRATION_IN) {
> +        switch (phase) {
> +        case QEMU_MIGRATION_PHASE_NONE:
> +        case QEMU_MIGRATION_PHASE_PERFORM2:
> +        case QEMU_MIGRATION_PHASE_BEGIN3:

Should we reject as impossible the phases that should never be 
encountered on MIGRATION_IN?  For example, QEMU_MIGRATION_PHASE_BEGIN3 
belongs to MIGRATION_OUT, so if our job is MIGRATION_IN but we see that 
phase, we should probably fail rather than return 0.
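
Roughly what I have in mind, as an untested sketch. The enum below is a local stand-in rather than the real qemuMigrationJobPhase, and treating PERFORM2/PERFORM3 as outgoing-only is my assumption based on the second switch in this hunk:

```c
#include <stdio.h>

/* Local stand-in for enum qemuMigrationJobPhase (hypothetical names,
 * only the phases mentioned in this hunk). */
enum phase {
    PHASE_NONE,
    PHASE_PERFORM2,
    PHASE_BEGIN3,
    PHASE_PERFORM3
};

/* Returns 0 if the phase is plausible for an incoming migration job,
 * -1 if it can only belong to an outgoing one. */
static int check_incoming_phase(enum phase p)
{
    switch (p) {
    case PHASE_NONE:
        /* nothing was started yet, nothing to recover */
        return 0;

    case PHASE_PERFORM2:
    case PHASE_BEGIN3:
    case PHASE_PERFORM3:
        /* assumed outgoing-only: report instead of silently returning 0 */
        fprintf(stderr, "unexpected phase %d for incoming migration\n",
                (int) p);
        return -1;
    }

    /* unknown value: treat as an error as well */
    return -1;
}
```

The point is only that the impossible combinations take an error path; which phases really belong to which direction should come from the enum definition, not from this sketch.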

> +        case QEMU_MIGRATION_PHASE_PERFORM2:
> +        case QEMU_MIGRATION_PHASE_PERFORM3:
> +            /* migration is still in progress, let's cancel it and resume the
> +             * domain */
> +            VIR_DEBUG("Canceling unfinished outgoing migration of domain %s",
> +                      vm->def->name);
> +            /* TODO cancel possibly running migrate operation */

As in, issue qemuMonitorMigrateCancel but ignore the result if it fails?
Might be reasonable, but probably as a separate patch.

> +            /* resume the domain but only if it was paused as a result of
> +             * migration */
> +            if (state == VIR_DOMAIN_PAUSED &&
> +                (reason == VIR_DOMAIN_PAUSED_MIGRATION ||
> +                 reason == VIR_DOMAIN_PAUSED_UNKNOWN)) {
> +                if (qemuProcessStartCPUs(driver, vm, conn,
> +                                         VIR_DOMAIN_RUNNING_UNPAUSED) < 0) {

On the other hand, will the monitor command to restart CPUs even work
while a migration is still in progress?  If not, we may have to call
qemuMonitorMigrateCancel unconditionally, to ensure the monitor will let
us resume.
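
The ordering I am thinking of, as an untested sketch. The callbacks are hypothetical stand-ins for qemuMonitorMigrateCancel and qemuProcessStartCPUs (the real functions take driver, vm, and monitor arguments that are left out here), and whether qemu actually refuses to resume without the cancel is exactly the open question:

```c
#include <stdio.h>

/* Hypothetical stand-in signature for the cancel and resume steps. */
typedef int (*step_fn)(void);

/* Example stubs so the ordering can be exercised without a monitor. */
static int stub_ok(void)   { return 0; }
static int stub_fail(void) { return -1; }

/* Cancel first, ignoring failure, then resume only if the domain was
 * paused because of migration.  Returns -1 only if the resume fails. */
static int recover_outgoing(step_fn cancel, step_fn start_cpus,
                            int paused_by_migration)
{
    /* always attempt the cancel; a failure is logged but not fatal */
    if (cancel() < 0)
        fprintf(stderr, "warning: migrate cancel failed, continuing\n");

    /* leave domains alone that were paused for some other reason */
    if (!paused_by_migration)
        return 0;

    return start_cpus() < 0 ? -1 : 0;
}
```

With the stubs, recover_outgoing(stub_fail, stub_ok, 1) still succeeds, which is the "ignore if it fails" behavior mentioned above.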

I think what you have works (strict improvement over what we have now), 
even if it can be further improved with later patches, so:

ACK.

-- 
Eric Blake   eblake at redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org



