Re: [libvirt] [PATCH 3/6] Introduce yet another migration version in API.



On 04/20/2011 11:38 PM, Christian Benvenuti (benve) wrote:
On 04/20/2011 05:28 PM, Christian Benvenuti (benve) wrote:
Daniel,
    I looked at the patch set you sent out on 2/9/11

    [libvirt] [PATCH 0/6] Introduce a new migration protocol
                          to QEMU driver
    http://www.mail-archive.com/libvir-list@redhat.com/msg33223.html

What is the status of this new migration protocol?
Is there any pending issue blocking its integration?

I would like to propose an RFC enhancement to the migration
algorithm.

Here is a quick summary of the proposal/idea.

- finer control on migration result

    - the ability to specify which features must not fail
      their initialization on the dst host during migration.
      Migration should not succeed if any of them fails.
      - optional: each of those features should be able to
                  provide a deinit function to clean up resources
                  on the dst host if migration fails.

This functionality would be useful for the (NIC) set port
profile feature, VDP (802.1Qbg/1Qbh), but what I propose is
a generic config option / API that can be used by any feature.
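
To make the idea concrete, here is a minimal sketch of such a generic
hook table. Every name in it (MigFeature, mig_features_init, the demo_*
hooks) is hypothetical and does not exist in libvirt; it only shows the
"required features abort the migration and get cleaned up" semantics.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-feature hooks; none of these names exist in libvirt. */
typedef struct {
    const char *name;
    bool required;                 /* true: migration must fail if init fails */
    int (*init)(void *opaque);     /* returns 0 on success, -1 on failure */
    void (*deinit)(void *opaque);  /* optional dst-side cleanup, may be NULL */
} MigFeature;

/*
 * Initialize the features in order on the dst host. If a required feature
 * fails, run deinit (in reverse order) for every feature that was already
 * initialized and return -1. Failures of non-required features are ignored.
 * At most 64 features are supported because a bitmask tracks successes.
 */
int mig_features_init(const MigFeature *f, size_t n, void *opaque)
{
    uint64_t ok = 0;

    if (n > 64)
        return -1;

    for (size_t i = 0; i < n; i++) {
        if (f[i].init(opaque) == 0) {
            ok |= (uint64_t)1 << i;
            continue;
        }
        if (!f[i].required)
            continue;
        /* A required feature failed: roll back what was set up so far. */
        for (size_t j = i; j-- > 0; ) {
            if (((ok >> j) & 1) && f[j].deinit)
                f[j].deinit(opaque);
        }
        return -1;
    }
    return 0;
}

/* Demo hooks, for illustration only. */
int demo_init_ok(void *opaque)   { (void)opaque; return 0; }
int demo_init_fail(void *opaque) { (void)opaque; return -1; }
void demo_deinit(void *opaque)   { ++*(int *)opaque; } /* counts cleanups */
```

With a table holding one feature that initializes fine followed by a
required feature that fails, mig_features_init returns -1 and the first
feature's deinit runs once. If the failing feature is marked as not
required, the call returns 0 and nothing is cleaned up.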

And now the details.

----------------------------------------------
enhancement: finer control on migration result
----------------------------------------------

There are different reasons why a VM may need (or be forced) to
migrate.
Migrations can also be classified according to different
semantics.
For simplicity I'll classify them into two categories, based on
how important it is for the VM to migrate as fast as possible:

(1) It IS important

     In this case it matters less that the VM may (temporarily)
     be unable to use certain resources (for example the network)
     on the dst host, because completing the migration has higher
     priority.
     A possible scenario could be a server that must migrate ASAP
     because of a disaster/emergency.

(2) It IS NOT important

     I can think of a VM whose applications/servers need a network
     connection in order to work properly. Losing such network
     connectivity as a consequence of a migration would not be
     acceptable (or would be highly undesirable).

Given case (2) above, I have a comment about the Finish
step, with regard to the port profile (VDP) codepath.

The call to

      qemuMigrationVPAssociatePortProfile

in
      qemuMigrationFinish

can fail, but its result (success or failure) does not influence
the result of the migration Finish step (it was already like this
in migration V2).
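
A rough sketch of what I mean, with made-up names: the Finish step would
look at the association result and at which of the two categories above
the migration belongs to. MigPriority, FinishAction and
finish_step_decide are hypothetical, and assoc_rc only stands in for the
outcome of the port profile association (0 on success, negative on
failure).

```c
/* Hypothetical classification, matching categories (1) and (2) above. */
typedef enum {
    MIG_PRIO_SPEED,         /* (1) finishing the migration matters most */
    MIG_PRIO_CONNECTIVITY   /* (2) the guest must keep its network */
} MigPriority;

/* What the Finish step would report back. */
typedef enum {
    FINISH_RESUME,           /* everything worked, start the guest CPUs */
    FINISH_RESUME_DEGRADED,  /* association failed but is tolerated */
    FINISH_ABORT             /* association failed, migration must fail */
} FinishAction;

/*
 * assoc_rc is a stand-in for the result of the port profile association
 * on the dst host: 0 on success, negative on failure.
 */
FinishAction finish_step_decide(MigPriority prio, int assoc_rc)
{
    if (assoc_rc == 0)
        return FINISH_RESUME;
    if (prio == MIG_PRIO_CONNECTIVITY)
        return FINISH_ABORT;
    return FINISH_RESUME_DEGRADED;
}
```

Today's behaviour corresponds to always treating the migration as
category (1).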
I *believe* the underlying problem is Qemu's switch-over. Once Qemu
decides that the migration was successful, Qemu on the source side
dies and continues running on the destination side. I don't think
there are any further handshakes with higher layers foreseen that
would allow this to be reversed or the switch-over to be delayed,
but correct me if I am wrong...
Actually I think this is not what happens in migration V3.
My understanding is this:

- the qemu cmdline built by Libvirt on the dst host during Prepare3
   includes the "-S" option (ie no autostart)

- the VM on the dst host does not start running until libvirt
   calls qemuProcessStartCPUs in the Finish3 step.
   This fn simply sends the "cont" cmd to the monitor to
   start the VM/CPUs.
That's correct, but it's doing this already in v2. The non-autostart (-S) corresponds to Qemu's autostart here (migration.c):

void process_incoming_migration(QEMUFile *f)
{
    if (qemu_loadvm_state(f) < 0) {
        fprintf(stderr, "load of migration failed\n");
        exit(0);
    }
    qemu_announce_self();
    DPRINTF("successfully loaded vm state\n");

    incoming_expected = false;

    if (autostart)
        vm_start();
}

With -S, autostart is false, so the function simply doesn't start the VM. After this function has run, all sockets are closed and the communication with the source host is cut, so I don't think a fall-back is possible at this point.
Rather, we may need a 'wait' option for migration: before the

    qemu_put_byte(f, QEMU_VM_EOF);

in qemu_savevm_state_complete(), the source would sync with the monitor and wait for something like migrate_finish or migrate_cancel.
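
As a toy model of that source-side handshake (this is not Qemu code, and all names below are made up): the 'wait' option and a migrate_finish command are only the suggestion above, and the eof_sent flag stands in for the qemu_put_byte(f, QEMU_VM_EOF) call.

```c
#include <stdbool.h>

/* Monitor commands the source would wait for; migrate_finish is hypothetical. */
typedef enum { CMD_NONE, CMD_MIGRATE_FINISH, CMD_MIGRATE_CANCEL } MonitorCmd;

typedef enum { SRC_WAITING, SRC_COMPLETED, SRC_CANCELLED } SrcState;

typedef struct {
    bool wait;      /* the proposed 'wait' option */
    bool eof_sent;  /* stands for qemu_put_byte(f, QEMU_VM_EOF) */
    SrcState state;
} SrcMigration;

/* Called where the source would normally write the end-of-stream marker. */
void src_begin_complete(SrcMigration *m)
{
    m->eof_sent = false;
    if (m->wait) {
        m->state = SRC_WAITING;     /* hold back EOF until told what to do */
    } else {
        m->eof_sent = true;         /* today's behaviour: finish right away */
        m->state = SRC_COMPLETED;
    }
}

/* Called when a monitor command arrives while the source is waiting. */
void src_monitor_cmd(SrcMigration *m, MonitorCmd cmd)
{
    if (m->state != SRC_WAITING)
        return;                     /* too late, or not waiting at all */
    if (cmd == CMD_MIGRATE_FINISH) {
        m->eof_sent = true;
        m->state = SRC_COMPLETED;
    } else if (cmd == CMD_MIGRATE_CANCEL) {
        m->state = SRC_CANCELLED;   /* EOF never sent, source keeps the VM */
    }
}
```

The point is only that EOF is not written until the management layer has confirmed that the destination is usable, so a cancel at that stage leaves the guest on the source.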

Regards,
   Stefan
