[libvirt] [PATCH v3 2/5] vz: add migration backbone code

Nikolay Shirokovskiy nshirokovskiy at parallels.com
Mon Aug 31 08:40:55 UTC 2015



On 28.08.2015 19:37, Daniel P. Berrange wrote:
> On Fri, Aug 28, 2015 at 12:18:30PM +0300, Nikolay Shirokovskiy wrote:
>>
>>
>> On 27.08.2015 13:34, Daniel P. Berrange wrote:
>>> On Tue, Aug 25, 2015 at 12:04:14PM +0300, nshirokovskiy at virtuozzo.com wrote:
>>>> From: Nikolay Shirokovskiy <nshirokovskiy at virtuozzo.com>
>>>>
>>>> This patch makes basic vz migration possible, for example via virsh:
>>>>   virsh -c vz:///system migrate --direct $NAME $STUB vz+ssh://$DST/system
>>>>
>>>> $STUB can be anything: it is a required virsh argument, but it is not
>>>> used in direct migration.
>>>>
>>>> Vz migration is implemented as direct migration because the vz SDK does
>>>> all the work. The prepare phase function is used to pass the session uuid
>>>> from destination to source so that we don't introduce a new rpc call.
>>>
>>> Looking more closely at migration again, the scenario you have is pretty
>>> much identical to the Xen scenario, in that the hypervisor actually
>>> manages the migration, but you still need a connection to dest libvirtd
>>> to fetch some initialization data.
>>>
>>> You have claimed you are implementing, what we describe as "direct, unmanaged"
>>> migration on this page:
>>>
>>>   http://libvirt.org/migration.html
>>>
>>> But based on the fact that you need to talk to dest libvirtd, you should
>>> in fact implement 'direct, managed' migration - this name is slightly
>>> misleading as the VZ SDK is still actually managing it.
>>>
>>> Since you don't need to have the begin/confirm phases, you also don't
>>> need to implement the V3 migration protocol - it is sufficient to just
>>> use V1.
>>>
>>> This doesn't need many changes in your patch fortunately.
>>>
>>>
>>
>> I've been looking at the common migration code for quite a long time, and
>> I think that using the direct managed scheme for vz migration could lead to
>> problems. Let me share my concerns.
>>
>> 1. The version 1 migration protocol differs from version 3 not only in the
>> number of stages. Version 3 supports extended parameters like
>> VIR_MIGRATE_PARAM_GRAPHICS_URI, which are meaningful for vz migration too.
>> Thus in the future we could move to implementing version 3 as well.
> 
> Ah, that is indeed true. From that POV it certainly makes sense to want
> to start with V3 straight away.
> 
>> 2. The direct managed stages do not simply mean "do something on the
>> source, then on the destination, and so on". They are interconnected, and
>> this interconnection is encoded in the migration algorithm. For version 3
>> (virDomainMigrateVersion3Full) this is more noticeable: if the finish3
>> phase fails, we cancel the migration in the confirm3 phase. See, we treat
>> these phases specifically: in perform3 we assume we are moving data, in
>> finish3 we assume we are starting the domain on the destination, and in
>> confirm3 we assume we are stopping the domain on the source. That is how
>> qemu migration works, and that is how we think of the phases when we
>> implement the direct managed algorithm. So the phases carry contracts. If
>> we implement vz migration through this scheme, we cannot keep these
>> contracts, as the perform3 phase not only moves data but also kills the
>> source domain and starts the destination one. The worst the user could get
>> is erroneous warnings in the logs, or a report of overall migration failure
>> despite actual migration success, in case of side-effect failures like rpc
>> errors or OOM. Worse still, one must keep in mind that the phase
>> implementation contracts are vague.
> 
> It isn't the end of the world if the Perform3 stage kills the source domain.
> That is the same behaviour as with Xen. Particularly since the VZ SDK itself
> does the switchover, there's no functional downside to letting Perform3
> kill the source.
>

I cannot quite agree with you. Yes, luckily, vz migration could be
implemented via the existing 5-phase interface and the existing managing
virDomainMigrateVersion3Full algorithm, but this is fragile. Since the
phases mean different things for qemu and for vz, any future change to
virDomainMigrateVersion3Full could break vz migration, because such a
change would be made with the qemu meaning of the phases in mind.
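
To make the phase contracts I mean concrete, here is a toy model of the
v3 managed control flow (plain Python, not libvirt code; all the names
are illustrative stand-ins for the real driver callbacks):

```python
# Toy model of the virDomainMigrateVersion3Full control flow, to make
# the per-phase contracts explicit.  Not libvirt code.

class Source:
    def __init__(self):
        self.running = True
        self.data_sent = False

    def perform(self, cookie):
        # Contract: only move the domain's data to the destination.
        self.data_sent = True
        return True

    def confirm(self, cancelled):
        # Contract: stop the source domain on success, keep it
        # running on cancellation.
        self.running = cancelled


class Destination:
    def __init__(self):
        self.running = False

    def prepare(self):
        # Contract: set up to receive and hand back a cookie (for vz
        # this would carry the session uuid).
        return "session-cookie"

    def finish(self, cancelled):
        # Contract: start the domain on the destination.
        if not cancelled:
            self.running = True
        return self.running


def migrate_v3(src, dst):
    cookie = dst.prepare()
    ok = src.perform(cookie)
    ok = dst.finish(cancelled=not ok) and ok
    src.confirm(cancelled=not ok)
    return ok


src, dst = Source(), Destination()
assert migrate_v3(src, dst)
assert dst.running and not src.running
```

A vz-style perform that also kills the source and starts the destination
would silently bypass the confirm/finish bookkeeping this model relies on,
which is exactly the fragility described above.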

>> So, as the version 1 scheme is quite simple and its phase contracts are
>> looser than version 3's, we could go that way, but I see potential
>> problems (at least for the developer). Thus I suggest keeping the phase
>> contracts of all versions of direct managed migration clear, and hiding
>> all the differences by implementing the p2p or direct scheme.
>>
>> The question arises: how do these two differ? The documentation states
>> that p2p is when the libvirt daemon manages the migration, and direct is
>> when all the management is done by the hypervisor. As vz migration needs
>> some help from the destination daemon, it looks like a candidate for p2p.
>> But as this help amounts to little more than helping to authenticate, I
>> suggest thinking of it as direct. From the implementation point of view
>> there is no difference; from the user's point of view the difference is
>> only in flags. Another argument is that for qemu, p2p goes through the
>> same steps as direct managed; most of the difference is that the
>> management moves from the client to the daemon. That is, p2p and direct
>> managed are coupled in a sense: if p2p exists, then direct managed should
>> be possible too, and that is not the case for vz migration.
> 
> The p2p migration mode is only different from the default mode,
> in that instead of the client app talking to the dest libvirtd,
> it is the source libvirtd talking.
> 
> With the VZ driver though, the driver runs directly in the client
> app, not libvirtd. As such, there is no benefit to implementing
> p2p mode in VZ - it will just end up working in the exact same
> way as the default mode in terms of what part communicates with
> the dest.

Again, I cannot quite agree. The vz driver could run inside libvirtd too,
and in that case someone might want all the management to be done by
libvirtd. Thus the user expects either direct or p2p migration to exist.

Another reason is that it would be simpler to support vz migration in
openstack nova. Nova uses toURI2 to migrate, and I would rather run the
vz driver inside libvirtd and use p2p or direct migration than introduce
a branch for vz that uses the MigrateN API with a client-side driver.
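
As a reference point for that coupling, the scheme selection in the
toURI-style entry points can be sketched as follows (a simplified Python
model, not the real libvirt code; the flag value matches
VIR_MIGRATE_PEER2PEER, but the dispatch is abbreviated):

```python
# Toy model of how a virDomainMigrateToURI-style call picks a migration
# scheme from the flags.  Only the PEER2PEER flag value matches libvirt;
# the rest is an illustrative simplification.

VIR_MIGRATE_PEER2PEER = 1 << 1   # matches the real libvirt value (2)

def pick_scheme_to_uri(flags):
    if flags & VIR_MIGRATE_PEER2PEER:
        # Source libvirtd drives the phases and talks to dest libvirtd.
        return "p2p"
    # Otherwise the hypervisor is expected to manage the migration
    # itself ("direct"); the dest libvirtd connection URI is not
    # consulted, which is why virsh callers end up passing a dummy
    # value for it.
    return "direct"

assert pick_scheme_to_uri(VIR_MIGRATE_PEER2PEER) == "p2p"
assert pick_scheme_to_uri(0) == "direct"
```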

> 
> As you do need to talk to dest libvirtd, IMHO, this rules out
> use of the direct mode, as that is intended for the case where
> you don't ever use libvirtd on the target. This is why you ended
> up having the weird situation with passing a dummy URI to virsh,
> and then passing a libvirt URI as the second parameter. This leads
> to a rather confusing setup for apps IMHO.

Actually, the dummy URI is not caused by any improper use on my side.
Anyone who wants to use the existing direct migration ends up passing
URIs in this manner. The cause is that 'dconnuri' is a required
parameter for virsh on the one hand, but is ignored in direct
migrations on the other.

> 
> 
> So, overall I think you should still do what I suggested, but instead
> of implementing v1, stay with v3.  This will mean you have to provide
> the full set of Begin, Prepare, Perform, Finish, Confirm callbacks,
> but you can just have Begin,Finish,Confirm be a no-op for your initial
> impl.
>
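
If we do end up going that route, the shape of a minimal v3
implementation with no-op phases might look roughly like this (an
illustrative Python pseudo-driver; the names echo the v3 phases, but
none of this is the actual vz driver code):

```python
# Rough sketch of a v3 driver where Begin/Confirm are no-ops, Finish is
# a near no-op, Prepare passes back the session uuid, and Perform lets
# the VZ SDK do the whole job.  Illustrative only.

class VzMigrationDriver:
    def begin3(self, dom):
        return None                      # no-op: nothing to negotiate

    def prepare3(self, cookie_in):
        # Dest hands back its session uuid (placeholder value here).
        return {"session_uuid": "stub-session-uuid"}

    def perform3(self, dom, cookie):
        # The VZ SDK migrates the domain itself, including the
        # switchover, using the session uuid from prepare3.
        dom["migrated_to"] = cookie["session_uuid"]
        return True

    def finish3(self, cancelled):
        return not cancelled             # near no-op: report status

    def confirm3(self, cancelled):
        pass                             # no-op: source already gone
```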


> Regards,
> Daniel
> 



