
Re: [virt-tools-list] Live migration of iscsi based targets

 On 10/27/2010 03:10 AM, Gildas Bayard wrote:

I'm using libvirt and KVM for a dozen virtual servers. Each virtual server's disk is an iscsi LUN mounted by the physical host blade that runs KVM. Everything has worked fine with that setup for about a year. Both the servers and the blades run Ubuntu Server 10.04 LTS. I had been trying live migration for a while, but it was not working, at least with my setup, on previous versions of Ubuntu (virt-manager showed the VM on the target host, but the machine became unreachable over the network).

Anyway, for some reason it's working now. But there's a big problem: let's say I use two blades (A and B) to host my VMs. If I start a VM on blade A and live migrate it to blade B, everything is fine. But if I migrate it back to blade A, awful things happen: at first it's OK, but sooner or later the VM complains about disk corruption and destroys itself more and more as time goes by. Oops!

My understanding is that blade A still has its iscsi disk cache up and running, and that when the VM comes back, blade A has no way to know that the VM's disk was altered during its stay on blade B. Hence the corruption.

Am I getting this right? Should I switch to NFS "disk in a file" instead of using iscsi?


This is how we do it here.

Prior to the live migrate:

Add permissions to the iscsi target for the New Host.
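How you grant that permission depends entirely on your target: a vendor array is configured through its own management CLI or UI. For a Linux host running the tgt stack (one of the target implementations mentioned below), it might look like this; the target ID and initiator address are placeholders:

```shell
# Example for a Linux tgt target only -- vendor arrays have their own
# ACL tools. Target ID 1 and the initiator address are placeholders.
tgtadm --lld iscsi --mode target --op bind --tid 1 --initiator-address 10.0.0.2
```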

New Host discovers and logs into the iscsi target.
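On a Linux initiator using open-iscsi, that discovery and login step might look like the following; the portal address and IQN are placeholders:

```shell
# Discover the targets exported by the array (portal IP is a placeholder)
iscsiadm -m discovery -t sendtargets -p 192.168.0.50:3260

# Log in to the target backing this VM's disk (IQN is a placeholder)
iscsiadm -m node -T iqn.2001-05.com.example:vm-disk-01 -p 192.168.0.50:3260 --login
```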

Live migrate:
   virsh -c qemu+ssh://root@<old-host>/system migrate --live <domain> qemu+ssh://root@<new-host>/system

On the new host:
   virsh dumpxml <domain> > /tmp/<domain>.xml
   virsh define /tmp/<domain>.xml

On the old host:
   virsh undefine <domain>
   iscsiadm -m node --logout -T <iscsi-target>
   iscsiadm -m node -T <iscsi-target> -o delete

Remove permissions from the iscsi target for the Old Host
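The whole sequence can be sketched as a dry-run shell function: it only prints the commands so you can review them before running anything for real. The domain name, host names, and target IQN are all placeholders supplied by the caller, and the ACL steps still have to be done in your array's own tooling.

```shell
# Dry-run sketch of the migration sequence above. It echoes the commands
# instead of executing them; all arguments are placeholders.
migrate_dryrun() {
    domain=$1; old=$2; new=$3; iqn=$4
    echo "## on $new: discover and log in to $iqn first (see above)"
    echo "virsh -c qemu+ssh://root@$old/system migrate --live $domain qemu+ssh://root@$new/system"
    echo "## on $new: persist the domain definition"
    echo "virsh dumpxml $domain > /tmp/$domain.xml"
    echo "virsh define /tmp/$domain.xml"
    echo "## on $old: clean up"
    echo "virsh undefine $domain"
    echo "iscsiadm -m node --logout -T $iqn"
    echo "iscsiadm -m node -T $iqn -o delete"
}

# Example: migrate_dryrun vm01 bladeA bladeB iqn.2001-05.com.example:vm01
```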

That seems to work reliably for us.

As for iscsi vs. NFS:

We tested this extensively and determined that a reasonably optimized iscsi-based storage array, using one iscsi target per VM, achieved vastly better I/O performance than NFS.

We tested Linux based iscsi target hosts using both TGT and iscsi-target, as well as several vendor provided iscsi storage arrays.

As a result of this testing, we settled on the Dell EqualLogic arrays. They can support 512 connections per pool, which (barely) meets our very specific requirements.

But the kicker for EQ is the dedicated 4GB cache. With up to 500 VMs working within the cache, we were able to achieve a sustained 7000 iops from 16 ordinary SATA drives in RAID 6 (14 of them data drives).

The same drives on a pair of 3Ware controllers (512MB cache per controller) maxed out at 1500 iops no matter how we configured the targets.

We are keeping a close eye on the Isilon products. We have an Isilon cluster in production for NFS, and they now have a fair-to-good iscsi implementation that's getting better with each release; version 6 may bring it on par with (or ahead of) anyone else. Isilon's ability to grow nearly forever with no hassle is a major selling point: dropping a new unit into the cluster and having those new resources available within 60 seconds is a very long way from the 36-hour wait just to VERIFY a new EQ array!

Good Luck!
