[Linux-cluster] High availability xen cluster

Tomasz Sucharzewski tsucharz at poczta.onet.pl
Thu Feb 28 08:02:17 UTC 2008


Hi,
I guess this is not only a problem of keeping the data available in both sites. You need a true geographic cluster, which is usually quite different from a typical single-site HA cluster. That is why you are running into these kinds of problems.

>> But my problem is that I see duplicate PV IDs in this situation
Why?
In both sites you must use only the disks from the LOCAL storage array and define replication between them. While the application runs in the rza site the data is replicated to the rzb site, where the disks stay in read-only mode. In case of a disaster you only need to fail over to the rzb site, reverse the replication to rzb->rza and run the application in rzb. That is the theory - I have not implemented such a configuration myself yet. Besides that you need a script to do the failover/failback.
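A very rough sketch of what such a failover script on the rzb side could look
like (the replica promotion command is only a placeholder - with Hitachi
arrays you would typically drive this through TrueCopy/CCI, and the exact
commands depend on your setup; the VG and domU names are just examples):

  #!/bin/bash
  # failover-to-rzb.sh - promote the rzb replica and start the domUs there
  set -e

  # 1. Make the rzb copy of the LUNs read-write (placeholder script -
  #    replace with your array's takeover command, e.g. CCI horctakeover)
  /usr/local/sbin/promote-rzb-replica

  # 2. Let LVM pick up the now-writable disks and activate the volume group
  vgscan
  vgchange -ay vg_xen

  # 3. Start the virtual machines on the rzb nodes
  for domu in vm01 vm02 vm03; do
      xm create /etc/xen/${domu}.cfg
  done

Failback is the same in the other direction, plus a resync rzb->rza before
you switch back.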

A second solution is GNBD, which is fully supported.
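For the record, exporting a disk over GNBD looks roughly like this (the export
name and server hostname are made up, the device is one of the multipath
aliases from the listing below; the cluster infrastructure has to be running
on all nodes):

  # on the node that exports the storage
  gnbd_serv
  gnbd_export -d /dev/mapper/mpath0_rza -e xen_disk0

  # on the client nodes
  modprobe gnbd
  gnbd_import -i storage-node-rza

The imported device then shows up as /dev/gnbd/xen_disk0 on the clients.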

Best regards,
Tomek

On Wed, 27 Feb 2008 20:23:18 +0100
van Schelve <public at van-schelve.de> wrote:

> Hello. I don't know if this is the right list to discuss this, but maybe
> someone else with a similar scenario can describe how best to solve
> it.
> 
> I plan to build a cluster environment with 6 or 8 nodes to provide Xen-based
> virtual machines. In our environment we have two data centres, rza and rzb
> for short. In addition we have a zoned SAN infrastructure with a Hitachi
> OPEN-V fabric in each data centre. My nodes can access disks from both
> fabrics over multiple paths. Please look at the following multipath listing
> from one of my nodes:
> 
> [root at ux072 ~]# multipath -l
> mpath2_rzb (1HITACHI_R450F8D41022) dm-9 HITACHI,OPEN-V
> [size=30G][features=1 queue_if_no_path][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 1:0:0:2 sdd 8:48   [active][undef]
>  \_ 1:0:1:2 sdg 8:96   [active][undef]
>  \_ 2:0:0:2 sdp 8:240  [active][undef]
>  \_ 2:0:1:2 sds 65:32  [active][undef]
> mpath2_rza (1HITACHI_R450F94D1022) dm-12 HITACHI,OPEN-V
> [size=30G][features=1 queue_if_no_path][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 1:0:2:2 sdj 8:144  [active][undef]
>  \_ 1:0:3:2 sdm 8:192  [active][undef]
>  \_ 2:0:2:2 sdv 65:80  [active][undef]
>  \_ 2:0:3:2 sdy 65:128 [active][undef]
> mpath0_rzb (1HITACHI_R450F8D41020) dm-7 HITACHI,OPEN-V
> [size=30G][features=1 queue_if_no_path][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 1:0:0:0 sdb 8:16   [active][undef]
>  \_ 1:0:1:0 sde 8:64   [active][undef]
>  \_ 2:0:0:0 sdn 8:208  [active][undef]
>  \_ 2:0:1:0 sdq 65:0   [active][undef]
> mpath0_rza (1HITACHI_R450F94D1020) dm-10 HITACHI,OPEN-V
> [size=30G][features=1 queue_if_no_path][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 1:0:2:0 sdh 8:112  [active][undef]
>  \_ 1:0:3:0 sdk 8:160  [active][undef]
>  \_ 2:0:2:0 sdt 65:48  [active][undef]
>  \_ 2:0:3:0 sdw 65:96  [active][undef]
> mpath1_rzb (1HITACHI_R450F8D41021) dm-8 HITACHI,OPEN-V
> [size=30G][features=1 queue_if_no_path][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 1:0:0:1 sdc 8:32   [active][undef]
>  \_ 1:0:1:1 sdf 8:80   [active][undef]
>  \_ 2:0:0:1 sdo 8:224  [active][undef]
>  \_ 2:0:1:1 sdr 65:16  [active][undef]
> mpath1_rza (1HITACHI_R450F94D1021) dm-11 HITACHI,OPEN-V
> [size=30G][features=1 queue_if_no_path][hwhandler=0]
> \_ round-robin 0 [prio=0][active]
>  \_ 1:0:2:1 sdi 8:128  [active][undef]
>  \_ 1:0:3:1 sdl 8:176  [active][undef]
>  \_ 2:0:2:1 sdu 65:64  [active][undef]
>  \_ 2:0:3:1 sdx 65:112 [active][undef]
> 
> There are three disks in each data centre. Currently I only use the disks
> in rza.
> 
> At the moment I'm testing with a two-node cluster. The virtual machines are
> on SAN disks and I can live-migrate from one node to the other. But what I
> have to cover is the disaster case. What happens when the fabric in rza
> crashes? My virtual machines are unavailable. What I'm thinking about is
> hardware-based mirroring between the two fabrics, breaking up the mirror
> when a disaster happens or when we need to power off the storage for
> maintenance. But my problem is that I see duplicate PV IDs in this
> situation. I cannot mirror with LVM because it is too slow.
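(Regarding the duplicate PV IDs: one way to keep LVM from seeing the remote
copies at all is a filter in /etc/lvm/lvm.conf on the rza nodes that only
accepts the local multipath aliases - a sketch, assuming you address the
disks through the mpath*_rza names from the listing above:

  filter = [ "a|/dev/mapper/mpath.*_rza|", "r|.*|" ]

with the corresponding _rzb filter on the rzb nodes.)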
> 
> Thank you for your ideas.
> 
> -- Hans-Gerd van Schelve
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


-- 
Tomasz Sucharzewski <tsucharz at poczta.onet.pl>



