[Linux-cluster] Cluster Setup Questions
iana
iana at atlas.ua
Wed May 17 11:19:57 UTC 2006
Somehow, my two previous messages didn't show up on the list,
so I'm trying a third time.
======================
Hello Everyone.
I have read the manual, the Knowledge Base, and the posts here, and
still so much is unclear that I have to ask for your support. I have a
complex task, so prepare for a long post and a complex setup. I am
mostly new to clustering and was given a fixed set of options, so I
couldn't choose the hardware and software and must work with what I
was given. So, below are my considerations on the "project" I have. I
ask everyone who has dealt with such setups before to help me.
The task & the environment:
===========================
I need to set up a highly available cluster to run two apps (Task1 and
Task2) that also communicate tightly with each other (so I want an
"Active-Active" setup). I want as much availability and continuity as
possible within my current environment (i.e. without buying much
additional hardware or largely redesigning the network).
The scheme that illustrates the setup is available here:
www.atlas.ua/apc/cluster-post.jpg (93 KB).
The show takes place in a large organization with two datacenters that
are physically separated at distant sites in the city (a
catastrophe-proof setup). The datacenters are connected with fiber
optics running a number of protocols (a so-called 'Metropolitan Area
Network' - MAN). Among them are SDH, Gigabit Ethernet and Fibre
Channel SAN.
In each datacenter there is
* a SAN storage (IBM DS8100, IBM ESS "Shark")
* two SAN switches (IBM TotalStorage SAN Switch 2109-F32)
* a DB/app server (IBM eServer xSeries 460 with 2 FC HBAs, each
plugged into a separate SAN switch, with multipathing, running RHEL
4.2 + RHCS)
* a front-end server (Win2K3 + IIS/ASP)
The SAN storages have mirroring between them (it can be turned on and
off per partition).
Availability requirements:
==========================
A1) I want an Active-Active configuration, where Node1 runs Task1 and
backs up Task2, and Node2 runs Task2 and backs up Task1.
A2) If one SAN storage fails, the cluster should be able to continue
working without needing to restart processes.
A3) If one HBA or SAN switch fails, I should be able to continue
working (currently solved with the IBM SDD multipath driver).
A4) If a server hard drive fails, the server should continue working
(solved by creating RAID 1 on the servers' hot-swappable drives).
A5) If one of a server's two LAN links fails, the server should
continue working (I want to solve this by trunking VLANs over bonded
interfaces, but I have doubts whether that is possible - how does
bonding actually work?).
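On A5, here is the bonding-plus-VLAN configuration I have sketched so
far for RHEL 4 (untested; the mode, interface names, VLAN IDs and
addresses below are placeholders I made up, not anything from the real
network):

```shell
# /etc/modprobe.conf -- load the bonding driver; active-backup needs
# no special switch support, miimon=100 checks link state every 100 ms
alias bond0 bonding
options bond0 mode=active-backup miimon=100

# /etc/sysconfig/network-scripts/ifcfg-eth0 (ifcfg-eth1 is analogous)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-bond0 -- the bond itself
# carries no IP; the tagged VLAN interfaces on top of it do
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-bond0.10 -- VLAN 10 (public)
DEVICE=bond0.10
VLAN=yes
IPADDR=10.0.10.11
NETMASK=255.255.255.0
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-bond0.20 -- VLAN 20 (heartbeat)
DEVICE=bond0.20
VLAN=yes
IPADDR=10.0.20.11
NETMASK=255.255.255.0
ONBOOT=yes
```

Does that look sane, and does active-backup mode play well with VLAN
trunking on the switch side?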
Limitations:
============
L1) It's impossible to obtain a storage virtualization manager (due to
a lot of bureaucratic problems) that would make both SAN storages
appear as one to RHCS and solve requirement A2. :(((
L2) I am limited to RHEL 4 U2 and RHCS 4 U2, because the software is
highly sensitive to a number of Linux bugs that appear and disappear
from update to update. Once I get the cluster running on RHEL 4.2,
I'll have the developers adapt it to U3 or whatever is current then.
L2.1) Also, there is no GFS.
L3) The staff running the whole SAN can, in practice, only notice that
something is broken. Those who implemented the entire large network
are long gone, and the current guys have little or no understanding of
what their boxes can do and how ;( I'm not familiar with IBM SANs
either :( (though I'm studying hard now).
What I did now:
===============
* attached all the hardware together.
* connected the partitions on the SAN storages.
* decided on IP addresses.
* established two VLANs:
  + one for the servers' heartbeat
  + one for the public interfaces and the common cluster IP.
* installed RHEL 4.2, the IBM SDD driver, and RHCS.
What has to be decided:
=======================
Currently, no fencing method has been decided on, and two options are
being considered:
1. APC AP7921 rack PDUs (I still don't quite see how this will work
with servers with dual power supplies, but that can be solved).
2. The integrated IBM Slimline RSA II controller (I don't know whether
it's supported by RHCS).
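For option 1, my current guess at the cluster.conf fencing section
looks like the fragment below (node names, PDU addresses, credentials
and port numbers are all made up; for the dual power supplies I've
assumed one outlet on each of two PDUs per node, listed in a single
fence method, which is exactly the part I'd like someone to confirm):

```xml
<?xml version="1.0"?>
<cluster name="mycluster" config_version="1">
  <clusternodes>
    <clusternode name="node1">
      <fence>
        <method name="1">
          <!-- one outlet on each PDU, one per power supply -->
          <device name="pdu-a" port="1"/>
          <device name="pdu-b" port="1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2">
      <fence>
        <method name="1">
          <device name="pdu-a" port="2"/>
          <device name="pdu-b" port="2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_apc" name="pdu-a"
                 ipaddr="10.0.0.50" login="apc" passwd="apc"/>
    <fencedevice agent="fence_apc" name="pdu-b"
                 ipaddr="10.0.0.51" login="apc" passwd="apc"/>
  </fencedevices>
</cluster>
```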
I found in the mailing list that there are problems with power fencing
when LAN connectivity is lost (no heartbeat => the servers power-cycle
each other). I haven't found a way to overcome that, though ;(
The manual says that RHCS can work with a single storage only. Is
there a way around this without storage virtualization? I really don't
want to build software RAID arrays over the MAN.
If not - I can establish SAN mirroring then, and mirror all the stuff
from the "primary" storage to the backup storage. Is there a way to
make the cluster automatically check the health of the "active"
storage and automatically remount the FS from the backup storage if
the active storage goes down? The only solution I have found is to do
this from a customized service status script.
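Something along these lines is what I had in mind for the storage
check in that status script (a rough sketch only; /mnt/primary and the
probe file name are placeholders for whatever the real mount point
would be):

```shell
#!/bin/sh
# Sketch of a storage health check for a cluster service status script.
# Exit 0 = healthy, non-zero = failed (which should trigger recovery).
check_storage() {
    dir="$1"
    # something must actually be mounted at $dir
    grep -q " $dir " /proc/mounts || return 1
    # and a real read must succeed - a dead SAN path can leave the
    # mount present but unreadable
    dd if="$dir/.cluster-probe" of=/dev/null bs=512 count=1 \
        2>/dev/null || return 1
    return 0
}

# the cluster software runs the script with "status" periodically
if [ "$1" = "status" ]; then
    if check_storage /mnt/primary; then
        exit 0
    else
        # remounting from the backup storage would then happen
        # in the service's start action
        exit 1
    fi
fi
```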
If so - can I run the cluster without a common FS resource? My simple
tests show that it's possible, but I'd like confirmation.
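What I mean concretely is an rm section where a service has only an IP
and a script resource and no fs element, something like the fragment
below (all names, addresses and paths are invented for illustration):

```xml
<rm>
  <failoverdomains>
    <failoverdomain name="prefer-node1" ordered="1">
      <failoverdomainnode name="node1" priority="1"/>
      <failoverdomainnode name="node2" priority="2"/>
    </failoverdomain>
  </failoverdomains>
  <service name="task1" domain="prefer-node1" autostart="1">
    <ip address="10.0.10.100" monitor_link="yes"/>
    <script name="task1-agent" file="/etc/init.d/task1"/>
  </service>
</rm>
```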
The questions:
==============
Besides the questions asked above, the two main questions are:
* How should I design the cluster so it works like I want?
* How should I implement it?
Some smaller questions:
* If the status script returns 0, the status is OK. What happens if it
doesn't? Does the cluster software first try to restart the service,
or does it fail it over to the next node? If it tries to restart, by
which means ('service XXX restart', kill -9, or something else)?
* How does the cluster software check the status of infrastructure
resources (IPs, FSes, heartbeat)? Can I change the course of actions
there by writing my own scripts?
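For context, my understanding is that a script resource is invoked
like an init script, i.e. with start/stop/status arguments, and that a
non-zero exit from "status" marks the service as failed. This is the
skeleton I'm working from (the daemon name task1d and its path are
placeholders):

```shell
#!/bin/sh
# Skeleton of a custom cluster script resource. The cluster software
# calls it with "start", "stop" or "status"; the logic lives in a
# function so it is easy to test. task1d is a placeholder daemon name.
task1_agent() {
    case "$1" in
        start)
            /usr/local/bin/task1d --daemon || return 1 ;;
        stop)
            # stop must report success for failover to proceed
            killall task1d 2>/dev/null
            return 0 ;;
        status)
            # healthy only if the daemon process is alive
            pgrep -x task1d >/dev/null || return 1 ;;
        *)
            echo "usage: $0 {start|stop|status}" >&2
            return 2 ;;
    esac
    return 0
}

[ $# -gt 0 ] && task1_agent "$1"
```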
Thanks in advance to those brave enough to answer =)
--
Best regards,
iana mailto:iana at atlas.ua