[Linux-cluster] Cluster IP alias doesn't work

JACOB_LIBERMAN at Dell.com
Thu Jul 21 21:46:05 UTC 2005


It looks like your alias IP is not the same as your service IP, but an
alias IP that matches the service IP is the problem I have seen cause
this behavior in the past. Either way, I would specify the cluster
alias via an IP address rather than a host name.
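
For example, a minimal sketch of the [cluster] section with the alias
given as a literal address (192.168.253.10, the address your /etc/hosts
maps to raclu) instead of a host name:

[cluster]
  alias_ip = 192.168.253.10
  name = project
  timestamp = 1121957827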

The behavior I have seen is that you down the interface on the service
owner node, and the service starts on the other node, but the virtual IP
address does not bind to the second node's interface. This can occur when
the alias IP is the same address as the service IP. This is not a bug but
working as designed, because the alias IP will bind to any node
participating in the cluster, not just the node that owns the service.
This feature is particularly useful in clusters with a large number of
nodes.
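
As a quick sanity check after a failover, you can see which node
actually holds each address with standard tools (a generic sketch, not
a cluster-suite command; adjust eth0 to your interface):

  # on each node: is the alias or service IP bound here?
  ip addr show eth0 | grep 192.168.253
  ifconfig | grep -B1 192.168.253     # older tool; aliases appear as eth0:N

  # from a third machine: are the addresses reachable?
  ping -c 3 192.168.253.10    # cluster alias
  ping -c 3 192.168.253.20    # NFS service IP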

I hope this helps!

jacob 

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of haydar Ali
> Sent: Thursday, July 21, 2005 1:55 PM
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] Cluster IP alias doesn't work
> 
> Hi,
> 
> I set up and configured a clustered NFS.
> I have created 2 quorum partitions /dev/sdd2 and /dev/sdd3  
> (100MB each).
> 
> I created another huge partition /dev/sdd4 (over 600GB) and 
> formatted it with an ext3 file system.
> 
> I installed the cluster suite on the 1st node (RAC1) and the 2nd 
> node (RAC2), and I started the rawdevices service on the two nodes 
> RAC1 and RAC2 (it's OK).
> 
> This is the hosts file /etc/hosts on node1 (RAC1) and node2 (RAC2):
> 
> # Do not remove the following line, or various programs
> # that require network functionality will fail.
> #127.0.0.1		rac1 localhost.localdomain localhost
> 127.0.0.1              localhost.localdomain localhost
> #
> # Private hostnames
> #
> 192.168.253.3           rac1.project.net     rac1
> 192.168.253.4           rac2.project.net     rac2
> 192.168.253.10          raclu.project.net	raclu
> 192.168.253.20		raclu_nfs.project.net	raclu_nfs
> #
> # Hostnames used for Interconnect
> #
> 1.1.1.1                 rac1i.project.net    rac1i
> 1.1.1.2                 rac2i.project.net    rac2i
> #
> 192.168.253.5           infra.project.net       infra
> 192.168.253.7		ractest.project.net     ractest
> #
> I generated /etc/cluster.conf on the 1st node RAC1 as follows:
> 
> # This file is automatically generated.  Do not manually edit!
> 
> [cluhbd]
>   logLevel = 4
> 
> [clupowerd]
>   logLevel = 4
> 
> [cluquorumd]
>   logLevel = 4
> 
> [cluster]
>   alias_ip = raclu
>   name = project
>   timestamp = 1121957827
> 
> [clusvcmgrd]
>   logLevel = 4
> 
> [database]
>   version = 2.0
> 
> [members]
> start member0
> start chan0
>   name = rac1
>   type = net
> end chan0
>   id = 0
>   name = rac1
>   powerSwitchIPaddr = rac1
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member0
> start member1
> start chan0
>   name = rac2
>   type = net
> end chan0
>   id = 1
>   name = rac2
>   powerSwitchIPaddr = rac2
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member1
> 
> [powercontrollers]
> start powercontroller0
>   IPaddr = rac1
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller0
> start powercontroller1
>   IPaddr = rac2
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller1
> 
> [services]
> start service0
>   checkInterval = 0
> start device0
> start mount
> start NFSexports
> start directory0
> start client0
>   name = *
>   options = rw
> end client0
>   name = /u04
> end directory0
> end NFSexports
>   forceUnmount = yes
>   fstype = ext3
>   name = /u04
>   options = rw,nosuid,sync
> end mount
>   name = /dev/sdd4
>   sharename = None
> end device0
>   name = nfs_project
> start network0
>   ipAddress = 192.168.253.20
> end network0
>   preferredNode = rac2
>   relocateOnPreferredNodeBoot = yes
> end service0
> 
> 
> I generated /etc/cluster.conf on the 2nd node RAC2 as follows:
> 
> 
> # This file is automatically generated.  Do not manually edit!
> 
> [cluhbd]
>   logLevel = 4
> 
> [clupowerd]
>   logLevel = 4
> 
> [cluquorumd]
>   logLevel = 4
> 
> [cluster]
>   alias_ip = raclu
>   name = project
>   timestamp = 1121957827
> 
> [clusvcmgrd]
>   logLevel = 4
> 
> [database]
>   version = 2.0
> 
> [members]
> start member0
> start chan0
>   name = rac1
>   type = net
> end chan0
>   id = 0
>   name = rac1
>   powerSwitchIPaddr = rac1
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member0
> start member1
> start chan0
>   name = rac2
>   type = net
> end chan0
>   id = 1
>   name = rac2
>   powerSwitchIPaddr = rac2
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member1
> 
> [powercontrollers]
> start powercontroller0
>   IPaddr = rac1
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller0
> start powercontroller1
>   IPaddr = rac2
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller1
> 
> [services]
> start service0
>   checkInterval = 0
> start device0
> start mount
> start NFSexports
> start directory0
> start client0
>   name = *
>   options = rw
> end client0
>   name = /u04
> end directory0
> end NFSexports
>   forceUnmount = yes
>   fstype = ext3
>   name = /u04
>   options = rw,nosuid,sync
> end mount
>   name = /dev/sdd4
>   sharename = None
> end device0
>   name = nfs_project
> start network0
>   ipAddress = 192.168.253.20
> end network0
>   preferredNode = rac2
>   relocateOnPreferredNodeBoot = yes
> end service0
> 
> [session]
>   lock = rac1-cluadmin-16970-root 1121957668
> 
> I created an NFS share on /u04 (mounted on /dev/sdd4) using the cluadmin 
> command on RAC1. Then I launched the following command on the 2 nodes 
> RAC1 and RAC2:
> service cluster start
> I checked the result on the 2 nodes:
> clustat
> 
> 
> 
> Cluster Status Monitor (project)   14:21:38
> 
> Cluster alias: raclu
> 
> ==========  M e m b e r   S t a t u s  =============
> 
>   Member         Status     Node Id    Power Switch
>   -------------- ---------- ---------- ------------
>   rac1        Up         0          Good
>   rac2        Up         1          Good
> 
> ===========  H e a r t b e a t   S t a t u s  ======
> 
>   Name                           Type       Status
>   ------------------------------ ---------- ------------
>   rac1      <--> rac2      network    ONLINE
> 
> ========== S e r v i c e   S t a t u s  =======
> 
>                                          Last             Monitor  Restart
>   Service        Status   Owner          Transition       Interval Count
>   -------------- -------- -------------- ---------------- -------- -------
>   nfs_project    started  rac2           13:01:36 Jul 21  0        0
> 
> 
> And I launched the following command on RAC1 and other servers:
> mount -t nfs 192.168.253.20:/u04 /u04
> 
> And all is OK: I can list the /u04 contents from any server 
> (if I mount it).
> 
> But my only problem is:
> 
> When I want to try a test, I stop the network interface on RAC2:
> ifconfig eth0 down
> Then when I try from another server such as RAC1 to list the /u04 
> contents, it doesn't work and doesn't respond, and when I ping the IP 
> alias 192.168.253.20 it doesn't respond either.
> 
> Do you have any idea how to fix this problem?
> 
> Thanks for your replies and help
> 
> 
> Haydar
> 
> >From: "haydar Ali" <haydar2906 at hotmail.com>
> >Reply-To: linux clustering <linux-cluster at redhat.com>
> >To: linux-cluster at redhat.com
> >Subject: [Linux-cluster] Need help for Clustered NFS
> >Date: Wed, 20 Jul 2005 10:10:35 -0400
> >
> >Hi,
> >
> >I want to set up and configure clustered NFS.
> >I have created 2 quorum partitions /dev/sdd2 and /dev/sdd3  
> (100MB each) 
> >and formatted them
> >
> >mkfs -t ext2 -b 4096 /dev/sdd2
> >mkfs -t ext2 -b 4096 /dev/sdd3
> >
> >I created another huge partition /dev/sdd4 (over 600GB) and formatted 
> >it with an ext3 filesystem.
> >
> >I installed the cluster suite on the 1st node (RAC1) and I started the 
> >rawdevices service on the two nodes RAC1 and RAC2 (it's OK).
> >
> >This is the hosts file /etc/hosts on node1 (RAC1):
> >
> ># Do not remove the following line, or various programs
> ># that require network functionality will fail.
> >127.0.0.1              localhost.localdomain localhost
> >#
> ># Private hostnames
> >#
> >192.168.253.3           rac1.domain.net     rac1
> >192.168.253.4           rac2.domain.net     rac2
> >192.168.253.10          rac1
> >#
> ># Hostnames used for Interconnect
> >#
> >1.1.1.1                 rac1i.domain.net    rac1i
> >1.1.1.2                 rac2i.domain.net    rac2i
> >#
> >-----------------------
> >
> >
> >I launched the command cluconfig and it generated /etc/cluster.conf; 
> >you can see its content:
> >
> >-------------------------------
> ># This file is automatically generated.  Do not manually edit!
> >
> >[cluhbd]
> >  logLevel = 4
> >
> >[clupowerd]
> >  logLevel = 4
> >
> >[cluquorumd]
> >  logLevel = 4
> >
> >[cluster]
> >  alias_ip = 192.168.253.10
> >  name = project
> >  timestamp = 1121804245
> >
> >[clusvcmgrd]
> >  logLevel = 4
> >
> >[database]
> >  version = 2.0
> >
> >[members]
> >start member0
> >start chan0
> >  name = rac1
> >  type = net
> >end chan0
> >  id = 0
> >  name = rac1
> >  powerSwitchIPaddr = rac1
> >  powerSwitchPortName = unused
> >  quorumPartitionPrimary = /dev/raw/raw1
> >  quorumPartitionShadow = /dev/raw/raw2
> >end member0
> >start member1
> >start chan0
> >  name = rac2
> >  type = net
> >end chan0
> >  id = 1
> >  name = rac2
> >  powerSwitchIPaddr = rac2
> >  powerSwitchPortName = unused
> >  quorumPartitionPrimary = /dev/raw/raw1
> >  quorumPartitionShadow = /dev/raw/raw2
> >end member1
> >
> >[powercontrollers]
> >start powercontroller0
> >  IPaddr = rac1
> >  login = unused
> >  passwd = unused
> >  type = null
> >end powercontroller0
> >start powercontroller1
> >  IPaddr = rac2
> >  login = unused
> >  passwd = unused
> >  type = null
> >end powercontroller1
> >
> >[services]
> >start service0
> >  checkInterval = 30
> >start device0
> >start mount
> >start NFSexports
> >start directory0
> >start client0
> >  name = rac1
> >  options = rw
> >end client0
> >  name = /u04
> >end directory0
> >end NFSexports
> >  forceUnmount = yes
> >  fstype = ext3
> >  name = /u04
> >  options = rw,nosuid,sync
> >end mount
> >  name = /dev/sdd4
> >  sharename = None
> >end device0
> >  name = nfs_project
> >  preferredNode = rac2
> >  relocateOnPreferredNodeBoot = yes
> >end service0
> >------------------------------------
> >
> >I created a NFS share on /u04 using the following command cluadmin
> >
> >[root at rac1 root]# cluadmin
> >Wed Jul 20 10:02:20 EDT 2005
> >
> >You can obtain help by entering help and one of the following commands:
> >
> >cluster        service        clear
> >help           apropos        exit
> >version        quit
> >cluadmin> service show
> >  1) state
> >  2) config
> >  3) services
> >service show what? 2
> >  0) nfs_project
> >  c) cancel
> >
> >Choose service: 0
> >name: nfs_project
> >preferred node: rac2
> >relocate: yes
> >monitor interval: 30
> >device 0: /dev/sdd4
> >  mount point, device 0: /u04
> >  mount fstype, device 0: ext3
> >  mount options, device 0: rw,nosuid,sync
> >  force unmount, device 0: yes
> >  samba share, device 0: None
> >NFS export 0: /u04
> >  Client 0: rac1, rw
> >cluadmin> service show state
> >=========================  S e r v i c e   S t a t u s  ========================
> >
> >                                         Last             Monitor  Restart
> >  Service        Status   Owner          Transition       Interval Count
> >  -------------- -------- -------------- ---------------- -------- -------
> >  nfs_project    started  rac1           16:21:23 Jul 19  30       1
> >cluadmin>
> >
> >
> >And when I launched clustat, I encountered this error:
> >
> >clustat
> >Cluster Status Monitor (Fileserver Test Cluster)
> >07:46:05
> >Cluster alias: rac1
> >
> >===================== M e m b e r   S t a t u s ================
> >  Member         Status     Node Id    Power Switch
> >  -------------- ---------- ---------- ------------
> >  rac1           Up         0          Good
> >  rac2           Down       1          Unknown
> >
> >=================== H e a r t b e a t   S t a t u s ===============
> >  Name                           Type       Status
> >  ------------------------------ ---------- ------------
> >  rac1         <--> rac2         network    OFFLINE
> >
> >=================== S e r v i c e   S t a t u s ==================
> >                                        Last             Monitor  Restart
> >  Service       Status   Owner          Transition       Interval Count
> >  ------------- -------- -------------- ---------------- -------- -------
> >  nfs_project   started  rac1           16:07:42 Jul 19  30       0
> >
> >
> >
> >And when I launched this command on RAC2:
> >mount -t nfs rac1:/u04 /u04
> >It listed the following error message:
> >mount: rac1:/u04 failed, reason given by server: Permission denied
> >
> >Can someone help me to fix this problem in this configuration?
> >
> >Thanks
> >
> >Cheers!
> >
> >Haydar
> >
> >
> >--
> >Linux-cluster mailing list
> >Linux-cluster at redhat.com
> >http://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 



