[Linux-cluster] Unable to mount GFS on RHEL 3 U6
Treece, Britt
Britt.Treece at savvis.net
Fri Mar 17 23:09:12 UTC 2006
Magnus,
Try starting ccsd and lock_gulmd on all three servers. Once these are running you should be able to see all three nodes in gulm_tool nodelist localhost, and at that point you should be able to mount your GFS pool volumes.

Your lock cluster has to have a quorum of more than half of the servers configured in cluster.ccs, so at least 2 in your case, before it will allow a GFS volume to be mounted.
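On RHEL 3 with GFS 6.0 the usual way is via the init scripts, roughly the following on each of backup, oradw and gistest2 (this assumes the standard scripts shipped with the GFS rpms are in place):

service ccsd start
service lock_gulmd start

Once at least two of the three lock servers have started, gulm_tool nodelist localhost should show them logged in with one of them as Master, and the mount should then go through.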
Regards,
Britt
________________________________
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Magnus Andersen
Sent: Friday, March 17, 2006 4:54 PM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] Unable to mount GFS on RHEL 3 U6
Hi All,
I've successfully installed and configured GFS on my three nodes, but
when I try to mount the filesystem the prompt hangs until I kill the
mount command. All servers are running RHEL 3 AS/ES U6 with the
2.4.21-37.0.1.ELsmp kernel and are connected to a MSA1500 SAN via FC.
I've installed the following GFS rpms:
[root at oradw root]# rpm -qa | grep -i gfs
GFS-modules-6.0.2.27-0.1
GFS-modules-smp-6.0.2.27-0.1
GFS-6.0.2.27-0.1
Here are my pool configuration files and the output from pool_tool -s:
[root at backup gfs]# cat cluster_cca.cfg
poolname cluster_cca
subpools 1
subpool 0 0 1
pooldevice 0 0 /dev/sda1
[root at backup gfs]# cat pool0.cfg
poolname pool_gfs1
subpools 1
subpool 0 0 1
pooldevice 0 0 /dev/sda2
[root at backup gfs]# cat pool1.cfg
poolname pool_gfs2
subpools 1
subpool 0 0 1
pooldevice 0 0 /dev/sdb
[root at backup gfs]# pool_tool -s
  Device                  Pool Label
  ======                  ==========
  /dev/pool/cluster_cca   <- CCA device ->
  /dev/pool/pool_gfs1     <- GFS filesystem ->
  /dev/pool/pool_gfs2     <- GFS filesystem ->
  /dev/cciss/c0d0         <- partition information ->
  /dev/cciss/c0d0p1       <- EXT2/3 filesystem ->
  /dev/cciss/c0d0p2       <- swap device ->
  /dev/cciss/c0d0p3       <- lvm1 subdevice ->
  /dev/sda                <- partition information ->
  /dev/sda1               cluster_cca
  /dev/sda2               pool_gfs1
  /dev/sdb                pool_gfs2
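(For reference, I created and activated the pools from the cfg files above roughly as follows; I'm quoting the pool_tool/pool_assemble invocations from memory, so treat the exact flags as approximate:)

[root at backup gfs]# pool_tool -c cluster_cca.cfg
[root at backup gfs]# pool_tool -c pool0.cfg
[root at backup gfs]# pool_tool -c pool1.cfg
[root at backup gfs]# pool_assemble -a
(pool_assemble -a was also run on the other two nodes so the /dev/pool devices exist everywhere)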
Here are my ccs files.
[root at backup cluster_cca]# cat cluster.ccs
cluster {
    name = "cluster_cca"
    lock_gulm {
        servers = ["backup", "oradw", "gistest2"]
    }
}
[root at backup cluster_cca]# cat fence.ccs
fence_devices {
    manual {
        agent = "fence_manual"
    }
}
[root at backup cluster_cca]# cat nodes.ccs
nodes {
    backup {
        ip_interfaces {
            eth1 = "10.0.0.1"
        }
        fence {
            man {
                manual {
                    ipaddr = "10.0.0.1"
                }
            }
        }
    }
    oradw {
        ip_interfaces {
            eth4 = "10.0.0.2"
        }
        fence {
            man {
                manual {
                    ipaddr = "10.0.0.2"
                }
            }
        }
    }
    gistest2 {
        ip_interfaces {
            eth0 = "10.0.0.3"
        }
        fence {
            man {
                manual {
                    ipaddr = "10.0.0.3"
                }
            }
        }
    }
}
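(The CCS archive was written onto the CCA pool from the directory holding these three files, roughly like this; the /root/cluster_cca path is approximate and I'm typing the ccs_tool/ccsd syntax from memory:)

[root at backup cluster_cca]# ccs_tool create /root/cluster_cca /dev/pool/cluster_cca
[root at backup cluster_cca]# ccsd -d /dev/pool/cluster_cca
(ccsd was started the same way on the other two nodes)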
Here is the command I used to create the filesystem:
gfs_mkfs -p lock_gulm -t cluster_cca:pool_gfs2 -j 10 /dev/pool/pool_gfs2
Mount command that hangs:
mount -t gfs /dev/pool/pool_gfs2 /gfs2
Here is the output I see in my messages log file. The last 5 lines are repeated each time I try to mount the filesystem.
Mar 17 15:47:05 backup ccsd[2645]: Starting ccsd 6.0.2.27:
Mar 17 15:47:05 backup ccsd[2645]: Built: Jan 30 2006 15:28:33
Mar 17 15:47:05 backup ccsd[2645]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Mar 17 15:48:10 backup lock_gulmd[2652]: Starting lock_gulmd 6.0.2.27. (built Jan 30 2006 15:28:54) Copyright (C) 2004 Red Hat, Inc. All rights reserved.
Mar 17 15:48:10 backup lock_gulmd[2652]: You are running in Fail-over mode.
Mar 17 15:48:10 backup lock_gulmd[2652]: I am (backup) with ip (127.0.0.1)
Mar 17 15:48:10 backup lock_gulmd[2652]: Forked core [2653].
Mar 17 15:48:11 backup lock_gulmd[2652]: Forked locktable [2654].
Mar 17 15:48:12 backup lock_gulmd[2652]: Forked ltpx [2655].
Mar 17 15:48:12 backup lock_gulmd_core[2653]: I see no Masters, So I am Arbitrating until enough Slaves talk to me.
Mar 17 15:48:12 backup lock_gulmd_core[2653]: Could not send quorum update to slave backup
Mar 17 15:48:12 backup lock_gulmd_core[2653]: New generation of server state. (1142628492484630)
Mar 17 15:48:12 backup lock_gulmd_LTPX[2655]: New Master at backup:127.0.0.1
Mar 17 15:52:14 backup kernel: Lock_Harness 6.0.2.27 (built Jan 30 2006 15:32:58) installed
Mar 17 15:52:14 backup kernel: GFS 6.0.2.27 (built Jan 30 2006 15:32:20) installed
Mar 17 15:52:15 backup kernel: Gulm 6.0.2.27 (built Jan 30 2006 15:32:54) installed
Mar 17 15:54:51 backup kernel: lock_gulm: ERROR cm_login failed. -512
Mar 17 15:54:51 backup kernel: lock_gulm: ERROR Got a -512 trying to start the threads.
Mar 17 15:54:51 backup lock_gulmd_core[2653]: Error on xdr (GFS Kernel Interface:127.0.0.1 idx:3 fd:8): (-104:104:Connection reset by peer)
Mar 17 15:54:51 backup kernel: lock_gulm: fsid=cluster_cca:gfs1: Exiting gulm_mount with errors -512
Mar 17 15:54:51 backup kernel: GFS: can't mount proto = lock_gulm, table = cluster_cca:gfs1, hostdata =
Result from gulm_tool:
[root at backup gfs]# gulm_tool nodelist backup
Name: backup
ip = 127.0.0.1
state = Logged in
mode = Arbitrating
missed beats = 0
last beat = 1142632189718986
delay avg = 10019686
max delay = 10019735
I'm a newbie to clusters and have no clue where to look next. If any other information is needed, let me know.
Thanks,
--
Magnus Andersen
Systems Administrator / Oracle DBA
Walker & Associates, Inc.