[Linux-cluster] gnbd_export stops working after reboot

Thai Duong thaidn at gmail.com
Sat Nov 26 19:57:31 UTC 2005


Hi list,

I intend to set up an Oracle9i RAC cluster using GFS 6.0 as the cluster
file system (CFS). Because the SAN is not available at the moment, I decided
to use GNBD instead. I have three IA64 servers running RHAS 3 Update 6,
called node1, node2 and node3. Node1 and node2 are GNBD clients and GFS
nodes. Node3 is the GNBD server. I also use all of them as lock servers.

I followed the GFS 6.0 Administrator's Guide and encountered no problems
until I tried to mount the GFS file system on node2. Running "mount -t gfs
/dev/pool/pool0 /gfs -o acl" took forever. I killed the mount process and
tried again on node1. This time it returned something like the error you get
when you try to mount an unknown file system. I removed the gfs module with
rmmod and loaded it again with modprobe, but still no luck. I checked against
the startup procedure and found that, although I had started lock_gulmd on
all nodes, only node3 had a running instance. There was no sign of
lock_gulmd on node1 and node2. I tried to start lock_gulmd again and, after
a few attempts, it got running on node2 only, but mounting GFS still didn't
work.

I didn't know what to do next, so I decided to start over. After turning off
all GFS-related daemons with chkconfig, I restarted the servers (a bad habit
from my Windows days :( ). After all the servers were up again, I got a
"gnbd_export error: create request failed : Connection refused" error when
executing the following commands on node3 (in order to export the device as
a GNBD):

# modprobe gnbd_serv

[root@db-svr-test-03 root]# lsmod
Module                  Size  Used by    Not tainted
gnbd_serv              74288   0  (unused)
lock_gulm             149872   0  [gnbd_serv]
lock_harness            7288   0  [lock_gulm]
....

# gnbd_export -d /dev/cciss/c0d0p4 -e cluster.cca
gnbd_export error: create request failed : Connection refused

As you can see below, gnbd_serv was running and listening on the default
port, 14243:
[root@db-svr-test-03 root]# netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State
tcp        0      0 0.0.0.0:14243               0.0.0.0:*                   LISTEN
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN
.....
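
A LISTEN entry in netstat only shows that the socket is bound; it does not
show that a connection to it succeeds. A small TCP probe can check that
separately from gnbd_export. The following is only a minimal sketch, assuming
Python is available on node3. The function name port_accepts_connections is
made up for illustration, and 14243 is simply the port from the netstat
output. It tests the TCP listener only and says nothing about what happens
at the GNBD protocol level after the connection is accepted.

```python
import socket


def port_accepts_connections(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper for illustration; it is not part of GNBD or GFS.
    """
    try:
        # create_connection performs the full TCP three-way handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 14243 is the port shown as LISTEN in the netstat output.
    print(port_accepts_connections("127.0.0.1", 14243))
```

If this prints True while gnbd_export still reports "Connection refused",
the refusal is probably not coming from the TCP listener itself.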

I also ran tcpdump -vv -i lo port 14243 on node3 and saw some traffic when
I re-executed "gnbd_export -d /dev/cciss/c0d0p4 -e cluster.cca". It even
completed the three-way handshake, but while the client side was pushing
data the server suddenly sent a FIN (F) packet.

I even removed the GFS and GFS-modules RPMs and reinstalled them, but still
no luck. What am I supposed to do now? Any help is appreciated.

Regards,

--Thai Duong.