
[Linux-cluster] gnbd_export stops working after reboot



Hi list,

I intend to set up an Oracle9i RAC cluster using GFS 6.0 as the cluster file system. Because the SAN is not available at the moment, I decided to use GNBD instead. I have three IA64 servers running RHAS 3 Update 6, called node1, node2 and node3. Node1 and node2 are GNBD clients and GFS nodes; node3 is the GNBD server. All three also act as lock servers.

I followed the GFS 6.0 Administrator's Guide and encountered no problems until I tried to mount the GFS file system on node2. The command "mount -t gfs /dev/pool/pool0 /gfs -o acl" never returned. I killed the mount process and tried again on node1. This time it returned an error like the one you get when trying to mount an unknown file system type. I removed the gfs module with rmmod and loaded it again with modprobe, but still no luck. I checked against the startup procedure and found that, although I had started lock_gulmd on all nodes, only node3 had a running instance. There was no sign of lock_gulmd on node1 or node2. I tried to start lock_gulmd again and, after a few attempts, it started on node2 only, but mounting GFS still didn't work.
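
The per-node check for a running lock_gulmd described above could be scripted roughly as follows. This is only a sketch: the helper name is_running is made up for illustration, and it assumes pgrep (from procps) is installed on each node.

```shell
# Hypothetical helper: succeeds if a process with exactly this name is running.
# Assumes pgrep is available on the node.
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

# Run this on each node (node1, node2, node3) after starting the daemons.
if is_running lock_gulmd; then
    echo "lock_gulmd: running"
else
    echo "lock_gulmd: NOT running"
fi
```

Running it on every node before attempting the mount would show straight away which nodes are missing the lock daemon.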

I didn't know what to do next, so I decided to start over. After turning off the GFS-related daemons with chkconfig, I restarted the servers (a bad habit from my Windows days :( ). Once all the servers were up again, I got the error "gnbd_export error: create request failed : Connection refused" when executing the following commands on node3 (to export the device as a GNBD):

# modprobe gnbd_serv

[root@db-svr-test-03 root]# lsmod
Module                  Size  Used by    Not tainted
gnbd_serv              74288   0  (unused)
lock_gulm             149872   0  [gnbd_serv]
lock_harness            7288   0  [lock_gulm]
....

# gnbd_export -d /dev/cciss/c0d0p4 -e cluster.cca
gnbd_export error: create request failed : Connection refused

As you can see below, gnbd_serv was running and listening on the default port, 14243:
[root@db-svr-test-03 root]# netstat -nat
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address               Foreign Address             State
tcp        0      0 0.0.0.0:14243               0.0.0.0:*                   LISTEN
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN
.....
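
The listening check above can also be done by a small script instead of by eye. The following is a minimal sketch, not part of GFS or GNBD: the function name port_listening is hypothetical, and it assumes the column layout of the "netstat -nat" output shown above (local address in column 4, state in column 6).

```shell
# Hypothetical helper: reads netstat-style output on stdin and succeeds
# if a TCP socket is in LISTEN state on the given local port.
port_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            # The local address looks like 0.0.0.0:14243; the port is the last field.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a live system (14243 is the port from the output above):
#   netstat -nat | port_listening 14243 && echo "port 14243 is listening"
```

A successful check only shows that something is bound to the port; it does not show that the server accepts the export request.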

I also ran "tcpdump -vv -i lo port 14243" on node3 and saw some traffic when I re-executed "gnbd_export -d /dev/cciss/c0d0p4 -e cluster.cca". The connection completed the three-way handshake, but while the client side was pushing data the server suddenly sent a FIN (F) packet.

I even removed the GFS and GFS-modules RPMs and reinstalled them, but still no luck. What am I supposed to do now? Any help would be appreciated.

Regards,

--Thai Duong.



