Re: [Linux-cluster] RHCS TestCluster with ScientificLinux 5.2

Hello Lon,

just to clarify: if I mount the clients with "nolock", the error messages of course go away, but I see no real reason to mount NFS filesystems with nolock, because the client and server kernels are all 2.6.18-128.1.10.el5 ...
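
A quick way to check whether locking works on a given mount, independent of any application, is a small test like the one below. It is only a sketch: the function name try_lock and the paths passed on the command line are placeholders, and it assumes a client with Python 3 installed. It makes one non-blocking POSIX lock request on a file and reports the error name if the request is refused. On a mount with broken NFS locking the call may also simply hang instead of failing, so it is best run with a timeout in mind.

```python
import errno
import fcntl
import os
import sys


def try_lock(path):
    """Try one non-blocking exclusive POSIX lock on path.

    Returns (True, "ok") on success, or (False, <errno name>) if the
    lock request is refused (for example ENOLCK).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            return False, errno.errorcode.get(exc.errno, str(exc.errno))
        # Release the lock again so the test leaves no state behind.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True, "ok"
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Pass one test file per mount point, e.g. one mounted via the
    # server's real IP and one mounted via the service IP.
    for p in sys.argv[1:]:
        ok, detail = try_lock(p)
        print(f"{p}: {'lock ok' if ok else 'lock FAILED'} ({detail})")
```

Running it once against a file on a mount made via the real IP and once against a file on a mount made via the service IP would show whether only the service IP path is affected.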

Cheers, Rainer

Rainer Schwierz wrote:
Hello Lon,

could you please go into a little more detail?
Each NFS filesystem was exported only once; for example, when service_nfs_home was active, service_nfs_home_fast was not. In all the tests, IPtables was stopped on both servers, and on the clients all TCP and UDP traffic to and from the servers' real IPs and service IPs was accepted. I also see no significant difference from the setup described in the document
"The Red Hat Cluster Suite NFS Cookbook" by Bob Peterson.

Thanks in advance & Cheers, Rainer

Lon Hohberger wrote:
On Thu, 2009-09-17 at 07:30 +0200, Rainer Schwierz wrote:

Hmm, in the meantime the fence_apc problem has been fixed by a more recent version of fence_apc.

But the NFS lock problem is still open. Does this mean I definitely should not use ScientificLinux and should switch to Fedora 11 or RHEL5.4 instead?

When doing a multi-export of the same NFS file system on top of GFS,
lock recovery will not work correctly - there's no way to prevent a new
GFS lock from being taken after a failure but before NFS has sent the
lock reclaim notifications, nor is there a way for GFS to respect the
NFS lock reclaim grace period.

I do not know why you would have this particular problem, though - locks
shouldn't randomly "not work at all" just because you take them from a
service IP address vs. the host's real IP.  Maybe there's some IPtables
firewall rule in place?

-- Lon
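
To rule out the firewall theory from the client side, a simple reachability check against both the real IP and the service IP can help. The following is a minimal sketch, not a full diagnosis: the function name tcp_port_open and the addresses in the example are hypothetical, and it only tests TCP. Port 111 is the portmapper and 2049 is nfsd; the lockd and statd ports are normally assigned dynamically, so they have to be looked up (for example with "rpcinfo -p" against the server) and added to the list.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical addresses: replace with a server's real IP and a
    # service IP of the cluster.
    targets = ["192.0.2.10", "192.0.2.20"]
    for host in targets:
        for port in (111, 2049):
            state = "open" if tcp_port_open(host, port) else "unreachable"
            print(f"{host}:{port} {state}")
```

If a port answers on the real IP but not on the service IP, a filtering rule or a service bound to only one address would be the first thing to look at.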

