Re: [Linux-cluster] NFS over GFS issue


This looks very similar to a bug that was reported recently.
Unfortunately, that bug carries a number of internal-view-only flags, so I
can't open it up for external access.  Your report is the first one to
mention NFS.  Is it possible to get your test cases?


On Wed, 2008-07-16 at 20:46 +0200, Mark Hlawatschek wrote:
> Hi,
> During some stress tests with NFS over GFS, I observed a strange problem.
> The test setup consists of two GFS cluster nodes (node1, node2) (RHEL4.6),
> both serving the same NFS export (/mnt/gfstest).
> The NFS export is mounted by two NFS clients (client1, client2): client1
> has mounted it from node1 and client2 has mounted it from node2.
> During the stress test, client1 creates files in dir1 on the GFS and client2
> creates files in dir2 on the same GFS. Node1 continuously reads the files
> created by client1 and client2. After some time (about 10 minutes) the
> following error occurs on node1:
> GFS: fsid=axqa01:gfstest.0: fatal: assertion "!bd->bd_pinned && !buffer_busy(bh)" failed
> GFS: fsid=axqa01:gfstest.0:   function = ail_empty_gl
> GFS: fsid=axqa01:gfstest.0:   file = /builddir/build/BUILD/gfs-kernel-2.6.9-75/smp/src/gfs/dio.c, line = 383
> GFS: fsid=axqa01:gfstest.0:   time = 1216216523
> GFS: fsid=axqa01:gfstest.0: about to withdraw from the cluster
> GFS: fsid=axqa01:gfstest.0: waiting for outstanding I/O
> GFS: fsid=axqa01:gfstest.0: telling LM to withdraw
> lock_dlm: withdraw abandoned memory
> GFS: fsid=axqa01:gfstest.0: withdrawn
> Is there a workaround for this problem? Is this a bug?
> Thanks,
> Mark
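
If the original scripts can't be shared, something in roughly the following
shape would already help. This is only a sketch of the workload as described
above, not the actual test: the function names, file naming, file size and
counts are all assumptions, and it is written for a current Python 3. On the
real setup, create_files() would run on each NFS client against its own
directory (dir1 or dir2 under the NFS mount) and read_all() would run in a
loop on node1 against /mnt/gfstest/dir1 and /mnt/gfstest/dir2.

```python
import os
import tempfile


def create_files(directory, count, size=4096):
    """Writer side (client1 / client2 in the report): create `count` files.

    File names and the 4 KiB default size are arbitrary choices.
    """
    os.makedirs(directory, exist_ok=True)
    payload = b"x" * size
    paths = []
    for i in range(count):
        path = os.path.join(directory, "file%06d" % i)
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data out
        paths.append(path)
    return paths


def read_all(directories):
    """Reader side (node1 in the report): read every file once.

    Returns (number of files read, total bytes read).
    """
    files = 0
    total = 0
    for d in directories:
        if not os.path.isdir(d):
            continue  # directory not created yet
        for name in sorted(os.listdir(d)):
            try:
                with open(os.path.join(d, name), "rb") as f:
                    total += len(f.read())
                files += 1
            except OSError:
                continue  # entry vanished or is not a regular file
    return files, total


if __name__ == "__main__":
    # Local dry run in a scratch directory, just to check the helpers work.
    root = tempfile.mkdtemp()
    dirs = [os.path.join(root, "dir1"), os.path.join(root, "dir2")]
    for d in dirs:
        create_files(d, 10)
    print(read_all(dirs))
```

The writers and the reader are kept as separate functions because in the
reported setup they run on different machines; whether this simplified
pattern is enough to trigger the assertion is unknown.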

