[Cluster-devel] fatal: assertion "!atomic_read(&gl->gl_ail_count)" failed

Josef Whiter jwhiter at redhat.com
Fri Feb 23 17:58:39 UTC 2007


On Fri, Feb 23, 2007 at 04:17:57PM -0000, David Craigon wrote:
> Hello,
> 
> I'm trying to use GFS2 with the latest versions of all the parts, so
> I've tried it on Fedora 7 test1 with a checkout from CVS, and I've also
> tried Fedora 6. I have an equal lack of success with both.
> 
> My setup is a simple cluster of two servers attached via open-iSCSI to
> a backend SAN. The iSCSI part works fine: I have the drive as a device
> on both computers. I'm using the iSCSI that comes with the Linux
> distro. I've turned off SELinux. I'm using DLM locking.
> 
> When I've got both servers attached to it, it works for a short while
> (circa 10 seconds or so).
> What I typically do is create files and then delete them from the two
> servers. After a while I get this:
> 
> Feb 23 15:54:28 a kernel: GFS2: fsid=: Trying to join cluster
> "lock_dlm", "alpha_cluster:a"
> Feb 23 15:54:28 a kernel: GFS2: fsid=alpha_cluster:a.0: Joined cluster.
> Now mounting FS...
> Feb 23 15:54:28 a kernel: GFS2: fsid=alpha_cluster:a.0: jid=0, already
> locked for use
> Feb 23 15:54:28 a kernel: GFS2: fsid=alpha_cluster:a.0: jid=0: Looking
> at journal...
> Feb 23 15:54:28 a kernel: GFS2: fsid=alpha_cluster:a.0: jid=0: Done
> Feb 23 15:54:55 a kernel: GFS2: fsid=alpha_cluster:a.0: fatal: assertion
> "!atomic_read(&gl->gl_ail_count)" failed
> Feb 23 15:54:55 a kernel: GFS2: fsid=alpha_cluster:a.0:   function =
> gfs2_meta_inval, file = fs/gfs2/meta_io.c, line = 101
> Feb 23 15:54:55 a kernel: GFS2: fsid=alpha_cluster:a.0: about to
> withdraw this file system
> Feb 23 15:54:55 a kernel: GFS2: fsid=alpha_cluster:a.0: telling LM to
> withdraw
> 
> At that point, this server can't look at the mount point anymore.
> 
> Can anyone offer any assistance?
>  

I hit this same bug as well, but haven't gone back to try to reproduce it yet.
Could you try to narrow down an exact (or, heck, even general) sequence of
commands that will trigger the problem?  If not, open a bugzilla and CC me on
it, and I'll try to get some time next week to reproduce it again.  Thanks,

Josef



