This patched version of dlm will probably resolve your issue. Please try it.
See the detailed description posted to the list earlier (Subject: [Linux-cluster] [PATCH] dlm: faster dlm recovery).
And yes, mount and umount times with unpatched dlm scale as N*N, where N is the number of files.
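To give a rough feel for what N*N scaling means in practice, here is a small arithmetic sketch. It is not dlm code; the function name and the file counts are made up for illustration, and it only assumes that the cost grows with the square of the number of files.

```python
def relative_cost(n_files, baseline_files):
    """Ratio of an N*N cost at n_files to the same cost at baseline_files.

    Illustrative only: assumes cost is proportional to the square of the
    file count and ignores constant factors and everything else.
    """
    return (n_files / baseline_files) ** 2


# Hypothetical file counts, chosen only for illustration.
for n in (10_000, 100_000, 1_000_000):
    print(n, relative_cost(n, 10_000))
```

Under that assumption, a filesystem with 100 times as many files costs 10,000 times as much, which would fit with only the filesystems holding more than 1M files showing the long delays.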
On Jan 13, 2012, at 00:50 , Scooter Morris wrote:
We've got a 4-node cluster running RHEL 6.2. As part of the cluster, we've got several gfs2 filesystems. We've often noticed that when we reboot a single node in the cluster, the gfs2 mounts take a long time -- eventually getting the 120 second delay messages. When we migrated to 6.2, the default mount script echoed the filesystem being mounted, and we discovered that the long delays were filesystem-dependent. In particular, two filesystems were causing all of the problems, both of which had >1M files in them. We also noticed that dlm_recoverd on one of the other nodes accumulates a lot of time when this is happening. Is this expected? Are there non-linear handshaking algorithms between the mounting node and the cluster that are dependent on the number of files?
Thanks in advance!
Linux-cluster mailing list
Linux-cluster redhat com