[Cluster-devel] cluster/gfs-kernel/src/dlm mount.c

CVSROOT:	/cvs/cluster
Module name:	cluster
Branch: 	RHEL4
Changes by:	teigland sourceware org	2008-01-14 15:35:30

Modified files:
	gfs-kernel/src/dlm: mount.c 

Log message:
	bz 324881
	It's easy to tell if you've hit this bug, because a message like this will
	always appear in /var/log/messages:
	SM: 02000378 ignoring service callback id=2000144 event=1324
	If you look at /proc/cluster/lock_dlm/debug on this node at this point,
	you'll see something like this at the end, which shows what the problem is:
	others_may_mount start_done 1322 b
	The event_id that others_may_mount uses when calling kcl_start_done()
	is incorrect; it's using 1322 when it should be 1324.
	I believe the fix is for others_may_mount() to read the event_id
	after taking the unmount_lock semaphore, which serializes
	others_may_mount() with a start callback from the lock_dlm thread.
	In this case, I believe the start callback is changing the event_id
	after others_may_mount() reads it, and before others_may_mount() gets
	the unmount_lock semaphore.
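
The ordering problem described above can be illustrated with a small
userspace sketch. This is not the lock_dlm code: struct mount_state,
start_callback(), start_done_id_buggy() and start_done_id_fixed() are
hypothetical names, and a pthread mutex stands in for the kernel semaphore
taken with down(). The racing start callback is simulated by a direct call
placed in the window between the read and the lock, so the result is
deterministic.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-mount lock_dlm state. */
struct mount_state {
	pthread_mutex_t unmount_lock;   /* stands in for the kernel semaphore */
	unsigned int mg_last_start;     /* event_id of the most recent start */
};

/* Stand-in for the start callback: records a new event_id under the lock. */
static void start_callback(struct mount_state *m, unsigned int event_id)
{
	assert(m != NULL);
	pthread_mutex_lock(&m->unmount_lock);
	m->mg_last_start = event_id;
	pthread_mutex_unlock(&m->unmount_lock);
}

/*
 * Old ordering: the event_id is read before the lock is taken.
 * racing_event_id simulates a start callback that arrives in that window.
 * Returns the event_id that would be reported as done.
 */
static unsigned int start_done_id_buggy(struct mount_state *m,
					unsigned int racing_event_id)
{
	unsigned int last_start = m->mg_last_start;   /* read outside the lock */

	start_callback(m, racing_event_id);           /* simulated race */

	pthread_mutex_lock(&m->unmount_lock);
	/* ... work done under the lock would go here ... */
	pthread_mutex_unlock(&m->unmount_lock);

	return last_start;                            /* stale value */
}

/*
 * New ordering: the lock is taken first, so a start callback runs either
 * entirely before the read or entirely after the lock is released.
 */
static unsigned int start_done_id_fixed(struct mount_state *m)
{
	unsigned int last_start;

	pthread_mutex_lock(&m->unmount_lock);
	last_start = m->mg_last_start;                /* read under the lock */
	/* ... work done under the lock would go here ... */
	pthread_mutex_unlock(&m->unmount_lock);

	return last_start;
}
```

With mg_last_start initially 1322 and a simulated callback installing 1324,
the first function still returns 1322, matching the mismatch in the log
lines quoted above. The second function returns whatever value is current
at the time the lock is held.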


--- cluster/gfs-kernel/src/dlm/Attic/mount.c	2005/06/29 07:28:21
+++ cluster/gfs-kernel/src/dlm/Attic/mount.c	2008/01/14 15:35:30
@@ -316,11 +316,12 @@
+	down(&dlm->unmount_lock);
 	last_start = dlm->mg_last_start;
-	down(&dlm->unmount_lock);
 	set_bit(DFL_OTHERSMAYMOUNT, &dlm->flags);
 	/* There's been a start to add a second node while we've been
