[Linux-cluster] GFS 6.0 crashing x86_64 machine

Michael Conrad Tadpol Tilstra mtilstra at redhat.com
Mon Aug 9 15:12:08 UTC 2004


On Fri, Aug 06, 2004 at 04:31:55PM -0700, micah nerren wrote:
> So it appears to be specifically related to lock_gulm.

hrms, so no pushing this off onto someone else. oh well. ;)


> Anything else I should try?
well, it still pretty much looks like a stack overflow.  And looking at
the calling tree, there is not much left to take out of the individual
stack frames.  So I guess we'll have to try making the call chain shorter.

So, another patch.  It moves the lt_login() call out of gulm_core.c and
into gulm_fs.c.  This still works on my intels, give it a go and
let's see how it does on your opterons.

> I really appreciate all your help in debugging this!
np.


-- 
Michael Conrad Tadpol Tilstra
To be, or not to be, those are the parameters.
-------------- next part --------------
Index: gulm_core.c
===================================================================
RCS file: /cvs/GFS/locking/lock_gulm/kernel/gulm_core.c,v
retrieving revision 1.1.2.14
diff -u -b -B -r1.1.2.14 gulm_core.c
--- gulm_core.c	25 May 2004 20:11:23 -0000	1.1.2.14
+++ gulm_core.c	9 Aug 2004 15:11:19 -0000
@@ -51,13 +51,6 @@
 	}
 	gulm_cm.GenerationID = gen;
 
-	error = lt_login ();
-	if (error != 0) {
-		log_err ("lt_login failed. %d\n", error);
-		lg_core_logout (gulm_cm.hookup);	/* XXX is this safe? */
-		return error;
-	}
-
 	log_msg (lgm_Network2, "Logged into local core.\n");
 
 	return 0;
Index: gulm_fs.c
===================================================================
RCS file: /cvs/GFS/locking/lock_gulm/kernel/gulm_fs.c,v
retrieving revision 1.1.2.17
diff -u -b -B -r1.1.2.17 gulm_fs.c
--- gulm_fs.c	2 Aug 2004 16:12:39 -0000	1.1.2.17
+++ gulm_fs.c	9 Aug 2004 15:11:19 -0000
@@ -287,9 +287,11 @@
 			goto fail;
 		}
 
-		/* lt_login() is called after the success packet for cm_login()
-		 * returns.
-		 */
+		error = lt_login();
+		if (error != 0) {
+			log_err ("lt_login failed. %d\n", error);
+			goto fail;
+		}
 	}
       fail:
 	up (&start_stop_lock);