[Linux-cluster] GFS 6.0 crashing x86_64 machine

Adam Manthei amanthei at redhat.com
Mon Aug 2 16:21:23 UTC 2004


On Mon, Aug 02, 2004 at 09:06:48AM -0700, micah nerren wrote:
> Hi,
> 
> 
> On Mon, 2004-08-02 at 08:46, Adam Manthei wrote:
> > On Mon, Aug 02, 2004 at 07:48:02AM -0700, micah nerren wrote:
> > > 
> > > The system crashes. At the console, there are tons of system calls being
> > > listed, and at the bottom of the screen:
> > > 
> > > Code: 39 d0 75 f8 85 c9 74 10 8b 44 24 14 39 d0 74 08 8b 44 24 14
> > > Console Shuts up:
> > >    pid: 3547, lock_gulmd Not tainted
> > > RIP: 0010
> > > 
> > > 
> > > So... Any ideas on what may be causing this? 
> > 
> > Those "tons of system calls being listed" are really quite useful, if not
> > necessary, for telling what the problem is.  My gut feeling is that a
> > stack overrun is happening.
> 
> I could try to post them if anybody would find that useful. I will write
> all that down and attempt to post it coherently. Is there any way to
> capture that kind of info to a file?

The best way is to connect the machine to a serial console and capture the
output there.  Depending on the state of the machine, you may even be able
to grab it with 'dmesg' or from the syslogs (although it's not likely to
have made it to the syslog).
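
As a rough illustration of the serial-console approach, here is a minimal
Python sketch of a logger that could run on a second machine attached to the
crashing box over a null-modem cable.  It assumes the crashing kernel was
booted with something like console=ttyS0,115200 and that the receiving port
is already configured for the same speed.  The device path /dev/ttyS0, the
file name console.log and the helper log_console are all hypothetical names
chosen for the example, not anything from GFS or this thread.

```python
import time


def log_console(src, dst, clock=time.time):
    """Copy lines from a binary stream to a text stream, one timestamp per line.

    src   -- binary file-like object (e.g. a serial device opened in 'rb' mode)
    dst   -- text file-like object that receives the log
    clock -- callable returning seconds as a float (injectable for testing)
    Returns the number of lines copied.
    """
    count = 0
    # readline() returns b"" at end of stream, which ends the loop.
    for raw in iter(src.readline, b""):
        # Oops output is ASCII; replace anything undecodable rather than fail.
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        dst.write(f"[{clock():.3f}] {line}\n")
        dst.flush()  # keep the file current in case the logger is interrupted
        count += 1
    return count


if __name__ == "__main__":
    # Assumed device path and log name; adjust to the local setup.
    with open("/dev/ttyS0", "rb", buffering=0) as src, \
            open("console.log", "a") as dst:
        log_console(src, dst)
```

Flushing after every line is deliberate: the interesting output arrives right
before the machine dies, so nothing should be left sitting in a buffer.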

> > > Has anybody used GFS 6.0 on this kernel rev on x86_64 and if so, 
> > > how did you get it to work?
> > 
> > Yup.  I just used the rpms. Perhaps you compiled it with debugging options
> > enabled?  (I don't know if that would make the stack bigger)
> 
> All I did was 'rpmbuild --rebuild GFS-6.0.0-1.2.src.rpm'
> 
> That created the following rpms:
> GFS-6.0.0-1.2.x86_64.rpm
> GFS-debuginfo-6.0.0-1.2.x86_64.rpm
> GFS-devel-6.0.0-1.2.x86_64.rpm
> GFS-modules-6.0.0-1.2.x86_64.rpm
> GFS-modules-smp-6.0.0-1.2.x86_64.rpm
> 
> Of those, I have the following actually installed:
> 
> GFS-modules-smp-6.0.0-1.2
> GFS-6.0.0-1.2
> 
> 
> Do you have any build instructions for getting them to work properly?

What you did makes sense to me.


> Could something built into my running kernel cause this? I am building a
> new kernel from source right now to see if the binary kernel rpm I used
> had some sort of problem.
> 
> Could it be related to the HBA I am using as well?

If it is a stack overflow, then yes, it _could_ be related, but I'm not
going to blame that just yet ;)


> Thanks!
> 
> Micah
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster

-- 
Adam Manthei  <amanthei at redhat.com>



