SMP Kernel Crash
Wade Hampton
wade.hampton at nsc1.net
Mon Jan 12 15:17:00 UTC 2004
I just experienced a kernel crash on an SMP machine running
FC 1.0. Prior to the lockup, this machine had been up for
over 25 days without problems under a moderate load
(moving many GB of data to/from NFS and to/from SMB).
System info:
Kernel 2.4.20-1115 (FC 1.0 stock kernel)
Supermicro dual Xeon 2.2 GHz, Hyper-Threading not enabled in BIOS
Software RAID (two 120 GB hard disks)
Dual 1 Gb Ethernet
- one has several NFS- and SMB-mounted partitions (read and write)
- one has an NFS partition (Solaris 7 server)
2 GB RAM
The machine had both NFS and SMB mounts, but the NFS
server was down at the time (cable removed). Also, I ran
df as a regular user yesterday and left it up.
This morning, the machine was locked up and would only
respond to pings. I could not log in, so I had to hard reboot.
/var/log/messages reported:
... smb_request: result -104, setting invalid
... smb_retry: successful, new pid=9141, generation =2
This was repeated every hour, with generation 3, 4, 5, then 6.
That was the last message in /var/log/messages.
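
To pull these messages out of a log and see how the generation
counter advances, something like the following could be used. It is
only a minimal sketch: the patterns are based on the two lines quoted
above, the exact spacing in real logs is an assumption, and
summarize_smb_events is a made-up helper name, not part of any tool.
On Linux, errno 104 is ECONNRESET (connection reset by peer).

```python
import re

# Patterns follow the two message shapes quoted in this post; the exact
# spacing around "=" in real logs is an assumption.
REQUEST_RE = re.compile(r"smb_request: result (-?\d+), setting invalid")
RETRY_RE = re.compile(
    r"smb_retry: successful, new pid=(\d+), generation\s*=\s*(\d+)"
)


def summarize_smb_events(lines):
    """Return (errors, retries) found in an iterable of log lines.

    errors  -- list of integer result codes from smb_request lines
    retries -- list of (pid, generation) tuples from smb_retry lines
    """
    errors = []
    retries = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            errors.append(int(match.group(1)))
            continue
        match = RETRY_RE.search(line)
        if match:
            retries.append((int(match.group(1)), int(match.group(2))))
    return errors, retries


if __name__ == "__main__":
    # Reading /var/log/messages usually requires root.
    with open("/var/log/messages", errors="replace") as log:
        errors, retries = summarize_smb_events(log)
    print("smb_request result codes:", errors)
    print("smb_retry (pid, generation):", retries)
```
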
I found two bug reports on kernel lockups, but from what they say,
this is still an open problem (the last messages are dated 1/8).
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=109497
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=113148
Note: I am loading the latest kernel and will retry, but I
really need a STABLE box....
Questions:
1) Should I move to RH Enterprise?
2) Should I use a stock 2.4.24 kernel (all I need is the basics:
software RAID, e1000, NFS, Samba, CD-ROM)?
3) Do you think that the latest kernel will fix it?
4) Any help on how to test this (e.g., Stress?)?
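
As a possible starting point for question 4, a small script along
these lines could exercise a mounted directory by repeatedly writing a
file, reading it back, and comparing checksums. This is a sketch under
assumptions, not a proven reproducer: stress_mount, the scratch file
name, and the default sizes are all made up for illustration, and the
read-back may be served from the client's cache rather than the server.

```python
import hashlib
import os


def stress_mount(directory, rounds=10, size_bytes=1 << 20):
    """Write, read back, verify, and delete a scratch file `rounds` times.

    Returns the number of rounds in which the data read back matched
    the data written. `directory` would be a path on the NFS or SMB
    mount under test; the file name below is hypothetical.
    """
    path = os.path.join(directory, "stress_test.tmp")
    matched = 0
    for _ in range(rounds):
        data = os.urandom(size_bytes)
        expected = hashlib.sha256(data).hexdigest()

        # Write the block and ask the OS to push it out to the server.
        with open(path, "wb") as out:
            out.write(data)
            out.flush()
            os.fsync(out.fileno())

        # Read it back and compare checksums.
        with open(path, "rb") as src:
            actual = hashlib.sha256(src.read()).hexdigest()

        os.remove(path)
        if actual == expected:
            matched += 1
    return matched
```

Running it in a loop against each mount at the same time, for example
stress_mount("/mnt/nfs", rounds=1000) with /mnt/nfs standing in for a
real mount point, would come closer to the mixed NFS and SMB load
described above.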
Cheers,
--
Wade Hampton