hard locks / high memory
- From: Pete Huckelba <redhat stata com>
- To: enigma-list redhat com
- Subject: hard locks / high memory
- Date: Tue, 05 Nov 2002 08:12:10 -0600
I have three boxes and two problems (one serious problem, partially
resolved, and one question). All three boxes are completely up2date and
running 2.4.18-17.7 kernels.
The first problem manifests itself when the enigma boxes lock up
completely: they accept no keyboard or mouse input and do not respond to
ping or to any TCP requests. User activity at the time of a lock-up could
be anything from vi'ing a file to browsing the web. It took forever to
track down the problem since the boxes would crash in the user's office,
but when I tried to replicate the behavior after moving the box to my
office, it behaved like a dream. While it seemed to be completely random,
and viewing the system-logs did not lend much to diagnosing the problem, I
think I may have tracked it down. Both machines were connected to a 10
megabit hub (different hubs, different offices, different segments of the
network). One machine had a win2k box on its hub; the other had two Sparc
boxes (SunOS 5.8 and SunOS 5.1) on its hub. Sometimes data would move
through eth0 fine; other times the machine would lock. A snippet from one
of the kernel logs shows:
Nov 4 16:42:14 sundown kernel: nfs: server marta OK
Nov 4 16:42:14 sundown last message repeated 3 times
Nov 4 17:09:54 sundown kernel: eepro100: wait_for_cmd_done timeout!
Nov 4 17:10:00 sundown last message repeated 16 times
Nov 4 17:10:04 sundown kernel: nfs: server marta not responding, still trying
Nov 4 17:10:04 sundown kernel: eepro100: wait_for_cmd_done timeout!
Nov 4 17:10:04 sundown kernel: nfs: server marta not responding, still trying
Nov 4 17:10:04 sundown kernel: eepro100: wait_for_cmd_done timeout!
Nov 4 17:10:04 sundown kernel: nfs: server marta not responding, still trying
Nov 4 17:10:04 sundown kernel: eepro100: wait_for_cmd_done timeout!
Nov 4 17:10:04 sundown kernel: nfs: server marta not responding, still trying
Nov 4 17:10:04 sundown kernel: eepro100: wait_for_cmd_done timeout!
Nov 4 17:10:40 sundown last message repeated 25 times
Nov 4 17:11:12 sundown last message repeated 13 times
Nov 4 17:11:14 sundown kernel: NETDEV WATCHDOG: eth0: transmit timed out
Nov 4 17:11:14 sundown kernel: eth0: Transmit timed out: status 0050 0cf0 at 17683/17743 command 000c0000.
Nov 4 17:11:23 sundown kernel: nfs: server marta OK
Shortly after this, the machine locked hard. The other box would show a
similar entry, with smb activity shortly before a lock-up. I moved the
7.2 and Sparc boxes to a 100 megabit hub, and moved the other 7.2 box off
its hub and directly onto the 100 megabit network, and everything seems to
be ok. While the hardware in the boxes is completely different, from the
motherboard to the mouse, they do share one common factor: an lsmod shows
that both use the Intel eepro100 kernel module. With nothing else left to
blame, my supposition is that the eepro100 driver is at fault. Has anyone
else noticed similar behavior in this or other versions of RH, or in other
distros? A fairly extensive Google search did not turn up anything
remotely similar to my problem, but with the machines having been up for
over 16 hours under high network and CPU load with no problems, I am
fairly confident in this diagnosis. Questions, comments, and other
suppositions are welcome...
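For anyone who wants to check their own logs for the same pattern, here is a
minimal sketch that tallies kernel messages in a syslog excerpt like the one
above. It assumes the classic "Mon D HH:MM:SS host message" layout, and it
assumes that "last message repeated N times" means N further copies of the
line just before it. The function name count_messages and the SAMPLE text
are only illustrative; the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Classic syslog layout: "Nov 4 17:09:54 sundown kernel: eepro100: ..."
LINE_RE = re.compile(r"^(\w{3})\s+(\d+)\s+(\d\d:\d\d:\d\d)\s+(\S+)\s+(.*)$")
# syslogd collapses duplicates into "last message repeated N times".
REPEAT_RE = re.compile(r"^last message repeated (\d+) times?$")


def count_messages(lines):
    """Count how often each message occurs, expanding 'repeated N times'.

    Assumption: a 'repeated' line adds N occurrences of the message that
    immediately precedes it. Lines that do not start with a timestamp
    (for example, wrapped continuation lines) are skipped.
    """
    counts = Counter()
    previous = None
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if not match:
            continue
        message = match.group(5)
        repeat = REPEAT_RE.match(message)
        if repeat:
            if previous is not None:
                counts[previous] += int(repeat.group(1))
            continue
        # Drop the "kernel: " tag so the driver name leads the message.
        if message.startswith("kernel: "):
            message = message[len("kernel: "):]
        counts[message] += 1
        previous = message
    return counts


SAMPLE = """\
Nov 4 17:09:54 sundown kernel: eepro100: wait_for_cmd_done timeout!
Nov 4 17:10:00 sundown last message repeated 16 times
Nov 4 17:10:04 sundown kernel: nfs: server marta not responding, still trying
Nov 4 17:10:04 sundown kernel: eepro100: wait_for_cmd_done timeout!
Nov 4 17:11:14 sundown kernel: NETDEV WATCHDOG: eth0: transmit timed out
"""

for message, count in count_messages(SAMPLE.splitlines()).most_common():
    print(count, message)
```

Running the same function over a full /var/log/messages (for example,
count_messages(open(path)) with your own path) should show whether the
eepro100 timeouts cluster just before each lock-up.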
The next "problem" is more of a question. Has anyone successfully
recompiled a kernel with high-mem support? I am looking for a way to
exceed the 2GB user-space limit imposed by default kernels. Google searches
have turned up cases where the system recognized 8GB or more, but I have
not found any case where someone was able to malloc more than 2GB. Is this
possible and worth the experiment, and has anyone been successful?
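To make the experiment concrete, here is a rough sketch of a probe that
measures how much a single process can map, which is a separate number from
how much RAM the system recognizes. It is an assumption-laden stand-in, not
a real malloc test: it uses anonymous mmap from Python instead of malloc,
and it assumes that if a mapping of some size succeeds, every smaller size
succeeds too. The names try_map, largest_mapping and probe_real, and the
simulated 3GB-ish limit, are hypothetical and only for illustration.

```python
import mmap

GIB = 1024 ** 3


def try_map(size):
    """Return True if an anonymous mapping of `size` bytes can be created."""
    try:
        mapping = mmap.mmap(-1, size)
    except (OSError, ValueError, OverflowError, MemoryError):
        return False
    mapping.close()
    return True


def largest_mapping(can_allocate, low, high, step):
    """Binary-search the largest size in [low, high] that can_allocate accepts.

    The result is a known-good size within `step` bytes of the real limit
    (or `high` itself if even that succeeds). Returns 0 if `low` fails.
    Assumption: success is monotone in the requested size.
    """
    if not can_allocate(low):
        return 0
    if can_allocate(high):
        return high
    while high - low > step:
        mid = (low + high) // 2
        if can_allocate(mid):
            low = mid
        else:
            high = mid
    return low


def probe_real(upper=64 * GIB):
    """Probe the running process. Pages are never touched, so this mostly
    exercises address space and the kernel's overcommit policy."""
    return largest_mapping(try_map, mmap.PAGESIZE, upper, mmap.PAGESIZE)


# Demonstration against a simulated limit instead of the real machine.
simulated_limit = 3 * GIB - 12345  # hypothetical per-process ceiling
found = largest_mapping(lambda size: size <= simulated_limit,
                        4096, 8 * GIB, 4096)
print("largest simulated mapping:", found, "bytes")
```

Calling probe_real() on a stock kernel and again on a high-mem kernel would
show whether the rebuild changes the per-process ceiling at all. The result
also depends on overcommit settings, so treat it as a rough indication only.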
Thanks,
Pete
--------------------------
Pete Huckelba
Stata Corporation
4905 Lakeway Drive
College Station, TX 77845
(979)696-4600