Diagnostic tool in Linux
Nifty Hat Mitch
mitch48 at sbcglobal.net
Wed Oct 20 07:10:42 UTC 2004
On Wed, Oct 20, 2004 at 10:39:57AM +0530, Rajiv wrote:
>
> Are there any opensource diagnostic tools in linux. My system is
> getting hanged for a frequency of 2 days. My system is working fine
> for 2 days after that it is getting hanged. I would like to do testing
> of harddisk, memory and all peripherals.
In addition to the iso utility disk image another poster pointed you
to, inspect the files in /var/log.
Linux and its device drivers are good diagnostic tools, provided you
look at the log files.
First double check the date and time.
Now place a marker in /var/log/messages thus:
logger XXXXX==Rajiv=XXXXX
Note that /var/log will contain a big handful of log files. Look at all
the files with current dates. Look for rapidly growing or monster
sized files (wtmp will be biggish).
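One way to spot the current and the monster sized files is to sort the
directory by modification time and by size. This is only a sketch: the
function names newest_logs and largest_logs and the LOGDIR variable are
made up for illustration, and it assumes the logs live under /var/log.

```shell
# Assumption: logs live under /var/log, as described in the post.
LOGDIR=${LOGDIR:-/var/log}

# List names in a directory, most recently modified first.
# $1 = directory, $2 = how many names to show (default 10).
newest_logs() {
    ls -t "$1" | head -n "${2:-10}"
}

# List names in a directory, largest first.
# $1 = directory, $2 = how many names to show (default 10).
largest_logs() {
    ls -S "$1" | head -n "${2:-10}"
}

if [ -d "$LOGDIR" ]; then
    echo "Most recently modified:"; newest_logs "$LOGDIR"
    echo "Largest:"; largest_logs "$LOGDIR"
fi
```

Running it twice a few minutes apart should show whether one file is
growing rapidly.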
Again inspect the /var/log/messages file to see that this XXXXX line
is there. You can do this "logger" trick any time you walk away from
the machine. Note the time stamp...
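When you come back to the machine, something like the following could
show everything logged since the last marker. It is a sketch, not a
polished tool: the function name since_marker and the LOGFILE and MARKER
variables are hypothetical, and it assumes syslog writes to
/var/log/messages and that you can read that file (usually as root).

```shell
# Assumptions: syslog writes to /var/log/messages and the marker text
# matches what was passed to logger.
LOGFILE=${LOGFILE:-/var/log/messages}
MARKER=${MARKER:-XXXXX==Rajiv=XXXXX}

# Print the log from the LAST line containing the marker to the end.
# If the marker never appears, the whole file is printed.
# $1 = log file, $2 = marker text (matched as a plain substring).
since_marker() {
    awk -v m="$2" '
        index($0, m) { buf = "" }      # newer marker found: start over
        { buf = buf $0 "\n" }          # remember every line since then
        END { printf "%s", buf }
    ' "$1"
}

if [ -r "$LOGFILE" ]; then
    since_marker "$LOGFILE" "$MARKER"
fi
```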
My experience is that Fedora will log many clues in the various log
files. Most of the hardware drivers will log hardware errors in a log
file for inspection. Thus you want to look at the log files,
and have a clue what time stamps are interesting.
What I would look for are the last messages in the various log files
that are close to the time that the machine appeared to lock up
or was reset.
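A small helper can print the tail end of several log files in one go.
This is a sketch under assumptions: the name show_tails is made up, and
the two file names in the example call are only guesses at what exists
on your box. Stick to text logs here; wtmp is binary and is better read
with the "last" command.

```shell
# Print the last N lines of each readable file, with a header per file.
# $1 = number of lines, remaining arguments = log files to inspect.
show_tails() {
    n=$1
    shift
    for f in "$@"; do
        [ -r "$f" ] || continue        # skip missing or unreadable files
        printf '== %s ==\n' "$f"
        tail -n "$n" "$f"
    done
}

# Assumption: these two files exist on the system; adjust the list.
show_tails 5 /var/log/messages /var/log/dmesg
```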
My guess is that the 'system' may still be running but you have
some other problem that is keeping you from interacting with it.
It is possible to place a marker in the log file every 15 min with
a simple script. You can leave this running in an iconified
terminal window.
while true
do
    echo sleeping
    sleep 900    # 900 seconds is 15 min.
    logger YYYYY-this-is-a-simple---is-the-log-alive-check-message
    echo "====$(date)"; sync
done
Now after you are forced to push the reset you can check to see if
this little script was still running.
Simply note the time when you push the reset button. If the last
YYYYY message was logged within 15 min of the reset, then the
display/keyboard was likely wedged while the system was still running.
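The check after a reset could be sketched like this. The function names
last_heartbeat_stamp and within_interval are hypothetical, and the
comparison works on epoch seconds that you supply yourself.

```shell
# Print the syslog time stamp (month, day, time) of the last YYYYY line.
# $1 = log file, e.g. /var/log/messages (an assumption about your setup).
last_heartbeat_stamp() {
    grep -F -- 'YYYYY-this-is-a-simple' "$1" | tail -n 1 |
        awk '{ print $1, $2, $3 }'
}

# Succeed if the reset came no more than one interval after the last
# heartbeat. $1 = heartbeat time, $2 = reset time, both in epoch seconds;
# $3 = interval in seconds (default 900, matching the sleep in the loop).
within_interval() {
    [ $(( $2 - $1 )) -ge 0 ] && [ $(( $2 - $1 )) -le "${3:-900}" ]
}
```

With GNU date, something like
date -d "$(last_heartbeat_stamp /var/log/messages)" +%s
should turn the stamp into epoch seconds. Syslog stamps carry no year,
so date assumes the current one. If within_interval succeeds, the
logging loop was alive right up to the reset.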
When diagnosing a problem, make no change that you cannot undo by
consulting your notes!
Some systems have power saving timers in the BIOS. In general you do
not want the BIOS to be putting the display, disk, or processor to
sleep without cooperation from the operating system. Review the setup
guide for the box and make a list of all the settings. If any of the
power saving bits are set, turn them off for now and make a note of the
change.
--
T o m M i t c h e l l
May your cup runneth over with goodness and mercy
and may your buffers never overflow.
More information about the fedora-list
mailing list