Diagnostic tool in Linux

Nifty Hat Mitch mitch48 at sbcglobal.net
Wed Oct 20 07:10:42 UTC 2004


On Wed, Oct 20, 2004 at 10:39:57AM +0530, Rajiv wrote:
> 
>         Are there any open source diagnostic tools for Linux? My system
>    hangs about every 2 days: it works fine for 2 days and then it
>    hangs. I would like to test the hard disk, memory, and all
>    peripherals.

In addition to the ISO utility disk image another poster pointed you
to, inspect the files in /var/log.

Linux and its device drivers are good diagnostic tools... if you look
at the log files.

First double-check the system date and time.
Now place a marker in /var/log/messages thus:

    logger XXXXX==Rajiv=XXXXX

Note that /var/log will contain a big handful of log files.  Look at all
the files with current dates.  Look for rapidly growing or monster
sized files (wtmp will be biggish).
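One way to spot the current and oversized files is sketched below.  The
function name is my own invention, and it assumes the usual "ls -l"
layout where the fifth column is the size in bytes.  Reading everything
under /var/log will normally need root.

```shell
#!/bin/sh
# Hypothetical helper: list files under a log directory that were
# modified within the last 24 hours, largest first.
# Assumes the fifth column of "ls -l" output is the size in bytes.
recent_logs_by_size() {
    dir=${1:-/var/log}
    find "$dir" -type f -mtime -1 -exec ls -l {} + 2>/dev/null |
        sort -k5,5nr |
        head -n 10
}
```

Run it as "recent_logs_by_size /var/log" and compare the output from one
day to the next to see which files are growing.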

Again inspect the /var/log/messages file to see that this XXXXX line
is there.  You can do this "logger" trick any time you walk away from
the machine.  Note the time stamp...
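A small sketch for checking that the marker landed.  The function name is
hypothetical, and the default path assumes your syslog writes to
/var/log/messages, which usually needs root to read.

```shell
#!/bin/sh
# Hypothetical helper: print the most recent line in a log file that
# contains the given marker text (matched literally, not as a regex).
last_marker() {
    marker=$1
    logfile=${2:-/var/log/messages}
    grep -F -- "$marker" "$logfile" | tail -n 1
}
```

For example, "last_marker XXXXX==Rajiv=XXXXX" should print the line that
logger just wrote, time stamp included.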

My experience is that Fedora will log many clues in the various log
files.  Most of the hardware drivers will log hardware errors in a log
file for inspection.  Thus you want to look at the log files, 
and know which time stamps are interesting.

What I would look for are the last messages in the various log files
that are close to the time that the machine appeared to lock up
or was reset.
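A rough sketch for pulling the last few lines out of every log at once.
The function name is made up for illustration, and the skip list only
covers the binary login records I know of (wtmp, btmp, lastlog); your
box may have other binary files in /var/log.

```shell
#!/bin/sh
# Hypothetical helper: show the last N lines of each regular file in a
# log directory, skipping binary login records that tail cannot show
# usefully.
tail_logs() {
    dir=${1:-/var/log}
    n=${2:-5}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case "${f##*/}" in
            wtmp|btmp|lastlog) continue ;;
        esac
        printf '== %s ==\n' "$f"
        tail -n "$n" "$f"
    done
}
```

After a reboot, "tail_logs /var/log 20" gives a quick view of what each
file recorded last.  Lines written after the reboot will be at the very
end, so look just above the boot messages.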

My guess is that the 'system' may still be running but you have
some other problem that is keeping you from interacting with it.

It is possible to place a marker in the log file every 15 min with
a simple script.   You can leave this running in an iconified
terminal window.

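  # Log a heartbeat marker every 15 minutes until interrupted.
  while true
  do
    echo sleeping
    sleep 900    # 900 seconds is 15 min.
    logger YYYYY-this-is-a-simple---is-the-log-alive-check-message
    echo "====$(date)"
    sync         # flush buffered writes to disk before the next sleep
  done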

Now after you are forced to push the reset button you can check to see
if this little script was still running.

Simply note the time when you push the reset button.  If the last
YYYYY message was logged within 15 min of when the system was reset,
then the display/keyboard was likely wedged and the system was still
running.
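The comparison can be sketched as below.  The function name is my own,
and it takes both times as seconds since the epoch so that the
arithmetic stays simple.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the last heartbeat marker was logged
# no more than LIMIT seconds (default 900, i.e. 15 min) before the reset.
# Both times are given as seconds since the epoch.
heartbeat_was_recent() {
    last=$1
    reset=$2
    limit=${3:-900}
    gap=$((reset - last))
    [ "$gap" -ge 0 ] && [ "$gap" -le "$limit" ]
}
```

With GNU date (an assumption, but it is what Fedora ships) you can turn
a traditional syslog time stamp into epoch seconds with something like
date -d "Oct 20 07:10:42" +%s.  Note that syslog lines of this style
carry no year, so date assumes the current one.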

When diagnosing, make no change that you cannot undo by consulting
your notes!

Some systems have power saving timers in the BIOS.  In general you do
not want the BIOS to be putting the display, disk, or processor to
sleep without cooperation from the operating system.  Review the setup
guide for the box and make a list of all the settings.  If any of the
power saving bits are set, turn them off for now and make a note of the
change.



-- 
	T o m  M i t c h e l l 
	May your cup runneth over with goodness and mercy
	and may your buffers never overflow.



