
Re: howto run badblocks using fedora rescue mode



> You might be better off using smartctl to have the disk scan itself.
> Modern disks remap bad blocks for you when possible. So it is possible
> for you to have a failing drive and not have badblocks spot any.
> You can usually have the disks do a self test while your system is running.

I have a bit of experience with this now, having been given two
laptops to fix with abused/damaged disks.

1) The short smartmon test is worthless.  Out of two dozen disks I've
   never seen one fail it.  The closest I've come is 3 disks that
   didn't respond to any commands, and there the error came from
   smartctl itself, not from the test.

2) The long smartmon test will flush out any pending bad-block
   errors.  Run this test and then look at the
   "Current_Pending_Sector" count.  This is a very important number.

3) Any non-zero count in either the remapped block count
   (Reallocated_Event_Count) or the pending remapped block count
   (Current_Pending_Sector) means the disk has a physical ding in it,
   probably from being dropped or bumped severely while running.

4) Any developed bad spots (see #3) are the kiss of death.  The disk
   is on its way out.  Back up and order a spare.  (This is also
   Google's finding, and they have a ton of disks to collect data
   from.)

5) Current_Pending_Sectors are a pain to have in that state.  They
   cause the long test to return errors until you force them to be
   remapped.  The simplest way to remap them is to force a write.  All
   of my bad blocks were in the free list, so a simple shell script to
   use up free space on the disk fixed them.  Ideally ext3 would have
   a security tool to forcefully clear the blocks on the free list,
   which would also have written the bad blocks and caused the disk
   blocks to be reallocated.
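
The free-space trick in #5 can be sketched roughly as follows.  This is
not the script I actually used, just a minimal reconstruction of the
idea: the function name fill_free_space and its optional size cap are
made up for illustration, and it assumes GNU dd (for bs=1M) and that
the pending sectors really are in unallocated blocks of the filesystem
you point it at.

```shell
#!/bin/sh
# Sketch only: write zeros into a filesystem's free space so the drive is
# forced to write (and, ideally, remap) pending sectors in unallocated blocks.
# fill_free_space and its arguments are hypothetical names, not a real tool.
#
# usage: fill_free_space DIR [MAX_MIB]
#   DIR      directory on the filesystem to fill
#   MAX_MIB  optional cap in MiB for a trial run; omit to fill until full
fill_free_space() {
    dir=$1
    max_mib=${2:-}
    tmpfile="$dir/fill-free-space.$$"

    if [ -n "$max_mib" ]; then
        # Capped trial run: write at most MAX_MIB mebibytes.
        dd if=/dev/zero of="$tmpfile" bs=1M count="$max_mib" 2>/dev/null
    else
        # Uncapped: dd is expected to stop with "No space left on device",
        # so its non-zero exit status is deliberately ignored.
        dd if=/dev/zero of="$tmpfile" bs=1M 2>/dev/null || true
    fi

    sync              # flush the zeros to the disk before deleting the file
    rm -f "$tmpfile"  # give the space back
}
```

Something like "fill_free_space /home" as root, on an otherwise idle
machine, would be the real run; "fill_free_space /tmp 16" only writes
16 MiB and is handy for checking the script first.  A full filesystem
will upset anything else that is trying to write at the time, so don't
do this on a busy box.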

Example of a failing disk with a "ding" in the platter.

     [root@acidophilus ~]# smartctl -a /dev/sda
     ...
     196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       11
     197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
     ...

Before the remapping I had 7 bad blocks on one list and 6 on the
other.  The long test would always fail citing the bad blocks.
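
If you want to check the two counters from a script rather than by eye,
something along these lines should work.  It is only a sketch: the
function names smart_raw_value and smart_has_bad_spots are made up, and
it assumes the attribute table has the same column layout as the output
shown above, with the raw value in the tenth column.

```shell
#!/bin/sh
# Sketch only: pull raw SMART attribute values out of "smartctl -A" output.
# Both function names are hypothetical; the column layout is assumed to match
# the usual smartctl attribute table (name in column 2, raw value in column 10).

# usage: smartctl -A /dev/sda | smart_raw_value Current_Pending_Sector
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Reads an attribute table on stdin.  Exits 0 if either the reallocated
# or the pending counter is non-zero, 1 if both are zero or missing.
smart_has_bad_spots() {
    table=$(cat)
    for attr in Reallocated_Event_Count Current_Pending_Sector; do
        raw=$(printf '%s\n' "$table" | smart_raw_value "$attr")
        if [ "${raw:-0}" -ne 0 ]; then
            return 0
        fi
    done
    return 1
}

# Typical sequence on a live system (run as root):
#   smartctl -t long /dev/sda       # start the long self-test
#   smartctl -l selftest /dev/sda   # later: see whether it passed
#   smartctl -A /dev/sda | smart_has_bad_spots && echo "back up now"
```

The long test runs inside the drive and can take an hour or more, so
the self-test log is only worth reading after the estimated completion
time that smartctl prints when the test is started.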

-wolfgang
-- 
Wolfgang S. Rupprecht              http://www.full-steam.org/  (ipv6-only)
         You may need to config 6to4 to see the above pages.

