
Re: do I have to worry about these disk errors?



On Sun, 2009-06-07 at 09:31 -0500, Mike Chambers wrote:
> On Sun, 2009-06-07 at 16:41 +0300, Jussi Lehtola wrote:
> 
> > You can verify the diagnosis with
> > 
> > # smartctl -A /dev/sda
> > 
> > If you have a nonzero number in one or more of the following fields
> >   Reallocated_Sector_Ct
> >   Current_Pending_Sector
> >   Offline_Uncorrectable
> > then your hard drive is failing. Especially if you have something in  
> > the latter two you should panic.
> 
> I was getting a report myself via the GUI applet that my sd was failing
> (this damn thing isn't even a year old), so I ran the command above.
> 
> === START OF READ SMART DATA SECTION ===
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   105   094   006    Pre-fail  Always       -       9758257
>   3 Spin_Up_Time            0x0003   096   094   000    Pre-fail  Always       -       0
>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       16
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
>   7 Seek_Error_Rate         0x000f   075   060   030    Pre-fail  Always       -       39183414
>   9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1247
>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       15
> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
> 189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
> 190 Airflow_Temperature_Cel 0x0022   061   052   045    Old_age   Always       -       39 (Lifetime Min/Max 37/41)
> 194 Temperature_Celsius     0x0022   039   048   000    Old_age   Always       -       39 (0 20 0 0)
> 195 Hardware_ECC_Recovered  0x001a   065   061   000    Old_age   Always       -       52230011
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
> 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
> 
> Hopefully that didn't get mangled by line wrapping.  Basically I see a 1
> where the Reallocated_Sector_Ct is.  Is this thing going bad?
> 
> -- 
> Mike Chambers
> Madisonville, KY
> 
> Fedora Project - Bugzapper, Tester, User, etc..
> miketc302 fedoraproject org
> 

That does not look good.  But I don't think that the problem originates
with the drive.  In fact, it looks like you have a problem with your
case design - the disk is claiming to be running a lot hotter than it
was designed for.  The resulting out-of-spec expansion of all of the
disk components is probably the real cause of all of the other errors.

Fix the heat problem.  If the drive is from a manufacturer that offers a
downloadable low-level format tool, you may then be able to do a
low-level reformat of it, and it may still be usable.  Otherwise,
it is essentially a boat anchor, as you can no longer trust it.  It may
be OK if run cooler, but how are you going to tell?
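
For anyone who would rather script the check Jussi describes than read
the table by eye, here is a minimal sketch in Python.  It only parses
text in the `smartctl -A` layout shown earlier in this thread and reports
which of the three named attributes have a nonzero raw value.  The
function name `nonzero_watched` and the `SAMPLE` string are my own
illustration, not part of smartmontools, and the column positions are
assumed from the output quoted above.

```python
import re

# Attributes named in the quoted advice; a nonzero raw value is the warning sign.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def nonzero_watched(smartctl_output, watched=WATCHED):
    """Return {attribute_name: raw_value} for watched attributes whose raw value is > 0.

    Assumes the ten-column attribute table printed by `smartctl -A`:
    ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED,
    WHEN_FAILED, RAW_VALUE.
    """
    found = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not an attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name = fields[1]
        if name not in watched:
            continue
        # RAW_VALUE may carry trailing notes; take the leading integer only.
        match = re.match(r"\d+", fields[9])
        if match and int(match.group()) > 0:
            found[name] = int(match.group())
    return found


# Three rows copied from the output posted in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

if __name__ == "__main__":
    print(nonzero_watched(SAMPLE))  # {'Reallocated_Sector_Ct': 1}
```

To use it on a live system you would feed it the real output instead of
SAMPLE, for example by saving `smartctl -A /dev/sda` to a file as root
and reading that file in.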


