Gentoo Forums
hard disk failing???

 
sobers_2002
Veteran
Joined: 16 Mar 2004
Posts: 1128

Posted: Fri Apr 21, 2006 1:41 pm    Post subject: hard disk failing???

I got the following output from smartctl -l error /dev/hda:

Code:
smartctl version 5.1-18 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 45 (device log contains only the most recent five errors)
   CR = Command Register [HEX]
   FR = Features Register [HEX]
   SC = Sector Count Register [HEX]
   SN = Sector Number Register [HEX]
   CL = Cylinder Low Register [HEX]
   CH = Cylinder High Register [HEX]
   DH = Device/Head Register [HEX]
   DC = Device Command Register [HEX]
   ER = Error register [HEX]
   ST = Status register [HEX]
Timestamp = decimal seconds since the previous disk power-on.
Note: timestamp "wraps" after 2^32 msec = 49.710 days.

Error 45 occurred at disk power-on lifetime: 7358 hours
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 cb 41 84 e4

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Timestamp  Command/Feature_Name
  -- -- -- -- -- -- -- --   ---------  --------------------
  c8 00 20 c6 41 84 e4 00   26690.754  READ DMA
  c8 00 28 be 41 84 e4 00   26687.250  READ DMA
  c8 00 30 b6 41 84 e4 00   26683.696  READ DMA
  c8 00 38 ae 41 84 e4 00   26680.203  READ DMA
  c8 00 40 a6 41 84 e4 00   26676.687  READ DMA

Error 44 occurred at disk power-on lifetime: 7358 hours
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 cb 41 84 e4

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Timestamp  Command/Feature_Name
  -- -- -- -- -- -- -- --   ---------  --------------------
  c8 00 28 be 41 84 e4 00   26687.250  READ DMA
  c8 00 30 b6 41 84 e4 00   26683.696  READ DMA
  c8 00 38 ae 41 84 e4 00   26680.203  READ DMA
  c8 00 40 a6 41 84 e4 00   26676.687  READ DMA
  c8 00 48 9e 41 84 e4 00   26673.202  READ DMA

Error 43 occurred at disk power-on lifetime: 7358 hours
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 cb 41 84 e4

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Timestamp  Command/Feature_Name
  -- -- -- -- -- -- -- --   ---------  --------------------
  c8 00 30 b6 41 84 e4 00   26683.696  READ DMA
  c8 00 38 ae 41 84 e4 00   26680.203  READ DMA
  c8 00 40 a6 41 84 e4 00   26676.687  READ DMA
  c8 00 48 9e 41 84 e4 00   26673.202  READ DMA
  c8 00 50 96 41 84 e4 00   26669.684  READ DMA

Error 42 occurred at disk power-on lifetime: 7358 hours
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 cb 41 84 e4

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Timestamp  Command/Feature_Name
  -- -- -- -- -- -- -- --   ---------  --------------------
  c8 00 38 ae 41 84 e4 00   26680.203  READ DMA
  c8 00 40 a6 41 84 e4 00   26676.687  READ DMA
  c8 00 48 9e 41 84 e4 00   26673.202  READ DMA
  c8 00 50 96 41 84 e4 00   26669.684  READ DMA
  c8 00 58 8e 41 84 e4 00   26666.198  READ DMA

Error 41 occurred at disk power-on lifetime: 7358 hours
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 cb 41 84 e4

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Timestamp  Command/Feature_Name
  -- -- -- -- -- -- -- --   ---------  --------------------
  c8 00 40 a6 41 84 e4 00   26676.687  READ DMA
  c8 00 48 9e 41 84 e4 00   26673.202  READ DMA
  c8 00 50 96 41 84 e4 00   26669.684  READ DMA
  c8 00 58 8e 41 84 e4 00   26666.198  READ DMA
  c8 00 60 86 41 84 e4 00   26662.672  READ DMA



The health status check reported:
Code:
smartctl version 5.1-18 Copyright (C) 2002-3 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


So what exactly is happening here?

Thanks,
Saurabh
_________________
Pdict - dockable dictionary client for linux
FREE97WIN: Use this code on Dreamhost and you get $97 off !!
eldad
Retired Dev
Joined: 26 Jan 2006
Posts: 45
Location: Israel

Posted: Sun Apr 23, 2006 1:56 pm

These errors could signal that the drive is about to fail, even though it is still operational for now.

If you value your data, back up regularly, and if you don't like surprises, I advise you to mirror your hard drive with another drive in RAID 1.
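
For anyone who wants to see which sector the error log is pointing at, the register dump can be decoded by hand. Below is a minimal Python sketch, assuming the drive uses 28-bit LBA addressing (READ DMA with opcode c8 is a 28-bit command) and the usual ATA bit layout for the error and status registers. The names `lba28`, `flag_names`, and `parse_hex` are made up for illustration and are not part of smartmontools.

```python
# Decode the registers from one entry of the SMART error log.
# Assumption: 28-bit LBA addressing (READ DMA, opcode 0xc8, is a 28-bit command).
# Function and table names are illustrative, not part of smartmontools.

ERROR_BITS = {
    0x80: "ICRC",  # interface CRC error
    0x40: "UNC",   # uncorrectable data error
    0x10: "IDNF",  # sector ID not found
    0x04: "ABRT",  # command aborted
}
STATUS_BITS = {
    0x80: "BSY",   # busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x01: "ERR",   # error
}


def lba28(sn, cl, ch, dh):
    """Combine SN, CL, CH and the low nibble of DH into a 28-bit LBA."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def flag_names(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for bit, name in table.items() if value & bit]


def parse_hex(fields):
    """Turn a string such as '40 51 00' into a list of integers."""
    return [int(field, 16) for field in fields.split()]


# Registers after error 45: ER ST SC SN CL CH DH
er, st, sc, sn, cl, ch, dh = parse_hex("40 51 00 cb 41 84 e4")
failed_lba = lba28(sn, cl, ch, dh)

# Last command before error 45: CR FR SC SN CL CH DH
cr, fr, count, c_sn, c_cl, c_ch, c_dh = parse_hex("c8 00 20 c6 41 84 e4")
start_lba = lba28(c_sn, c_cl, c_ch, c_dh)
count = count or 256  # a sector count of 0 means 256 in 28-bit commands
in_range = start_lba <= failed_lba < start_lba + count

print(f"error flags:  {flag_names(er, ERROR_BITS)}")
print(f"status flags: {flag_names(st, STATUS_BITS)}")
print(f"failed LBA:   {failed_lba} (0x{failed_lba:07x})")
print(f"read request: {count} sectors from LBA {start_lba}, contains failed LBA: {in_range}")
```

With the values from the posted log this decodes to an uncorrectable data error (UNC) at LBA 75776459. All five logged errors carry the same registers, and that LBA lies inside the range requested by each preceding READ DMA, which would be consistent with a single unreadable sector being hit repeatedly.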
eldad
Retired Dev
Joined: 26 Jan 2006
Posts: 45
Location: Israel

Posted: Sun Apr 23, 2006 2:11 pm

And although RAIDing is a good idea anyway, can you post your attributes?

Code:
smartctl -A /dev/hda


Here are mine:

Code:

smartctl version 5.33 [i686-pc-linux-gnu] Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   170   165   021    Pre-fail  Always       -       4025
  4 Start_Stop_Count        0x0032   100   100   040    Old_age   Always       -       205
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       10707
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0013   100   100   051    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       205
194 Temperature_Celsius     0x0022   111   103   000    Old_age   Always       -       39
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0012   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0


You should keep an eye on Raw_Read_Error_Rate and Reallocated_Sector_Ct; those tell you about physical platter problems.
Some attributes are specific to vendors and models, but I think that IDs 1-99 are a shared standard.
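
If you want to watch those attributes from a script rather than by eye, the table can be split into fields. The following is a rough Python sketch, assuming the 10-column layout shown in this thread (other smartctl versions or drives may print something different). The function name `parse_attributes` and the `COLUMNS` list are hypothetical helpers, not anything smartctl provides.

```python
# Parse rows of "smartctl -A" output into dictionaries.
# Assumption: the 10-column layout shown in this thread.

COLUMNS = ["id", "name", "flag", "value", "worst", "thresh",
           "type", "updated", "when_failed", "raw"]


def parse_attributes(text):
    """Return one dict per attribute row found in `text`."""
    rows = []
    for line in text.splitlines():
        # Split into at most 10 fields so a raw value with spaces stays whole.
        parts = line.strip().split(None, 9)
        if len(parts) != 10 or not parts[0].isdigit():
            continue  # skip the header and anything that is not an attribute row
        row = dict(zip(COLUMNS, parts))
        for key in ("id", "value", "worst", "thresh"):
            row[key] = int(row[key])
        rows.append(row)
    return rows


# Two rows copied from the table above.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
"""

rows = parse_attributes(sample)
for attr in rows:
    print(attr["id"], attr["name"], attr["value"], attr["thresh"], attr["raw"])
```

The raw value is kept as a string on purpose, since its format is vendor specific and not always a plain number.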
sobers_2002
Veteran
Joined: 16 Mar 2004
Posts: 1128

Posted: Sun Apr 23, 2006 2:34 pm

Code:

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   073   069   034    Pre-fail  Always       -       184416223
  3 Spin_Up_Time            0x0003   070   070   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       2
  5 Reallocated_Sector_Ct   0x0033   098   098   036    Pre-fail  Always       -       80
  7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always       -       323433140
  9 Power_On_Hours          0x0032   066   066   000    Old_age   Always       -       29961
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       323
194 Temperature_Celsius     0x0022   048   058   000    Old_age   Always       -       48
195 Hardware_ECC_Recovered  0x001a   073   069   000    Old_age   Always       -       184416223
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       4
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0


Here they are.
So how do things look?
eldad
Retired Dev
Joined: 26 Jan 2006
Posts: 45
Location: Israel

Posted: Sun Apr 23, 2006 2:46 pm

Code:

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   073   069   034    Pre-fail  Always       -       184416223
  5 Reallocated_Sector_Ct   0x0033   098   098   036    Pre-fail  Always       -       80
  7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always       -       323433140
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       4


I left in only the reasons to be worried.
First, the Raw_Read_Error_Rate is high. Reallocated_Sector_Ct says you have 80 bad sectors, which were automatically reallocated by the drive, and Offline_Uncorrectable reports 4 sectors that the drive could not read during its offline scan.

I'd say replace this drive, or at the very least put another drive in RAID 1 with it.
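
This also hints at why the overall health check can say PASSED while the attributes look worrying. As far as I understand it, the drive only reports failure once a normalized VALUE drops to its THRESH, and none of the posted attributes has done that yet. The Python sketch below checks both views using values copied from the full attribute table posted earlier. The helper names are hypothetical, and treating the raw values of attributes 5, 197 and 198 as plain sector counts is an assumption that holds for many drives but is vendor specific.

```python
# Values copied from the attribute table posted earlier in the thread:
# (id, name, type, value, thresh, raw)
posted = [
    (1,   "Raw_Read_Error_Rate",    "Pre-fail", 73,  34, 184416223),
    (5,   "Reallocated_Sector_Ct",  "Pre-fail", 98,  36, 80),
    (7,   "Seek_Error_Rate",        "Pre-fail", 84,  30, 323433140),
    (197, "Current_Pending_Sector", "Old_age",  100, 0,  4),
    (198, "Offline_Uncorrectable",  "Old_age",  100, 0,  4),
]

# Assumption: for these IDs the raw value is a plain count of sectors.
SECTOR_COUNTERS = {5, 197, 198}


def at_or_below_threshold(attrs):
    """Names of attributes whose normalized value has reached the threshold."""
    return [name for _, name, _, value, thresh, _ in attrs
            if thresh > 0 and value <= thresh]


def nonzero_sector_counters(attrs):
    """Sector-count attributes with a raw value above zero."""
    return {name: raw for attr_id, name, _, _, _, raw in attrs
            if attr_id in SECTOR_COUNTERS and raw > 0}


failing = at_or_below_threshold(posted)
suspicious = nonzero_sector_counters(posted)

print("at or below threshold:", failing)
print("sector counters above zero:", suspicious)
```

Nothing is at or below its threshold here, yet all three sector counters are above zero, so the raw counts are worth watching even when the health check passes.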
sobers_2002
Veteran
Joined: 16 Mar 2004
Posts: 1128

Posted: Sun Apr 23, 2006 3:07 pm

Thanks! Will do that.
All times are GMT