Gentoo Forums
smartctl: so many errors - bad disk or bad BIOS config?
VinzC
Watchman


Joined: 17 Apr 2004
Posts: 5098
Location: Dark side of the mood

Posted: Sat Mar 10, 2007 4:21 pm    Post subject: smartctl: so many errors - bad disk or bad BIOS config?

Hi.

I've just run smartctl -t long on one of my hard disks, which has given me errors before. It's a Maxtor 80GB ATA 133. Here's the smartctl log:

smartctl -a /dev/hdb:
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Maxtor DiamondMax Plus 9 family
Device Model:     Maxtor 6Y080L0
Serial Number:    Y2Q2L20E
Firmware Version: YAR41BW0
User Capacity:    81,964,302,336 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 0
Local Time is:    Sat Mar 10 17:05:18 2007 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)   Offline data collection activity
               was completed without error.
               Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 113)   The previous self-test completed having
               the read element of the test failed.
Total time to complete Offline
data collection:        ( 241) seconds.
Offline data collection
capabilities:           (0x5b) SMART execute Offline immediate.
               Auto Offline data collection on/off support.
               Suspend Offline collection upon new
               command.
               Offline surface scan supported.
               Self-test supported.
               No Conveyance Self-test supported.
               Selective Self-test supported.
SMART capabilities:            (0x0003)   Saves SMART data before entering
               power-saving mode.
               Supports SMART auto save timer.
Error logging capability:        (0x01)   Error logging supported.
               No General Purpose Logging support.
Short self-test routine
recommended polling time:     (   2) minutes.
Extended self-test routine
recommended polling time:     (  37) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   224   224   063    Pre-fail  Always       -       9135
  4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always       -       776
  5 Reallocated_Sector_Ct   0x0033   252   252   063    Pre-fail  Always       -       14
  6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline      -       0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0027   251   248   187    Pre-fail  Always       -       57274
  9 Power_On_Minutes        0x0032   240   240   000    Old_age   Always       -       219h+35m
 10 Spin_Retry_Count        0x002b   253   252   157    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   252   252   000    Old_age   Always       -       782
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0032   253   253   000    Old_age   Always       -       38
195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always       -       3572
196 Reallocated_Event_Count 0x0008   243   243   000    Old_age   Offline      -       10
197 Current_Pending_Sector  0x0008   252   252   000    Old_age   Offline      -       14
198 Offline_Uncorrectable   0x0008   238   238   000    Old_age   Offline      -       15
199 UDMA_CRC_Error_Count    0x0008   199   199   000    Old_age   Offline      -       0
200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   253   247   000    Old_age   Always       -       0
202 TA_Increase_Count       0x000a   253   250   000    Old_age   Always       -       0
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       0
204 Shock_Count_Write_Opern 0x000a   253   252   000    Old_age   Always       -       0
205 Shock_Rate_Write_Opern  0x000a   253   252   000    Old_age   Always       -       0
207 Spin_High_Current       0x002a   253   252   000    Old_age   Always       -       0
208 Spin_Buzz               0x002a   253   252   000    Old_age   Always       -       0
209 Offline_Seek_Performnce 0x0024   192   187   000    Old_age   Offline      -       0
 99 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0
100 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0
101 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
Warning: ATA error count 1997 inconsistent with error log pointer 5

ATA Error Count: 1997 (device log contains only the most recent five errors)
   CR = Command Register [HEX]
   FR = Features Register [HEX]
   SC = Sector Count Register [HEX]
   SN = Sector Number Register [HEX]
   CL = Cylinder Low Register [HEX]
   CH = Cylinder High Register [HEX]
   DH = Device/Head Register [HEX]
   DC = Device Command Register [HEX]
   ER = Error register [HEX]
   ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1997 occurred at disk power-on lifetime: 4267 hours (177 days + 19 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 17 07 2f f6  Error: UNC 8 sectors at LBA = 0x062f0717 = 103745303

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 17 07 2f f6 08      01:05:23.024  READ DMA
  ca 00 08 67 07 2f f6 08      01:05:23.008  WRITE DMA
  ca 00 08 47 07 2f f6 08      01:05:23.008  WRITE DMA
  ca 00 20 f7 06 2f f6 08      01:05:23.008  WRITE DMA
  ca 00 30 47 06 2f f6 08      01:05:23.008  WRITE DMA

Error 1996 occurred at disk power-on lifetime: 4267 hours (177 days + 19 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 7f 07 2f f6  Error: UNC 5 sectors at LBA = 0x062f077f = 103745407

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 7f 07 2f f6 08      01:05:21.840  READ DMA
  c8 00 08 7f 07 2f f6 08      01:05:20.688  READ DMA
  c8 00 08 7f 07 2f f6 08      01:05:19.504  READ DMA
  c8 00 08 47 07 2f f6 08      01:05:19.488  READ DMA
  c8 00 18 5f 65 2f f6 08      01:05:19.488  READ DMA

Error 1995 occurred at disk power-on lifetime: 4267 hours (177 days + 19 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 7f 07 2f f6  Error: UNC 5 sectors at LBA = 0x062f077f = 103745407

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 7f 07 2f f6 08      01:05:20.688  READ DMA
  c8 00 08 7f 07 2f f6 08      01:05:19.504  READ DMA
  c8 00 08 47 07 2f f6 08      01:05:19.488  READ DMA
  c8 00 18 5f 65 2f f6 08      01:05:19.488  READ DMA
  c8 00 80 bf 57 2f f6 08      01:05:19.472  READ DMA

Error 1994 occurred at disk power-on lifetime: 4267 hours (177 days + 19 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 05 7f 07 2f f6  Error: UNC 5 sectors at LBA = 0x062f077f = 103745407

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 7f 07 2f f6 08      01:05:19.504  READ DMA
  c8 00 08 47 07 2f f6 08      01:05:19.488  READ DMA
  c8 00 18 5f 65 2f f6 08      01:05:19.488  READ DMA
  c8 00 80 bf 57 2f f6 08      01:05:19.472  READ DMA
  c8 00 20 9f 57 2f f6 08      01:05:19.472  READ DMA

Error 1993 occurred at disk power-on lifetime: 4267 hours (177 days + 19 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 17 07 2f f6  Error: UNC 8 sectors at LBA = 0x062f0717 = 103745303

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 17 07 2f f6 08      01:05:17.008  READ DMA
  c8 00 10 f7 62 2f f6 08      01:05:16.992  READ DMA
  c8 00 10 8f 62 2f f6 08      01:05:16.992  READ DMA
  c8 00 08 17 07 2f f6 08      01:05:15.984  READ DMA
  c8 00 10 df 62 2f f6 08      01:05:15.968  READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       10%      4298         122804631

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
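
As a cross-check on the LBA values printed in the error log above, here is a minimal sketch that rebuilds them from the raw register bytes. It assumes plain 28-bit LBA addressing, where the low nibble of DH carries LBA bits 24-27 and bit 6 of DH is the LBA-mode flag. The function name is made up for illustration.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Rebuild a 28-bit LBA from the ATA registers SN, CL, CH and DH."""
    # Assumption: 28-bit LBA addressing. The low nibble of DH holds
    # LBA bits 24-27; the high nibble holds flag bits and is masked off.
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def uses_lba_mode(dh: int) -> bool:
    """Bit 6 of the Device/Head register is set when the command used LBA."""
    return bool(dh & 0x40)


# Register bytes copied from the log, in the order SN CL CH DH.
error_1997 = lba28_from_registers(0x17, 0x07, 0x2F, 0xF6)
error_1996 = lba28_from_registers(0x7F, 0x07, 0x2F, 0xF6)

print(f"error 1997: 0x{error_1997:08x} = {error_1997}")
print(f"error 1996: 0x{error_1996:08x} = {error_1996}")
print("LBA mode flag set:", uses_lba_mode(0xF6))
```

If the decoded numbers match what smartctl prints, the registers are being read as ordinary LBA addresses, which is at least a hint that the access mode itself is not garbled.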

What puzzles me most is this:
Code:
Warning: ATA error count 1997 inconsistent with error log pointer 5

ATA Error Count: 1997

Is such a high number of errors the sign of a really bad drive, or could it result from selecting a wrong disk access mode in the BIOS (e.g. LBA vs. non-LBA)? I find it hard to believe there can be so many errors.
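
One way to put the count in perspective is to check how many different sectors the logged errors actually point at. The sketch below is only a rough illustration: it covers the five entries the drive keeps, so it says nothing certain about the other 1992, and the regular expression is written against the "Error: UNC" lines exactly as they appear in the log above.

```python
import re

# The five "Error: UNC" lines copied from the smartctl error log,
# newest (error 1997) first.
LOG_LINES = """\
40 51 08 17 07 2f f6  Error: UNC 8 sectors at LBA = 0x062f0717 = 103745303
40 51 05 7f 07 2f f6  Error: UNC 5 sectors at LBA = 0x062f077f = 103745407
40 51 05 7f 07 2f f6  Error: UNC 5 sectors at LBA = 0x062f077f = 103745407
40 51 05 7f 07 2f f6  Error: UNC 5 sectors at LBA = 0x062f077f = 103745407
40 51 08 17 07 2f f6  Error: UNC 8 sectors at LBA = 0x062f0717 = 103745303
"""

# Captures the sector count and the decimal LBA of each logged error.
PATTERN = re.compile(r"Error: UNC (\d+) sectors at LBA = 0x[0-9a-f]+ = (\d+)")


def failing_lbas(text: str) -> list[int]:
    """Return the decimal LBA of every UNC error found in the text."""
    return [int(match.group(2)) for match in PATTERN.finditer(text)]


lbas = failing_lbas(LOG_LINES)
distinct = sorted(set(lbas))
print(f"{len(lbas)} logged errors, {len(distinct)} distinct LBAs: {distinct}")
```

In this log the same two addresses come up repeatedly within a few seconds of Powered_Up_Time, so part of the total may be repeated reads of the same bad spots rather than that many separate bad sectors. That is an interpretation, not something smartctl states.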

Thinking back, I remember getting boot error messages in the very first days after I bought the drive, like "CANNOT FIND SYSTEM. PRESS CTRL-ALT-DELETE TO REBOOT." I rebooted and the system loaded fine. That error came back only a couple of times afterwards. Only lately have I seen DriveSeekErrors and many bad sectors reported. I think I've lost about 4GB on the disk: after I reformatted it with ext3, the total size was 76GB instead of the initial 80GB.
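
A quick arithmetic check on the 76GB figure, as a sketch under two assumptions: that the tool reporting the size used binary units (1 GiB = 2^30 bytes), and that the sectors are 512 bytes each. Neither is confirmed by the log. Only the capacity and the pending sector count are taken from the smartctl output.

```python
CAPACITY_BYTES = 81_964_302_336  # "User Capacity" from the smartctl output
PENDING_SECTORS = 14             # raw value of Current_Pending_Sector
SECTOR_BYTES = 512               # assumption: classic 512-byte sectors

decimal_gb = CAPACITY_BYTES / 10**9  # units used on the drive label
binary_gib = CAPACITY_BYTES / 2**30  # units many size-reporting tools use
pending_bytes = PENDING_SECTORS * SECTOR_BYTES

print(f"capacity: {decimal_gb:.2f} GB decimal, {binary_gib:.2f} GiB binary")
print(f"pending sectors cover {pending_bytes} bytes")
```

If the binary figure comes out near 76, the apparent 4GB gap could be mostly a matter of units plus filesystem overhead, not space lost to bad sectors.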

Can it be due to a bad BIOS? Can a bad BIOS damage a disk?
_________________
Gentoo addict: tomorrow I quit, I promise!... Just one more emerge...
1739!