Gentoo Forums
hda-errors: fs or hardware problem ?
Saillord
n00b


Joined: 09 Mar 2004
Posts: 35
Location: Bremen/Germany

Posted: Wed May 03, 2006 9:55 am    Post subject: hda-errors: fs or hardware problem?

Hello you out there,

Over the last few days I have been getting more and more problems with my hda. When I run certain programs (e.g. creating an image for burning, p2p, ...), my hard disk spins up and the log shows errors:

e.g.:
Code:
Apr 26 22:55:59 swordfish hda: DMA disabled
Apr 26 22:56:00 swordfish ide0: reset: success
Apr 26 22:56:03 swordfish hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
Apr 26 22:56:03 swordfish hda: task_in_intr: error=0x01 { AddrMarkNotFound }, LBAsect=38018240, sector=38018240
Apr 26 22:56:03 swordfish ide: failed opcode was: unknown
Apr 26 22:56:07 swordfish hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
Apr 26 22:56:07 swordfish hda: task_in_intr: error=0x40 { UncorrectableError }, LBAsect=38018240, sector=38018240
Apr 26 22:56:07 swordfish ide: failed opcode was: unknown
Apr 26 22:56:07 swordfish end_request: I/O error, dev hda, sector 38018240
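
The hex values in status= and error= are bit fields, and the words in braces are just the names of the bits that are set. A minimal Python sketch of that decoding follows. It assumes the classic ATA task-file bit layout; only the names that actually appear in the log above are taken from it, and the labels for the other bits are my own assumption.

```python
# Bit masks for the ATA status and error registers.
# Names seen in the log above: DriveReady, SeekComplete, DataRequest, Error,
# AddrMarkNotFound, UncorrectableError. All other labels are assumptions.
STATUS_BITS = {
    0x80: "Busy",
    0x40: "DriveReady",
    0x20: "DeviceFault",
    0x10: "SeekComplete",
    0x08: "DataRequest",
    0x04: "CorrectedError",
    0x02: "Index",
    0x01: "Error",
}

ERROR_BITS = {
    0x80: "BadSector",
    0x40: "UncorrectableError",
    0x20: "MediaChanged",
    0x10: "SectorIdNotFound",
    0x08: "MediaChangeRequested",
    0x04: "DriveStatusError",
    0x02: "TrackZeroNotFound",
    0x01: "AddrMarkNotFound",
}


def decode(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table.items() if value & mask]


print(decode(0x59, STATUS_BITS))  # ['DriveReady', 'SeekComplete', 'DataRequest', 'Error']
print(decode(0x01, ERROR_BITS))   # ['AddrMarkNotFound']
print(decode(0x40, ERROR_BITS))   # ['UncorrectableError']
```

So both failed reads hit the same sector (38018240): the drive first could not find the sector's address mark, and on the retry reported the data as uncorrectable.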


That doesn't sound good to me ;(
I already switched off or lowered my hdparm settings. Then I emerged smartmontools, but the output doesn't tell me enough (why did the test stop?). Before buying a new hard drive I want to make sure that it is the hardware causing the trouble (and not, e.g., the filesystem, reiserfs 3.6). Is there another way to check the drive (badblocks/reiserfsck from Knoppix?).
I am attaching the (very long) output from smartmontools below. Thanks a lot for your wise and knowledgeable help *

Rapha

swordfish saillord # smartctl -H /dev/hda
Code:
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED



swordfish saillord # smartctl -A /dev/hda
Code:
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   168   168   033    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0012   091   091   000    Old_age   Always       -       14749
  5 Reallocated_Sector_Ct   0x0033   078   078   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   085   085   000    Old_age   Always       -       6576
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1788
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       113
193 Load_Cycle_Count        0x0012   081   081   000    Old_age   Always       -       192528
194 Temperature_Celsius     0x0002   152   152   000    Old_age   Always       -       36 (Lifetime Min/Max 10/54)
196 Reallocated_Event_Count 0x0032   069   069   000    Old_age   Always       -       822
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
210 Unknown_Attribute       0x0023   100   100   001    Pre-fail  Always       -       0



and finally the test:
swordfish saillord # smartctl -l selftest /dev/hda
Code:
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       60%      6576         38018008



and the corresponding error log:
swordfish saillord # smartctl -l error /dev/hda
Code:
smartctl version 5.36 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
ATA Error Count: 519 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 519 occurred at disk power-on lifetime: 6573 hours (273 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 20 c0 1c 44 e2  Error: UNC 32 sectors at LBA = 0x02441cc0 = 38018240

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 20 c0 1c 44 e2 00      03:30:35.625  READ DMA
  10 00 3f 00 00 00 e0 00      03:30:35.625  RECALIBRATE [OBS-4]
  c8 00 20 c0 1c 44 e2 00      03:30:33.000  READ DMA
  c8 00 20 c0 1c 44 e2 00      03:30:30.375  READ DMA
  c8 00 28 b8 1c 44 e2 00      03:30:27.675  READ DMA

Error 518 occurred at disk power-on lifetime: 6573 hours (273 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  01 51 20 c0 1c 44 e2  Error: AMNF 32 sectors at LBA = 0x02441cc0 = 38018240

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 20 c0 1c 44 e2 00      03:30:33.000  READ DMA
  c8 00 20 c0 1c 44 e2 00      03:30:30.375  READ DMA
  c8 00 28 b8 1c 44 e2 00      03:30:27.675  READ DMA
  c8 00 30 b0 1c 44 e2 00      03:30:25.050  READ DMA
  c8 00 38 a8 1c 44 e2 00      03:30:22.425  READ DMA

Error 517 occurred at disk power-on lifetime: 6573 hours (273 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  01 51 20 c0 1c 44 e2  Error: AMNF 32 sectors at LBA = 0x02441cc0 = 38018240

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 20 c0 1c 44 e2 00      03:30:30.375  READ DMA
  c8 00 28 b8 1c 44 e2 00      03:30:27.675  READ DMA
  c8 00 30 b0 1c 44 e2 00      03:30:25.050  READ DMA
  c8 00 38 a8 1c 44 e2 00      03:30:22.425  READ DMA
  c8 00 40 a0 1c 44 e2 00      03:30:19.800  READ DMA

Error 516 occurred at disk power-on lifetime: 6573 hours (273 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 21 bf 1c 44 e2  Error: UNC 33 sectors at LBA = 0x02441cbf = 38018239

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 28 b8 1c 44 e2 00      03:30:27.675  READ DMA
  c8 00 30 b0 1c 44 e2 00      03:30:25.050  READ DMA
  c8 00 38 a8 1c 44 e2 00      03:30:22.425  READ DMA
  c8 00 40 a0 1c 44 e2 00      03:30:19.800  READ DMA
  c8 00 40 e0 1a 44 e2 00      03:30:19.800  READ DMA

Error 515 occurred at disk power-on lifetime: 6573 hours (273 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 21 bf 1c 44 e2  Error: UNC 33 sectors at LBA = 0x02441cbf = 38018239

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 30 b0 1c 44 e2 00      03:30:25.050  READ DMA
  c8 00 38 a8 1c 44 e2 00      03:30:22.425  READ DMA
  c8 00 40 a0 1c 44 e2 00      03:30:19.800  READ DMA
  c8 00 40 e0 1a 44 e2 00      03:30:19.800  READ DMA
  c8 00 40 e0 19 44 e2 00      03:30:19.800  READ DMA





Thanks again - better bad news about the drive than ... * R
_________________
There is no favorable wind for one who does not know where he is going
bunder
Bodhisattva


Joined: 10 Apr 2004
Posts: 5947

Posted: Wed May 03, 2006 10:14 am    Post subject:

Code:
Error: UNC 33 sectors at LBA = 0x02441cbf = 38018239


I would say it's got a bad block (or two).
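
For anyone wondering where that LBA comes from: smartctl derives it from the SN, CL, CH and DH register columns of the error log. A minimal Python sketch, assuming 28-bit LBA addressing (the low nibble of DH holds LBA bits 24-27) and 512-byte sectors; the function name is just for illustration.

```python
SECTOR_SIZE = 512  # bytes per sector (assumption, typical for an IDE disk of this era)


def lba28_from_registers(sn, cl, ch, dh):
    """Combine the ATA task-file registers into a 28-bit LBA.

    sn = Sector Number (LBA bits 0-7), cl = Cylinder Low (bits 8-15),
    ch = Cylinder High (bits 16-23), low nibble of dh = bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Registers from "Error 519" in the smartctl error log: c0 1c 44 e2
lba = lba28_from_registers(0xC0, 0x1C, 0x44, 0xE2)
print(hex(lba), lba)      # 0x2441cc0 38018240
print(lba * SECTOR_SIZE)  # 19465338880 (byte offset from the start of the disk)

# Registers from "Error 516": bf 1c 44 e2
print(lba28_from_registers(0xBF, 0x1C, 0x44, 0xE2))  # 38018239

# Distance to the first error reported by the extended self-test (38018008)
print(lba - 38018008)     # 232
```

The kernel log, the SMART error log and the self-test all point to the same small region of the disk, within a few hundred sectors of each other, which fits a localized media defect better than a filesystem problem.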
_________________
Neddyseagoon wrote:
The problem with leaving is that you can only do it once and it reduces your influence.

banned from #gentoo since sept 2017
ectospasm
l33t


Joined: 19 Feb 2003
Posts: 711
Location: Mobile, AL, USA

Posted: Wed May 03, 2006 10:29 am    Post subject:

Bad sectors are probably it, but I had a lot of hard drive problems simply due to overheating. That can happen if you've got two drives stacked on top of one another in the case. The heat from the bottom one caused nasty problems in the top one, which was my primary drive (hda).

It's a shot in the dark, but make sure you have good airflow over your hard drives.
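
The smartctl -A table posted above already contains the numbers relevant to both theories. Here is a rough Python sketch that pulls them out; it assumes the ten-column layout shown in this thread (other smartctl versions or drives may print it differently), and the function name and the selection of rows are only for illustration.

```python
import re

# Four rows copied from the smartctl -A output earlier in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   078   078   005    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0002   152   152   000    Old_age   Always       -       36 (Lifetime Min/Max 10/54)
196 Reallocated_Event_Count 0x0032   069   069   000    Old_age   Always       -       822
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       3
"""


def parse_attributes(text):
    """Parse attribute rows into {name: {"value", "worst", "thresh", "raw"}}."""
    rows = {}
    for line in text.splitlines():
        # At most 10 fields: the raw value may itself contain spaces.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # skip headers, banners and blank lines
        rows[fields[1]] = {
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": fields[9].strip(),
        }
    return rows


attrs = parse_attributes(SAMPLE)

pending = int(attrs["Current_Pending_Sector"]["raw"].split()[0])
realloc_events = int(attrs["Reallocated_Event_Count"]["raw"].split()[0])
temp_raw = attrs["Temperature_Celsius"]["raw"]
temp_now = int(temp_raw.split()[0])
temp_max = int(re.search(r"Min/Max (\d+)/(\d+)", temp_raw).group(2))

print(pending, realloc_events, temp_now, temp_max)  # 3 822 36 54
```

Read that way, the drive reports 3 sectors pending reallocation and 822 reallocation events, with a current temperature of 36 and a lifetime maximum of 54 degrees Celsius. Whether 54 is hot enough to have caused damage depends on the drive's specification, which is not given in this thread.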
_________________
Join the adopt an unanswered post initiative today
Join the EFF!
Join the Drug Policy Alliance!