Gentoo Forums
Yet another "is my hard drive failing?"
swoppe
Tux's lil' helper


Joined: 16 Aug 2004
Posts: 104
Location: Sweden

Posted: Thu Oct 23, 2008 1:11 pm    Post subject: Yet another "is my hard drive failing?"

Greetings all.
I recently got my hands on some 1.5TB disks, but all is not well.

After creating a raid5 array and letting it sync overnight (a freaking 12h recovery time...), the computer suddenly hard-locked with no errors in the log. When I started the computer again I was greeted by a non-functioning array and this in my log:
Code:
Oct 22 18:15:57 Angelica end_request: I/O error, dev sdd, sector 2930276992
Oct 22 18:15:57 Angelica md: super_written gets error=-5, uptodate=0
Oct 22 18:15:57 Angelica raid5: Disk failure on sdd, disabling device.
Oct 22 18:15:57 Angelica raid5: Operation continuing on 2 devices.

Okay... brand new disks failing... and then about 1h later:
Code:
Oct 22 19:16:01 Angelica sd 1:0:0:0: [sdb] 2930277168 512-byte hardware sectors (1500302 MB)
Oct 22 19:16:01 Angelica end_request: I/O error, dev sdb, sector 2930276992
Oct 22 19:16:01 Angelica md: super_written gets error=-5, uptodate=0
Oct 22 19:16:01 Angelica raid5: Disk failure on sdb, disabling device.
Oct 22 19:16:01 Angelica raid5: Operation continuing on 1 devices.
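
A side note on the sector number: it is the same in both errors, and it sits just short of the end of the disk. Below is a minimal sketch of the arithmetic, assuming the array was built on the whole disks with the old 0.90 md metadata format (neither is stated in the post). That format places the superblock in the last 64 KiB-aligned block that starts at least 64 KiB before the end of the device. The function and constant names are illustrative only.

```python
# Sketch: where would a 0.90-format md superblock live on this disk?
# Assumptions (not confirmed by the post): whole-disk members, 0.90 metadata.

SECTOR_BYTES = 512
MD_RESERVED_SECTORS = 64 * 1024 // SECTOR_BYTES  # 64 KiB reserved = 128 sectors


def md090_superblock_sector(device_sectors: int) -> int:
    """Round the device size down to a 64 KiB boundary, then step back 64 KiB."""
    return (device_sectors & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS


sdb_sectors = 2930277168    # from the "[sdb] ... 512-byte hardware sectors" line
failed_sector = 2930276992  # from the "end_request: I/O error" lines

print(md090_superblock_sector(sdb_sectors))                   # 2930276992
print(md090_superblock_sector(sdb_sectors) == failed_sector)  # True
print(sdb_sectors - failed_sector)                            # 176
```

If those assumptions hold, the failing request is the superblock write itself, which would fit the `super_written gets error=-5` line. It does not by itself say why the write failed.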


Um... after that I get large amounts of
Code:
Oct 22 19:16:01 Angelica metapage_read_end_io: I/O error
I suspect that is due to the array failing...

At this point I zero the drives and try again, this time with smartd running. I'm trying to provoke it into failing again by copying just about everything I have to it. smartd gives me the following lines:
Code:
Oct 23 10:09:19 Angelica smartd[2865]: Device: /dev/sdb, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 113 to 114
Oct 23 10:09:19 Angelica smartd[2865]: Device: /dev/sdb, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 52 to 51
Oct 23 10:09:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 111 to 114
Oct 23 10:09:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 51 to 53
Oct 23 10:39:18 Angelica smartd[2865]: Device: /dev/sdb, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 114 to 116
Oct 23 10:39:18 Angelica smartd[2865]: Device: /dev/sdb, SMART Prefailure Attribute: 7 Seek_Error_Rate changed from 100 to 60
Oct 23 10:39:18 Angelica smartd[2865]: Device: /dev/sdb, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 51 to 53
Oct 23 10:39:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 114 to 115
Oct 23 10:39:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Prefailure Attribute: 7 Seek_Error_Rate changed from 100 to 60
Oct 23 10:39:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Usage Attribute: 189 High_Fly_Writes changed from 99 to 98
Oct 23 10:39:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 53 to 52
Oct 23 10:39:19 Angelica smartd[2865]: Device: /dev/sdd, SMART Usage Attribute: 189 High_Fly_Writes changed from 55 to 53
Oct 23 11:09:19 Angelica smartd[2865]: Device: /dev/sdb, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 116 to 117
Oct 23 11:09:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 115 to 116
Oct 23 11:09:19 Angelica smartd[2865]: Device: /dev/sdd, SMART Usage Attribute: 189 High_Fly_Writes changed from 53 to 52
Oct 23 11:39:18 Angelica smartd[2865]: Device: /dev/sdb, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 53 to 52
Oct 23 11:39:18 Angelica smartd[2865]: Device: /dev/sdc, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 116 to 117
Oct 23 12:09:18 Angelica smartd[2865]: Device: /dev/sdb, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 117 to 118
Oct 23 12:09:18 Angelica smartd[2865]: Device: /dev/sdc, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 52 to 51
Oct 23 12:09:19 Angelica smartd[2865]: Device: /dev/sdd, SMART Usage Attribute: 189 High_Fly_Writes changed from 52 to 51
Oct 23 12:39:18 Angelica smartd[2865]: Device: /dev/sdb, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 118 to 119
Oct 23 12:39:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 117 to 119
Oct 23 12:39:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 51 to 53
Oct 23 13:09:19 Angelica smartd[2865]: Device: /dev/sdb, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 119 to 112
Oct 23 13:09:19 Angelica smartd[2865]: Device: /dev/sdb, SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 65 to 64
Oct 23 13:09:19 Angelica smartd[2865]: Device: /dev/sdb, SMART Usage Attribute: 194 Temperature_Celsius changed from 35 to 36
Oct 23 13:09:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 119 to 109
Oct 23 13:09:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 65 to 64
Oct 23 13:09:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Usage Attribute: 194 Temperature_Celsius changed from 35 to 36
Oct 23 13:09:19 Angelica smartd[2865]: Device: /dev/sdc, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 53 to 52
Oct 23 13:09:19 Angelica smartd[2865]: Device: /dev/sdd, SMART Usage Attribute: 189 High_Fly_Writes changed from 51 to 49
(There are lots more from yesterday evening and night too.)
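
With this many lines it is easier to see the trend per drive and attribute than to read the log top to bottom. Here is a rough sketch of a summarizer, assuming the smartd line format shown above. The function name `summarize` and the regular expression are made up for this example and are not part of smartmontools.

```python
import re

# Matches smartd "attribute changed" lines in the format shown in the log above.
LINE_RE = re.compile(
    r"Device: (?P<dev>[^,\s]+), SMART (?P<kind>Prefailure|Usage) Attribute: "
    r"(?P<id>\d+) (?P<name>\S+) changed from (?P<old>\d+) to (?P<new>\d+)"
)


def summarize(lines):
    """Return {(device, attribute): [first_value, last_value, change_count]}."""
    summary = {}
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip anything that is not an attribute-change line
        key = (m["dev"], m["name"])
        old, new = int(m["old"]), int(m["new"])
        if key not in summary:
            summary[key] = [old, new, 1]
        else:
            summary[key][1] = new   # remember the most recent value
            summary[key][2] += 1    # count how often the attribute changed
    return summary


# The four High_Fly_Writes lines for sdd, copied from the log above.
sample = [
    "Oct 23 10:39:19 Angelica smartd[2865]: Device: /dev/sdd, SMART Usage Attribute: 189 High_Fly_Writes changed from 55 to 53",
    "Oct 23 11:09:19 Angelica smartd[2865]: Device: /dev/sdd, SMART Usage Attribute: 189 High_Fly_Writes changed from 53 to 52",
    "Oct 23 12:09:19 Angelica smartd[2865]: Device: /dev/sdd, SMART Usage Attribute: 189 High_Fly_Writes changed from 52 to 51",
    "Oct 23 13:09:19 Angelica smartd[2865]: Device: /dev/sdd, SMART Usage Attribute: 189 High_Fly_Writes changed from 51 to 49",
]

for (dev, name), (first, last, count) in summarize(sample).items():
    print(f"{dev} {name}: {first} -> {last} over {count} changes")
```

Feeding it the whole log instead of `sample` would give one line per drive and attribute, which makes the steady drop of High_Fly_Writes on sdd stand out from the values that just wobble up and down.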

So now for my questions to you: Did I just hit the jackpot by getting 2 brand new bad drives? How frequent should read errors be on a totally new drive? Oh, and what is High_Fly_Writes?

/Anders

PS sorry for the code spam.
_________________
Programmink and dyslexia, they do not mix... - Pitr
th0r
Tux's lil' helper


Joined: 06 Feb 2008
Posts: 117
Location: 67 65 6e 74 6f 6f

Posted: Thu Oct 23, 2008 5:35 pm

Error 5 means different things on a lot of systems. You may have insufficient cooling for all those drives, so they get too hot, which is sort of what it looks like. But don't quote me on that. It could just be bad drives. Food for thought.
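
On the `error=-5` in the md lines: the Linux kernel reports failures as negative errno values, and on Linux errno 5 is EIO, the generic I/O error. It says that the request failed, not why. A quick way to check the mapping with nothing but Python's standard library:

```python
import errno
import os

code = -5  # value from "md: super_written gets error=-5"

# Kernel code returns negated errno values, so flip the sign before the lookup.
name = errno.errorcode[-code]

print(name)                 # symbolic name, EIO on Linux
print(os.strerror(-code))   # human-readable text for the same code
```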
swoppe
Tux's lil' helper


Joined: 16 Aug 2004
Posts: 104
Location: Sweden

Posted: Thu Oct 23, 2008 5:41 pm

I don't think it's a cooling problem; there is one fan in front of the drives and one behind.
th0r
Tux's lil' helper


Joined: 06 Feb 2008
Posts: 117
Location: 67 65 6e 74 6f 6f

Posted: Sat Oct 25, 2008 12:45 am

Any progress?
billium
Apprentice


Joined: 22 Mar 2003
Posts: 185

Posted: Sun Oct 26, 2008 7:30 pm

You could always google for: "<make of disk> drive fitness test"
Most manufacturers have bootable drive fitness test disks, even for SATA. I've never tested them on RAID arrays, though.

Billy
snIP3r
l33t


Joined: 21 May 2004
Posts: 853
Location: germany

Posted: Fri Oct 31, 2008 11:04 am

hi!

Hmmm, are you able to test the drives in another computer?
Two brand new disks failing is not very probable, but it is possible :( I would do as billium suggested: get a hard drive test utility from the manufacturer and check the drives with it. Then you can be sure whether the drives are OK or not.

Sorry, I also do not know what "High_Fly_Writes" means...

HTH
snIP3r
_________________
Intel i3-4130T on ASUS P9D-X
Kernel 5.15.88-gentoo SMP
-----------------------------------------------
if your problem is fixed please add something like [solved] to the topic!
All times are GMT