djhomeless Tux's lil' helper
Joined: 31 Dec 2003 Posts: 98 Location: Hanover, NH
Posted: Sat Jun 26, 2010 3:34 pm
Post subject: RAID 0+1 Resync Kernel Panic
About a week ago I noticed that my machine was running pretty slowly.
I checked the RAID array and it was resyncing; later it locked up while still resyncing.
Now when I boot it and leave it alone, it tries to resync again, but a few hours later it throws a kernel panic and locks up.
The array is 4x 500 GB WD drives in RAID 0+1.
After booting it works for a while, but it hard-locks 1-2 hours into the resync.
I'm not sure how to proceed from here.
Is there a way to stop the resync so I can safely copy the data to an external drive?
Or should I remove the bad disk temporarily, copy the data over, and then RMA the drive?
Code:
ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata6.00: BMDMA stat 0x24
ata6.00: cmd 25/00:00:82:64:d6/00:04:00:00:00/e0 tag 0 dma 524288 in
res 51/01:00:0e:68:d6/01:00:00:00:00/e0 Emask 0x1 (device error)
ata6.00: status: { DRDY ERR }
ata6.00: configured for UDMA/133
ata6: EH complete
ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata6.00: BMDMA stat 0x24
ata6.00: cmd 25/00:00:82:60:09/00:04:01:00:00/e0 tag 0 dma 524288 in
res 51/01:00:d1:62:09/01:00:01:00:00/e0 Emask 0x1 (device error)
ata6.00: status: { DRDY ERR }
ata6.00: configured for UDMA/133
ata6: EH complete
ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata6.00: cmd 25/00:00:02:f0:e9/00:04:05:00:00/e0 tag 0 dma 524288 in
res 40/00:00:d1:62:09/01:00:01:00:00/e0 Emask 0x4 (timeout)
ata6.00: status: { DRDY }
ata6: hard resetting link
Clocksource tsc unstable (delta = 4398042602174 ns)
ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata6.00: configured for UDMA/133
ata6: EH complete
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54643 Location: 56N 3W
Posted: Sat Jun 26, 2010 3:44 pm
djhomeless,
Errors like these can come from a bad cable or a bad drive.
Try a new cable first, but do not swap cables around within your RAID set; that may just trash the data on another drive.
If you know which cable is attached to ata6, just unplugging and reconnecting it at both ends may help.
Failing the drive will restore normal operation in degraded mode, so you can copy the data off the RAID set.
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
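To act on the advice above you first need to know which disk sits behind ata6. Below is a minimal sketch, assuming a kernel whose sysfs device paths contain an `ataN` component (older kernels may only show `hostN`, in which case this finds nothing). The function name `find_ata_disk` and the optional second argument are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Print the sdX name(s) whose resolved sysfs path contains /ata<port>/.
# Assumption: /sys/block/sdX symlinks resolve to paths that include the
# libata port name (e.g. .../ata6/host5/...). Not guaranteed on every kernel.
find_ata_disk() {
    port="$1"
    sys_block="${2:-/sys/block}"   # overridable so the sketch can be tried on a fake tree
    for dev in "$sys_block"/sd*; do
        [ -e "$dev" ] || continue            # glob matched nothing
        target=$(readlink -f "$dev")         # follow the symlink into /sys/devices
        case "$target" in
            */ata"$port"/*) basename "$dev" ;;
        esac
    done
}
```

Calling `find_ata_disk 6` would then print something like a single `sdX` name if the layout assumption holds; compare that against the member list in `/proc/mdstat` before touching any cable or drive.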
djhomeless Tux's lil' helper
Joined: 31 Dec 2003 Posts: 98 Location: Hanover, NH
Posted: Sat Jun 26, 2010 4:06 pm
Thanks for the help. I swapped out the hopefully offending cable and am trying to resync again.
If this doesn't work: by failing the drive, do you just mean removing it, or something like this?
mdadm --manage /dev/md3 --fail /dev/sdd3
Thanks again.
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54643 Location: 56N 3W
Posted: Sat Jun 26, 2010 4:36 pm
djhomeless,
Either works.
The idea is to make it unavailable to the RAID array.
If you leave the dead drive connected but remove it from the array, you can run a few tests on it: reading its SMART data with smartmontools, copying it to /dev/null to check for read errors, or running badblocks and allowing it to write to the drive.
You could even run the WD disk test program on it.
Be aware that some of these tests will destroy your data, so don't run them on the wrong drive.
_________________
Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
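The sequence described in this reply could be sketched roughly as follows. `/dev/md3` and `/dev/sdd3` come from the question above; `/dev/sdd` as the whole-disk device is an assumption, and nothing in the thread confirms that sdd is the disk on ata6. The helper names (`run`, `drop_from_array`, `read_only_tests`, `destructive_test`) are hypothetical. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only. Device names are assumptions; verify them before use.
ARRAY="${ARRAY:-/dev/md3}"      # array named in the question
MEMBER="${MEMBER:-/dev/sdd3}"   # member partition named in the question
DISK="${DISK:-/dev/sdd}"        # assumed whole disk behind that partition
DRY_RUN="${DRY_RUN:-1}"         # 1 = print commands, anything else = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Mark the member faulty, then detach it so md stops resyncing onto it.
drop_from_array() {
    run mdadm --manage "$ARRAY" --fail "$MEMBER"
    run mdadm --manage "$ARRAY" --remove "$MEMBER"
}

# Non-destructive checks: SMART report, then a full read of the disk.
read_only_tests() {
    run smartctl -a "$DISK"
    run dd if="$DISK" of=/dev/null bs=1M
}

# DESTRUCTIVE: badblocks -w overwrites every block on the disk.
destructive_test() {
    run badblocks -wsv "$DISK"
}
```

Run `drop_from_array` and `read_only_tests` with the default `DRY_RUN=1` first and read the printed commands. Only set `DRY_RUN=0` (as root) once the device names are confirmed, and leave `destructive_test` until the data has been copied off the array.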