Skyr n00b
Joined: 16 Mar 2005 Posts: 8
Posted: Fri Apr 21, 2006 9:36 am Post subject: Raid rebuild caught in endless loop
Hi all,
I've added a second HD to my server (running kernel-2.6.15-gentoo-r1) and set up a RAID 1 mirror. Everything went fine (setup, boot, adding the second disk, initial rebuild) until, after 4 hours, the new HD produced some I/O errors.
It seems the kernel doesn't consider this failure severe enough to remove the partition from the RAID set, so the recovery mechanism kicks in... until it hits the defective sectors again. Then the recovery starts over... and over. mdstat looks like this:
Code:
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [raid6] [raid10] [faulty]
md0 : active raid1 hda1[0] hdb1[1]
      48064 blocks [2/2] [UU]
md1 : active raid1 hda3[2] hdb3[1]
      38090048 blocks [2/1] [_U]
      [====>................] recovery = 24.9% (9503936/38090048) finish=21.2min speed=22412K/sec
unused devices: <none>
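To check that the percentage really drops back rather than just stalling, the recovery figure can be pulled out of this output and logged over time. A minimal sketch, assuming a POSIX shell with awk; the function name `recovery_pct` is made up for illustration and is not part of mdadm:

```shell
# Print the recovery percentage of one md array, reading
# mdstat-formatted text on stdin. Prints nothing if that array
# is not currently recovering.
# Hypothetical helper, e.g.: recovery_pct md1 < /proc/mdstat
recovery_pct() {
    awk -v dev="$1" '
        # Header line of the array we were asked about
        $1 == dev && $2 == ":" { in_dev = 1; next }
        # Header line of any other array ends our section
        /^md[0-9]+ :/ { in_dev = 0 }
        # Progress line: the field after the bare "=" is e.g. "24.9%"
        in_dev && /recovery =/ {
            for (i = 1; i <= NF; i++)
                if ($i == "=") {
                    sub(/%$/, "", $(i + 1))
                    print $(i + 1)
                    exit
                }
        }
    '
}
```

Run every few seconds as `recovery_pct md1 < /proc/mdstat`, a value that falls back toward 0 would show the recovery restarting instead of finishing.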
So I tried to remove the partition from the RAID set - but without luck:
Code:
# mdadm --manage /dev/md1 --remove /dev/hdb3
mdadm: hot remove failed for /dev/hdb3: Device or resource busy
Using mdadm to mark hdb3 as faulty just triggers the recovery again... Any ideas how to break this endless loop? It puts quite some stress on the working HD and keeps the system at a constant load of 1...