pharoh Tux's lil' helper
Joined: 20 Mar 2004 Posts: 91 Location: Minnesota
Posted: Thu Jan 04, 2007 2:01 pm Post subject: kernel 2.6 software raid1 drive fails all io stops |
We run a 2.6.17 gentoo-sources kernel (no patches beyond what is in the Portage tree), and once every month or two one of the drives fails. My understanding is that the other drive should keep working and that I should replace the failed one at the next convenient reboot. The problem is that all I/O stops and I get these messages:
Code: |
Sep 18 19:52:41 hurricane command: cdb[0]=0x2a: 2a 00 06 47 6a 42 00 00 10 00
Sep 18 19:52:41 hurricane mptscsih: ioc0: target reset: SUCCESS (sc=c731ec80)
Sep 18 19:52:41 hurricane mptscsih: ioc0: Attempting host reset! (sc=c731ec80)
Sep 18 19:52:41 hurricane mptbase: Initiating ioc0 recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: scsi: Device offlined - not ready after error recovery
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane raid1: Disk failure on sdb3, disabling device.
Sep 18 19:52:41 hurricane Operation continuing on 1 devices
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane sd 2:1:1:0: rejecting I/O to offline device
Sep 18 19:52:41 hurricane RAID1 conf printout:
Sep 18 19:52:41 hurricane --- wd:1 rd:2
Sep 18 19:52:41 hurricane disk 0, wo:0, o:1, dev:sda3
Sep 18 19:52:41 hurricane disk 1, wo:1, o:0, dev:sdb3
Sep 18 19:52:41 hurricane RAID1 conf printout:
Sep 18 19:52:41 hurricane --- wd:1 rd:2
Sep 18 19:52:41 hurricane disk 0, wo:0, o:1, dev:sda3
Sep 18 20:37:46 hurricane mptscsih: ioc0: attempting task abort! (sc=f7af5980)
Sep 18 20:37:46 hurricane sd 2:0:0:0:
Sep 18 20:37:46 hurricane command: cdb[0]=0x35: 35 00 00 00 00 00 00 00 00 00
Sep 18 20:37:50 hurricane mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Sep 18 20:37:51 hurricane mptscsih: ioc0: task abort: SUCCESS (sc=f7af5980)
Sep 18 20:37:51 hurricane mptscsih: ioc0: attempting target reset! (sc=f7af5980)
Sep 18 20:37:51 hurricane sd 2:0:0:0:
Sep 18 20:37:51 hurricane command: cdb[0]=0x35: 35 00 00 00 00 00 00 00 00 00
Sep 18 20:37:51 hurricane mptscsih: ioc0: target reset: SUCCESS (sc=f7af5980)
Sep 18 20:37:51 hurricane mptscsih: ioc0: attempting bus reset! (sc=f7af5980)
Sep 18 20:37:51 hurricane sd 2:0:0:0:
Sep 18 20:37:51 hurricane command: cdb[0]=0x35: 35 00 00 00 00 00 00 00 00 00
Sep 18 20:37:54 hurricane mptscsih: ioc0: bus reset: SUCCESS (sc=f7af5980)
Sep 18 20:38:04 hurricane mptscsih: ioc0: Attempting host reset! (sc=f7af5980)
Sep 18 20:38:04 hurricane mptbase: Initiating ioc0 recovery |
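To make the sequence of events in a log like the one above easier to read, here is a minimal sketch of a script that tallies the relevant messages. It assumes the exact message wording shown in this log (other kernels or drivers may phrase them differently), and the function name `summarize` and the result keys are made up for illustration.

```python
import re
from collections import Counter

# Patterns copied from the messages in the log above (assumption: same wording).
OFFLINED = re.compile(r"sd (\d+:\d+:\d+:\d+): scsi: Device offlined")
REJECTED = re.compile(r"sd (\d+:\d+:\d+:\d+): rejecting I/O to offline device")
MD_FAIL = re.compile(r"raid1: Disk failure on (\w+), disabling device")
HOST_RESET = re.compile(r"mptscsih: (ioc\d+): Attempting host reset")


def summarize(lines):
    """Tally SCSI offline/reject events, md member failures, and host resets."""
    summary = {
        "offlined": Counter(),     # SCSI address -> "Device offlined" count
        "rejected": Counter(),     # SCSI address -> "rejecting I/O" count
        "md_failed": [],           # md member devices kicked from the array
        "host_resets": Counter(),  # controller -> host reset attempts
    }
    for line in lines:
        m = OFFLINED.search(line)
        if m:
            summary["offlined"][m.group(1)] += 1
            continue
        m = REJECTED.search(line)
        if m:
            summary["rejected"][m.group(1)] += 1
            continue
        m = MD_FAIL.search(line)
        if m:
            summary["md_failed"].append(m.group(1))
            continue
        m = HOST_RESET.search(line)
        if m:
            summary["host_resets"][m.group(1)] += 1
    return summary
```

It takes any iterable of lines, so an open syslog file can be passed in directly. Run over the log above, it should show every offline and reject event on 2:1:1:0, sdb3 as the failed member, and two host resets on ioc0, the second one triggered by a command to 2:0:0:0, which is the drive that was supposed to stay up.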
Now I am wondering about a couple of things. Is the LSI MPT driver broken in this kernel, or is there something wrong with 2.6.17? The drives ALWAYS check out: when I take them offline, SMART reports no problems and the Seagate disk diagnostics pass. I am at a loss here; any help would be appreciated. Also, should I expect continuous operation after a RAID1 member fails, or is a system lockup like this one normal?
_________________
Linux user number 361815