Gentoo Forums

raid device invalid after boot... ?
FuzzyOne
n00b
Joined: 08 Mar 2004
Posts: 21

PostPosted: Tue Apr 06, 2004 2:57 pm    Post subject: raid device invalid after boot... ?

2 SATA drives as RAID1 on an ASUS P4C800 (Promise controller), Gentoo 2004.0, 2.6.3-gentoo-r1 kernel with RAID support compiled in. Manual creation of the RAID is OK:

handling MD device /dev/md0
analyzing super-block
disk 0: /dev/sdb1, 245111706kB, raid superblock at 245111616kB
disk 1: /dev/sdc1, 245111706kB, raid superblock at 245111616kB

cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdc1[1] sdb1[0]
245111616 blocks [2/2] [UU]
[>....................] resync = 1.4% (3450672/245111616) finish=410.4min speed=9811K/sec

Manual mounting and disk access are OK, but after I reboot the RAID is invalid:

autodetecting RAID... trying md0: invalid
/dev/md0 is not a RAID0 or LINEAR

(which is weird because it's configured as RAID1)

And when I try to access it:
/dev/md0: Invalid argument
mount: /dev/md0: can't read superblock

But I can recreate the RAID manually (mkraid) just fine without any data loss.
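For reference, the manual sequence that brings it back is roughly this (the mount point is just an example):
Code:

# mkraid /dev/md0          (rebuilds md0 from /etc/raidtab)
# cat /proc/mdstat         (md0 shows up active and resyncing)
# mount /dev/md0 /mnt/raid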

/etc/raidtab:
raiddev /dev/md0
raid-level 1
nr-raid-disks 2
chunk-size 32
persistent-superblock 1
device /dev/sdb1
raid-disk 0
device /dev/sdc1
raid-disk 1

What am I missing?

toofastforyahuh
Apprentice
Joined: 18 May 2004
Posts: 172

PostPosted: Sat May 22, 2004 9:50 pm    Post subject: same problem. please help!

I have almost the same problem on an ASUS SK8V, with 2.6.6 vanilla and 2.6.5-gentoo-r1 on amd64. I also have RAID compiled in and can build and manually mount the RAID1 just fine, but bootup fails in the init process.

Specifically, the init script drops me to the error prompt (type password or control D) when raidstart fails.

raidstart does not like /dev/md0 during init. raidstart -a produces "/dev/md0: Invalid argument". I can do a raidstop /dev/md0 OK, but I cannot subsequently do a raidstart -a.


If I mv /etc/raidtab /etc/raidtab.old, I can eventually get through the boot process. At this point, if I log in, mv /etc/raidtab.old /etc/raidtab, and try raidstart -a, it works and I can mount it manually!
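To be concrete, the whole workaround sequence looks like this:
Code:

# mv /etc/raidtab /etc/raidtab.old    (raidstart is skipped, boot completes)
(reboot)
# mv /etc/raidtab.old /etc/raidtab
# raidstart -a                        (now it succeeds!)
# mount /dev/md0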

What's wrong with the init process? Thanks!

toofastforyahuh
Apprentice
Joined: 18 May 2004
Posts: 172

PostPosted: Sun May 23, 2004 6:37 pm

I should also note that this is a RAID 1 array with 2 SATA drives on the Promise controller. The array is *not* defined in BIOS because this is software RAID. Also mkraid needed the --really-force option to work, but I had no other problems during the procedure. I followed the TLDP software RAID howto and also this howto:
[url]http://www.siliconvalleyccie.com/linux-adv/raid.htm[/url]

I even used -c -c during the ext3 formatting to make sure the drives worked OK.

The drives appear as /dev/sdc1 and /dev/sdd1. It's strange because this appears to work manually but the init scripts die on it.

I should note that the log says things like:
md: raidstart(pid 7965) used deprecated START_ARRAY ioctl. This will not be supported beyond 2.6

Basically, the init scripts call raidstart /dev/md0 during boot, but for some reason it fails then, with:
/dev/md0: Invalid argument

However, if I disable RAID during boot, restore my /etc/raidtab and /etc/fstab afterward, and raidstart manually, it works perfectly. What gives?
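Given that deprecation warning, maybe mdadm (which, as far as I understand, doesn't use the old START_ARRAY ioctl) would behave differently. If I'm reading its man page right, the manual equivalent would be something like:
Code:

# emerge mdadm
# mdadm --assemble /dev/md0 /dev/sdc1 /dev/sdd1
# mdadm --detail /dev/md0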

senter
n00b
Joined: 23 May 2004
Posts: 1
Location: Sweden

PostPosted: Mon May 24, 2004 12:22 pm

Have you set the partition type to fd (Linux raid autodetect) on both partitions?
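If not, you can change it with fdisk, roughly like this (repeat for the other disk):
Code:

# fdisk /dev/sdb
Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Command (m for help): w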

toofastforyahuh
Apprentice
Joined: 18 May 2004
Posts: 172

PostPosted: Mon May 24, 2004 7:35 pm

senter wrote:
have you set the partition type to fd, Linux raid autodetect, on both partitions ?


Yes, I followed the procedure here to the letter:
[url]http://www.siliconvalleyccie.com/linux-adv/raid.htm[/url]

/dev/sdc1 and /dev/sdd1 are fd type in fdisk.
Code:

# fdisk -l | grep sd
Disk /dev/sdc: 160.0 GB, 160041885696 bytes
/dev/sdc1               1       19457   156288352   fd  Linux raid autodetect
Disk /dev/sdd: 160.0 GB, 160041885696 bytes
/dev/sdd1               1       19457   156288321   fd  Linux raid autodetect


Also my raidtab has the persistent-superblock set to 1. I did compile raid into the kernel, so I shouldn't (and can't) load any raid1 modules.
My raidtab is exceedingly simple:
Code:

raiddev /dev/md0
     raid-level 1
     nr-raid-disks      2
     nr-spare-disks     0
     chunk-size 4
     persistent-superblock      1
     device     /dev/sdc1
     raid-disk  0
     device     /dev/sdd1
     raid-disk  1


mkraid /dev/md0 did not work. It aborted without any logfile messages. I had to use mkraid --really-force /dev/md0 for it to work.

dmesg output (from a normal, nonraid bootup) shows nothing odd regarding SCSI or RAID personalities:
Code:

md: linear personality registered as nr 1
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
   generic_sse:  7264.000 MB/sec
raid5: using function: generic_sse (7264.000 MB/sec)
raid6: int64x1   1980 MB/s
raid6: int64x2   2992 MB/s
raid6: int64x4   3199 MB/s
raid6: int64x8   2058 MB/s
raid6: sse2x1    1234 MB/s
raid6: sse2x2    2347 MB/s
raid6: sse2x4    3152 MB/s
raid6: using algorithm sse2x4 (3152 MB/s)
md: raid6 personality registered as nr 8
md: multipath personality registered as nr 7
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27

md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.

sata_promise version 0.92

scsi2 : sata_promise
  Vendor: ATA       Model: ST3160023AS       Rev: 1.02
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdc: 312581808 512-byte hdwr sectors (160042 MB)
SCSI device sdc: drive cache: write through
 /dev/scsi/host1/bus0/target0/lun0: p1
Attached scsi disk sdc at scsi1, channel 0, id 0, lun 0
Attached scsi generic sg2 at scsi1, channel 0, id 0, lun 0,  type 0
  Vendor: ATA       Model: ST3160023AS       Rev: 1.02
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdd: 312581808 512-byte hdwr sectors (160042 MB)
SCSI device sdd: drive cache: write through
 /dev/scsi/host2/bus0/target0/lun0: p1
Attached scsi disk sdd at scsi2, channel 0, id 0, lun 0
Attached scsi generic sg3 at scsi2, channel 0, id 0, lun 0,  type 0


At this point--if I restore raidtab and add the fstab line--I can raidstart /dev/md0 and mount it as per my fstab without problem. It works fine.

But if I reboot with this fstab line and raidtab, init fails at this script:
/etc/init.d/checkfs

Code:

                                if [ "${retval}" -gt 0 -a -x /sbin/raidstart ]
                                then
                                        /sbin/raidstart "${i}"
                                        retval=$?
                                fi
........
                                if [ "${retval}" -gt 0 ]
                                then
                                        rc=1
                                        eend ${retval}
                                else
                                        ewend ${retval}
                                fi
                        fi
                done

                # A non-zero return means there were problems.
                if [ "${rc}" -gt 0 ]
                then
                        echo
                        eerror "An error occurred during the RAID startup"
                        eerror "Dropping you to a shell; the system will reboot"                        eerror "when you leave the shell."
                        echo; echo
                        /sbin/sulogin ${CONSOLE}
                        einfo "Unmounting filesystems"
                        /bin/mount -a -o remount,ro &>/dev/null
                        einfo "Rebooting"



And indeed, if I give my root password and try to raidstart /dev/md0 (or even raidstart -a) at the bash prompt I get:
/dev/md0: Invalid argument

So the behavior is strange. I can start the RAID manually, but init scripts cannot do it automatically. This happens both on my 2.6.5-gentoo-r1 (with initrd) and 2.6.6 kernels (without a ramdisk). The system is amd64 on ASUS SK8V motherboard, with both drives on the Promise controller, but there is no conflicting Promise RAID array set up in BIOS.

Also, I am not trying to boot from the RAID. The boot disk is a 3rd hard drive on another controller.

Donny
n00b
Joined: 06 Jun 2004
Posts: 59

PostPosted: Fri Jun 11, 2004 11:18 pm

I have exactly the same "errors" as FuzzyOne and toofastforyahuh, on an ASUS K8V SE Deluxe.
The RAID works fine (I installed it following the TLDP documents),
but after boot it fails on /dev/md0.

When I comment out raiddev /dev/md0 in /etc/raidtab, it boots without a problem, except there's no RAID 8O
So after starting the RAID manually again, everything works fine.

Does anyone know what I'm missing or doing wrong?
genkernel -> 2.6.3-gentoo-r2
2004.0
raidtab is the same as FuzzyOne's
I get the error (type password or control D) prompt when raidstart fails on boot

Code:

fdisk -l

Disk /dev/hda: 41.1 GB, 41110142976 bytes
16 heads, 63 sectors/track, 79656 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1        63     31720+  83  Linux
/dev/hda2            64      1056    500472   82  Linux swap
/dev/hda3          1057     79656  39614400   83  Linux

Disk /dev/sda: 122.9 GB, 122942324736 bytes
255 heads, 63 sectors/track, 14946 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1             1     14946 120053713+  fd  Linux raid autodetect

Disk /dev/sdb: 122.9 GB, 122942324736 bytes
255 heads, 63 sectors/track, 14946 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdb1             1     14946 120053713+  fd  Linux raid autodetect

Code:
cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid5] [multipath] [raid6]
md0 : active raid0 sda1[0] sdb1[1]
      240107264 blocks 32k chunks

unused devices: <none>

Code:
cat /etc/raidtab
raiddev /dev/md0
raid-level 0
nr-raid-disks 2
persistent-superblock 1
chunk-size 32
device /dev/sda1
raid-disk 0
device /dev/sdb1
raid-disk 1

Code:
/dev/hda1               /boot           ext3            noauto,noatime          1 2
/dev/hda3               /               reiserfs        noatime                 0 1
/dev/hda2               none            swap            sw                      0 0
/dev/cdroms/cdrom0      /mnt/cdrom      auto            noauto,ro,user          0 0

none                    /proc           proc            defaults                0 0
none                    /dev/shm        tmpfs           defaults                0 0

/dev/md0                /raid           reiserfs        noatime                 0 1

Thanks for reading all this.

toofastforyahuh
Apprentice
Joined: 18 May 2004
Posts: 172

PostPosted: Sat Jun 12, 2004 10:04 pm

I still have this problem.
Did anything about software raid change in 2.6.x? Do the raidtools need updating?
Is it OK to have /dev/md0 instead of /dev/md/0?

I'm really at a loss here.
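One thing I haven't tried yet is dumping the on-disk superblocks with mdadm to see whether they look sane; if I'm reading the man page right, that would be:
Code:

# mdadm --examine /dev/sdc1     (prints the md superblock on that partition)
# mdadm --examine /dev/sdd1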

Donny
n00b
Joined: 06 Jun 2004
Posts: 59

PostPosted: Sun Jun 13, 2004 12:26 am

I used mdadm and not the raidtools, but got the same error.
My feeling is that it has something to do with /dev/md0, but what exactly, I really have no idea. I am lost.

Donny
n00b
Joined: 06 Jun 2004
Posts: 59

PostPosted: Mon Jun 14, 2004 10:32 am

Does anyone have an idea in which direction to search to solve this problem?

toofastforyahuh
Apprentice
Joined: 18 May 2004
Posts: 172

PostPosted: Mon Jun 21, 2004 9:26 am    Post subject: I think I have a fix

I think I fixed it!

First, let me preface this by saying that I find the Gentoo init process confusing compared to ye olde /etc/rc.d/rc3.d, and when combined with the black magic of devfs/sysfs/every_other_fs I just get confused to no end. I have no idea when devices are available, when/where they are called, and when/where they get symlinked, etc. And therein lies the problem.

It appears the scripts were trying to initialize the RAID before my SCSI devices were even set up!

From dmesg:
Code:

st: Version 20040403, fixed bufsize 32768, s/g segs 256
i2c /dev entries driver
md: raidstart(pid 4751) used deprecated START_ARRAY ioctl. This will not be supported beyond 2.6
md: could not lock unknown-block(291,1456).
md: could not import unknown-block(291,1456)!
md: autostart unknown-block(5,74672) failed!


When I logged in from where the init scripts died (namely, the checkfs script), I saw that /dev/scsi was totally empty. It was like Gentoo was doing things out of order, trying to set up a SCSI software RAID before it had set up the SCSI devices.

Given that it was 1 AM here, I didn't want to hunt down and rewrite any of the init scripts to reorder them myself. I just found a quick hack, which is to use /etc/modules.autoload.d/kernel-2.6 to load the sata_promise module (I am using the onboard Promise chip for my software RAID).

Again, here is where I don't understand the Gentoo init procedure. Somehow sata_promise and a whole ton of other modules (USB, FireWire, etc.) get loaded automagically, and I only had a few entries in my modules.autoload, none of which are disk related. But putting sata_promise in that file forced the module to be loaded before raidstart gets called in checkfs. That appears to set up the /dev/scsi/... devices.
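Concretely, the quick hack is just one line appended to the autoload file:
Code:

# echo sata_promise >> /etc/modules.autoload.d/kernel-2.6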

BUT that isn't enough. Although /dev/scsi is populated with the various devices, for some oddball reason the symlinks to /dev/sdc1 and /dev/sdd1 did not exist during init either. My other 2 SCSI devices (which are USB flash card readers) did show up as /dev/sda1 and /dev/sdb1.

So my next step was to replace /dev/sdc1 and /dev/sdd1 in my /etc/raidtab with the actual SCSI devices. And it seems to work fine.

My current raidtab is now:
Code:

raiddev /dev/md0
     raid-level   1
     nr-raid-disks   2
     nr-spare-disks   0
     chunk-size   32
     persistent-superblock   1
#     device   /dev/sdc1
     device   /dev/scsi/host0/bus0/target0/lun0/part1
     raid-disk   0
#     device   /dev/sdd1
     device   /dev/scsi/host1/bus0/target0/lun0/part1
     raid-disk   1


Hope this helps.

Donny
n00b
Joined: 06 Jun 2004
Posts: 59

PostPosted: Sun Nov 21, 2004 3:25 am

Glad you fixed it toofastforyahuh :D

I fixed mine too, by loading the raid0 and md modules on boot.
Hope it helps someone.
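In case it helps, I believe all it takes is appending them to the same autoload file toofastforyahuh mentioned:
Code:

# echo md >> /etc/modules.autoload.d/kernel-2.6
# echo raid0 >> /etc/modules.autoload.d/kernel-2.6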

toofastforyahuh
Apprentice
Joined: 18 May 2004
Posts: 172

PostPosted: Mon Nov 22, 2004 8:27 am

Somehow, not long after my last post in this thread, I was able to get the /dev/sdb1 and /dev/sdc1 symlinks to work. I honestly don't remember how. Probably some more udev nonsense.

Then for months my software RAID worked great--or so I thought. For whatever reason, the second drive was not being added by the kernel at boot. I still don't know how that happened either, since both drives were added fine when the RAID was first set up.

The solution in this case was to add the missing drive again with raidhotadd, and now the RAID appears to work correctly with both drives, even after a reboot. (They are brand new drives and no, it was not a drive failure.)
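From memory, the command was along these lines (the device being whichever half went missing):
Code:

# raidhotadd /dev/md0 /dev/sdc1
# cat /proc/mdstat              (watch the mirror resync)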

It's amazing how complicated this seemingly simple task of setting up md is. Just emerge the raidtools (or mdadm), set up the /etc/raidtab, load the kernel modules, make the RAID, and it should just work.....but there's always some gremlin throwing a wrench into the works.

blais
n00b
Joined: 30 Jul 2003
Posts: 57

PostPosted: Fri Mar 18, 2005 1:48 pm    Post subject: same problem, very different params and HW

hi

i have the very same problem using a P4P800 with two 120GB IDE drives and RAID0. I won't bother repeating my logfiles; they're the same as toofastforyahuh's.

this is then probably not a SCSI issue.

I don't have a solution for it. When I boot from the raid (/dev/md0 as /), it stops with an error, and I'm left with / mounted read-only and the maintenance prompt on the console.

Could it be that I have to let the drives "resync" before mounting?

blais
n00b
Joined: 30 Jul 2003
Posts: 57

PostPosted: Fri Mar 18, 2005 1:50 pm    Post subject: i meant raid-1

oops, i meant RAID-1 in my message above.
I used to have these in RAID-0 and it worked fine.
The problem started when I switched to RAID-1 (and yes, i did recreate the fs).

MagicITX
n00b
Joined: 08 Feb 2005
Posts: 6

PostPosted: Sat Mar 19, 2005 10:48 pm

Can anyone help with this? My setup is different but the problem is the same. I have:

/dev/md1 RAID-1 with 2 drives
/dev/md2 RAID-1 with 2 drives
/dev/md0 RAID-0 with /dev/md1 and /dev/md2

md1 and md2 start fine but init fails at md0.

From the recovery shell I see that dmesg ends with "md: md0 stopped". If I run 'mdadm -As /dev/md0' it tells me 'mdadm: no devices found for /dev/md0'.

I can recreate the array with:

mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/md1 /dev/md2

The message I get here is a little weird. It says:

mdadm: /dev/md1 appears to be part of a raid array:
level=0 devices=2 ctime=Sat Mar 19 05:56:33 2005
mdadm: /dev/md2 appears to be part of a raid array:
level=0 devices=2 ctime=Sat Mar 19 05:56:33 2005
Continue creating array?

I answer "y"es and it says 'mdadm: array /dev/md0 started.' What makes this weird is the md1 and md2 arrays are level 1, not level 0 as reported.

At this point, if I cat /proc/mdstat, I get:
Code:

Personalities : [raid0] [raid1]
md0 : active raid0 md2[1] md1[0]
      586066944 blocks 64k chunks
md2 : active raid1 sdc1[0] sdd1[1]
      293033536 blocks [2/2] [UU]
md1 : active raid1 sda1[0] sdb1[1]
      293033536 blocks [2/2] [UU]

My /etc/mdadm.conf file has:
Code:

DEVICE /dev/sda1
DEVICE /dev/sdb1
DEVICE /dev/sdc1
DEVICE /dev/sdd1
ARRAY /dev/md1 devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md2 devices=/dev/sdc1,/dev/sdd1
ARRAY /dev/md0 devices=/dev/md1,/dev/md2

So like the other posts in this thread, I have a RAID array that won't start during init or afterward but can be recreated from the recovery shell.

Any ideas?

MagicITX
n00b
Joined: 08 Feb 2005
Posts: 6

PostPosted: Sun Mar 20, 2005 7:09 pm

Some other posts suggested this could be due to udev. That isn't the problem in my case. I get the same error with devfs.

Phk
Guru
Joined: 02 Feb 2004
Posts: 428
Location: [undef], Lisbon, Portugal, Europe, Earth, SolarSystem, MilkyWay, 23Q Radius, Forward Time

PostPosted: Sun Mar 20, 2005 7:23 pm

MagicITX wrote:
Can anyone help with this?


I'm not much help, since i'm still troubled with my own setup....

But forget that kind of approach; try this one:
:arrow: (which is the way i've installed my system)
and this one:
:arrow: [HOWTO] HPT, Promise, Medley, Intel, Nvidia RAID Dualboot

The second is very important, since it tells you to use the "gen2dmraid" BOOT cd, which automatically mounts your raided partitions.

Good luck!

MagicITX
n00b
Joined: 08 Feb 2005
Posts: 6

PostPosted: Sun Mar 20, 2005 8:48 pm

Thanks for the feedback. My problem was a little different but I've been able to solve it.

The problem was in /etc/mdadm.conf. When mdadm starts up an array it looks through mdadm.conf for the devices to use. Previously my file contained this:

Code:
DEVICE /dev/sda1
DEVICE /dev/sdb1
DEVICE /dev/sdc1
DEVICE /dev/sdd1
ARRAY /dev/md1 devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md2 devices=/dev/sdc1,/dev/sdd1
ARRAY /dev/md0 devices=/dev/md1,/dev/md2


The RAID-1 arrays md1 and md2 would start because the devices they use are defined with DEVICE entries. However, when it got to md0, it wouldn't start, because md1 and md2 were not listed as DEVICE entries. I changed mdadm.conf to this, and it now works:

Code:
DEVICE /dev/sda1
DEVICE /dev/sdb1
DEVICE /dev/sdc1
DEVICE /dev/sdd1
DEVICE /dev/md1
DEVICE /dev/md2
ARRAY /dev/md1 devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md2 devices=/dev/sdc1,/dev/sdd1
ARRAY /dev/md0 devices=/dev/md1,/dev/md2


If you think about it, it makes sense: to build an array out of arrays, the middle arrays need ARRAY entries to define them, and then DEVICE entries so they can be used as components of further arrays.
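As an aside, I believe newer mdadm also accepts a catch-all DEVICE keyword that scans /proc/partitions (which should include the md devices themselves), so the file could probably shrink to the following. I haven't tested this variant here, though:
Code:

DEVICE partitions
ARRAY /dev/md1 devices=/dev/sda1,/dev/sdb1
ARRAY /dev/md2 devices=/dev/sdc1,/dev/sdd1
ARRAY /dev/md0 devices=/dev/md1,/dev/md2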

Phk
Guru
Joined: 02 Feb 2004
Posts: 428
Location: [undef], Lisbon, Portugal, Europe, Earth, SolarSystem, MilkyWay, 23Q Radius, Forward Time

PostPosted: Sun Mar 20, 2005 9:53 pm

Yeah, it makes sense ;)

Glad you worked it out!! I'm still in a mess....

If you want to know/help, visit my issues page..... ----> HERE

I'm posting the new problem in 30 minutes or so.

See us!