Gentoo Forums :: Kernel & Hardware

Raid5 unusable after reboot (help)

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Wed Jan 17, 2007 3:45 am    Post subject: Raid5 unusable after reboot (help)

Ok, here goes.
I've been messing with this for two days straight now and I'm at the end of my rope.

I have 4x SATA drives on a sil3114 PCI add-on card that I've set up in a raid5 group on /dev/md2 using mdadm.

I've done this four or five times now and it always ends up the same: I reboot to test and make sure the raid group is OK, and the kernel (dmesg) complains the superblock is bad.

Here is the scan after this last setup.
(My other two raid groups are raid1 and part of an LVM group; this raid5 is not.)

Code:
mustang etc # mdadm --detail --scan
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=1ee59ab3:7a3f2a33:65f30751:42108948
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=af96a87c:e43f81a0:1446f685:c5adeb47
ARRAY /dev/md2 level=raid5 num-devices=4 UUID=eb8b16ac:cd5f09be:b0a196d0:33fa32e7
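
(For reference, I created the array with roughly this command; I'm going from memory, so treat it as an approximation. The 0.90 metadata and 64K chunk shown in --detail are just the mdadm defaults.)

Code:
# approximate create command used for md2
mdadm --create /dev/md2 --level=5 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1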


and here is the current status (pre-reboot):

Code:
mustang etc # mdadm --detail /dev/md2
/dev/md2:
        Version : 00.90.03
  Creation Time : Tue Jan 16 14:52:59 2007
     Raid Level : raid5
     Array Size : 234444288 (223.58 GiB 240.07 GB)
    Device Size : 78148096 (74.53 GiB 80.02 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Jan 16 21:08:48 2007
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : eb8b16ac:cd5f09be:b0a196d0:33fa32e7
         Events : 0.2

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1


and here is my current mdadm.conf file (in /etc)

Code:

# paste this inside
DEVICE          /dev/hde*
DEVICE          /dev/hdg*
ARRAY           /dev/md0 devices=/dev/hde1,/dev/hdg1


# paste this inside
DEVICE          /dev/hdb*
DEVICE          /dev/hdc*
ARRAY           /dev/md1 devices=/dev/hdb1,/dev/hdc1

# paste this inside
DEVICE         /dev/sda*
DEVICE         /dev/sdb*
DEVICE         /dev/sdc*
DEVICE         /dev/sdd*
ARRAY          /dev/md2 devices=/dev/sda1,/dev/sdb1,/dev/sdc1,/dev/sdd1


I've double-checked all four drives: their partitions are set to type fd (Linux raid autodetect), the raid group has been formatted with reiserfs, and I've mounted it under /temp to verify it can read/write, which it does just fine.
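
(Roughly the checks I ran, for the record:)

Code:
fdisk -l /dev/sda            # partition type on sda1 shows "fd  Linux raid autodetect"
mkreiserfs /dev/md2          # filesystem created on the array
mount /dev/md2 /temp
touch /temp/testfile && ls -l /temp && umount /temp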

I've also checked over my kernel, but my thinking is: if it can get this far, let me set it up, mount it, and read/write data, wouldn't the kernel be OK?

So if you have any ideas as to what to check or where to look, I'm listening.

Regards,
Muddy

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Wed Jan 17, 2007 4:12 pm

Tried it again, this time with no mdadm.conf file to see what would happen; same thing.

Here is the output from dmesg:

Quote:

Linux version 2.6.18-gentoo-r6 (root@mustang) (gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)) #2 SMP Wed Jan 10 01:13:37 EST 2007
(removed extra output)
libata version 2.00 loaded.
sata_sil 0000:00:13.0: version 2.0
ata1: SATA max UDMA/100 cmd 0xE0836080 ctl 0xE083608A bmdma 0xE0836000 irq 11
ata2: SATA max UDMA/100 cmd 0xE08360C0 ctl 0xE08360CA bmdma 0xE0836008 irq 11
ata3: SATA max UDMA/100 cmd 0xE0836280 ctl 0xE083628A bmdma 0xE0836200 irq 11
ata4: SATA max UDMA/100 cmd 0xE08362C0 ctl 0xE08362CA bmdma 0xE0836208 irq 11
scsi0 : sata_sil
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/100
scsi1 : sata_sil
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata2.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 0/32)
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/100
scsi2 : sata_sil
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata3.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48 NCQ (depth 0/32)
ata3.00: ata3: dev 0 multi count 16
ata3.00: configured for UDMA/100
scsi3 : sata_sil
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata4.00: ATA-6, max UDMA/133, 156301488 sectors: LBA48
ata4.00: ata4: dev 0 multi count 16
ata4.00: configured for UDMA/100
Vendor: ATA Model: ST380817AS Rev: 3.42
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: ST380817AS Rev: 3.42
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: ST380817AS Rev: 3.42
Type: Direct-Access ANSI SCSI revision: 05
Vendor: ATA Model: ST380013AS Rev: 3.18
Type: Direct-Access ANSI SCSI revision: 05
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 156301488 512-byte hdwr sectors (80026 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
sda: sda1
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
sdb: sdb1
sd 1:0:0:0: Attached scsi disk sdb
SCSI device sdc: 156301488 512-byte hdwr sectors (80026 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
SCSI device sdc: 156301488 512-byte hdwr sectors (80026 MB)
sdc: Write Protect is off
sdc: Mode Sense: 00 3a 00 00
SCSI device sdc: drive cache: write back
sdc: sdc1
sd 2:0:0:0: Attached scsi disk sdc
SCSI device sdd: 156301488 512-byte hdwr sectors (80026 MB)
sdd: Write Protect is off
sdd: Mode Sense: 00 3a 00 00
SCSI device sdd: drive cache: write back
SCSI device sdd: 156301488 512-byte hdwr sectors (80026 MB)
sdd: Write Protect is off
sdd: Mode Sense: 00 3a 00 00
SCSI device sdd: drive cache: write back
sdd: sdd1
sd 3:0:0:0: Attached scsi disk sdd
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 1:0:0:0: Attached scsi generic sg1 type 0
sd 2:0:0:0: Attached scsi generic sg2 type 0
sd 3:0:0:0: Attached scsi generic sg3 type 0
i2c /dev entries driver
(removed extra output)
md: raid5 personality registered for level 5
raid5: measuring checksumming speed
8regs : 454.400 MB/sec
8regs_prefetch: 405.200 MB/sec
32regs : 224.400 MB/sec
32regs_prefetch: 222.800 MB/sec
pII_mmx : 628.400 MB/sec
p5_mmx : 651.200 MB/sec
raid5: using function: p5_mmx (651.200 MB/sec)
md: multipath personality registered for level -4
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
device-mapper: ioctl: 4.7.0-ioctl (2006-06-24) initialised: dm-devel@redhat.com
md: Autodetecting RAID arrays.
Time: tsc clocksource has been installed.
md: invalid superblock checksum on sdb1
md: sdb1 has invalid sb, not importing!
md: invalid superblock checksum on sdd1
md: sdd1 has invalid sb, not importing!
md: autorun ...
md: considering sdc1 ...
md: adding sdc1 ...
md: hdg1 has different UUID to sdc1
md: hde1 has different UUID to sdc1
md: hdc1 has different UUID to sdc1
md: hdb1 has different UUID to sdc1
md: created md2
md: bind<sdc1>
md: running: <sdc1>
raid5: device sdc1 operational as raid disk 2
raid5: not enough operational devices for md2 (3/4 failed)
RAID5 conf printout:
--- rd:4 wd:1 fd:3
disk 2, o:1, dev:sdc1
raid5: failed to run raid set md2
md: pers->run() failed ...
md: do_md_run() returned -5
md: md2 stopped.
md: unbind<sdc1>
md: export_rdev(sdc1)
md: considering hdg1 ...
md: adding hdg1 ...
md: adding hde1 ...
md: hdc1 has different UUID to hdg1
md: hdb1 has different UUID to hdg1
md: created md0
md: bind<hde1>
md: bind<hdg1>
md: running: <hdg1><hde1>
raid1: raid set md0 active with 2 out of 2 mirrors
md: considering hdc1 ...
md: adding hdc1 ...
md: adding hdb1 ...
md: created md1
md: bind<hdb1>
md: bind<hdc1>
md: running: <hdc1><hdb1>
raid1: raid set md1 active with 2 out of 2 mirrors
md: ... autorun DONE.
ReiserFS: hda3: found reiserfs format "3.6" with standard journal
ReiserFS: hda3: using ordered data mode
ReiserFS: hda3: journal params: device hda3, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hda3: checking transaction log (hda3)
ReiserFS: hda3: Using r5 hash to sort names
VFS: Mounted root (reiserfs filesystem) readonly.
Freeing unused kernel memory: 200k freed
Adding 514072k swap on /dev/hda2. Priority:-1 extents:1 across:514072k
ReiserFS: hdd1: found reiserfs format "3.6" with standard journal
ReiserFS: hdd1: using ordered data mode
ReiserFS: hdd1: journal params: device hdd1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdd1: checking transaction log (hdd1)
ReiserFS: hdd1: Using r5 hash to sort names
ReiserFS: dm-0: found reiserfs format "3.6" with standard journal
ReiserFS: dm-0: using ordered data mode
ReiserFS: dm-0: journal params: device dm-0, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-0: checking transaction log (dm-0)
ReiserFS: dm-0: Using r5 hash to sort names
ReiserFS: dm-1: found reiserfs format "3.6" with standard journal
ReiserFS: dm-1: using ordered data mode
ReiserFS: dm-1: journal params: device dm-1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: dm-1: checking transaction log (dm-1)
ReiserFS: dm-1: Using r5 hash to sort names
ReiserFS: sda1: found reiserfs format "3.6" with standard journal
ReiserFS: sda1: using ordered data mode
ReiserFS: sda1: journal params: device sda1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda1: checking transaction log (sda1)
ReiserFS: sda1: Using r5 hash to sort names
eth0: Setting full-duplex based on MII#1 link partner capability of c5e1.
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 1
ReiserFS: sda1: warning: vs-5150: search_by_key: invalid format found in block 8211. Fsck?
ReiserFS: warning: is_tree_node: node level 0 does not match to the expected one 1
ReiserFS: sda1: warning: vs-5150: search_by_key: invalid format found in block 8211. Fsck?


The more I read over the log/dmesg files, the more I'm thinking that if I could keep the raid arrays from being autodetected at boot and just have them assembled from the config file, that would work.
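
(The autodetection in that dmesg output is done by the kernel md driver for type-fd partitions with 0.90 superblocks, not by mdadm itself. A rough sketch of how it could be turned off, assuming the arrays are then assembled from /etc/mdadm.conf by the mdraid init script; the kernel image name and paths below are just placeholders for my setup:)

Code:
# grub.conf: pass raid=noautodetect on the kernel line so md skips its autorun at boot
kernel /boot/kernel-2.6.18-gentoo-r6 root=/dev/hda3 raid=noautodetect

# then assemble everything listed in /etc/mdadm.conf (what the init script effectively does)
mdadm --assemble --scan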

user123
n00b

Joined: 28 Jun 2005
Posts: 14

PostPosted: Wed Jan 17, 2007 5:40 pm

Try to assemble the array manually instead.

Stop any running arrays using mdadm -S /dev/md0 (for example).

Use mdadm -E /dev/hda1 (also an example) to examine the md-superblock on /dev/hda1.
Do this for every partition you know is part of an array and figure out which devices belong to which array.

Say you've found out /dev/hda1, /dev/hdc1, /dev/hde1 are part of your raid 5 array.
Then just assemble it using mdadm --assemble /dev/md0 /dev/hda1 /dev/hdc1 /dev/hde1.
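
For the array in this thread that would look something like this (device names taken from the --detail output above):

Code:
mdadm -S /dev/md2                                   # stop whatever got partially assembled
mdadm -E /dev/sda1 | egrep 'UUID|Events|State'      # repeat for sdb1, sdc1, sdd1
mdadm --assemble /dev/md2 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1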

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Wed Jan 17, 2007 6:53 pm

Did that, and it all looks good. However, here is something odd: I ran --monitor on the new raid5 array, and check this out.

Code:
 # mdadm --monitor /dev/md2
Jan 17 13:49:19: SparesMissing on /dev/md2 unknown device

I don't have spares configured.

Code:
# mdadm --detail /dev/md2
/dev/md2:
        Version : 00.90.03
  Creation Time : Wed Jan 17 11:40:03 2007
     Raid Level : raid5
     Array Size : 234444288 (223.58 GiB 240.07 GB)
    Device Size : 78148096 (74.53 GiB 80.02 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Wed Jan 17 13:39:36 2007
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : f43616a6:b3a2de12:713ef095:8b48b649
         Events : 0.6

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1


any ideas what that message is referring to?

user123
n00b

Joined: 28 Jun 2005
Posts: 14

PostPosted: Wed Jan 17, 2007 7:55 pm

It might be something in /etc/mdadm.conf that confuses mdadm --monitor.
Just a wild guess though.
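
For example (hypothetical, just to illustrate what could trigger it): if the ARRAY line in /etc/mdadm.conf carries a spares= count, --monitor will report SparesMissing whenever the running array has fewer spares than that, even though the array itself was created without any.

Code:
# hypothetical mdadm.conf line: the spares=1 tells --monitor to expect a spare device
ARRAY /dev/md2 level=raid5 num-devices=4 spares=1 UUID=f43616a6:b3a2de12:713ef095:8b48b649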

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Fri Jan 19, 2007 2:50 pm

No, I'm at a loss at this point.
I've even tried to create the raid group using the newer metadata versions 1.0, 1.1 and 1.2.
For whatever reason I can't create the raid group at all unless I use the default 0.90.
Then, no matter what I try, upon reboot the raid group fails to come up, with bad superblock errors in dmesg.
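
(For the record, I asked for the newer metadata with the -e/--metadata option, something along these lines:)

Code:
mdadm --create /dev/md2 --metadata=1.2 --level=5 --raid-devices=4 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1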

here is the current config file

Code:
# cat /etc/mdadm.conf
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=1ee59ab3:7a3f2a33:65f30751:42108948
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=af96a87c:e43f81a0:1446f685:c5adeb47
ARRAY /dev/md2 level=raid5 num-devices=4 metadata=0.90 UUID=b7baefd6:8ae1bd7f:8212f91b:66a91381

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Mon Jan 22, 2007 7:34 pm

Still can't get this working.

Just for kicks I made it raid0, and it worked fine.

Tried raid6, and it failed on reboot as well.

Any ideas whether it's something in the kernel?

user123
n00b

Joined: 28 Jun 2005
Posts: 14

PostPosted: Mon Jan 22, 2007 9:48 pm

Try upgrading to 2.6.19.2.
If I recall correctly, earlier versions have a data corruption bug in dm/dm-crypt involving cancelled BIOs, which was fixed in 2.6.19.

The fix was backported to 2.6.18.6 as well.

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Tue Jan 23, 2007 5:10 pm

Went to 2.6.19-gentoo-r2, created the raid5 group again, formatted it with reiserfs, updated the mdadm.conf file, and rebooted.

...with this, again:

Code:
md: invalid superblock checksum on sda1
md: sda1 has invalid sb, not importing!
md: invalid superblock checksum on sdb1
md: sdb1 has invalid sb, not importing!
md: invalid superblock checksum on sdc1
md: sdc1 has invalid sb, not importing!
md: invalid superblock checksum on sdd1
md: sdd1 has invalid sb, not importing!
md: md2 stopped.
md: invalid superblock checksum on sdb1
md: sdb1 has invalid sb, not importing!
md: md_import_device returned -22
md: invalid superblock checksum on sdc1
md: sdc1 has invalid sb, not importing!
md: md_import_device returned -22
md: invalid superblock checksum on sdd1
md: sdd1 has invalid sb, not importing!
md: md_import_device returned -22
md: invalid superblock checksum on sda1
md: sda1 has invalid sb, not importing!
md: md_import_device returned -22

Code:

ReiserFS: md2: warning: sh-2006: read_super_block: bread failed (dev md2, block 2, size 4096)
ReiserFS: md2: warning: sh-2006: read_super_block: bread failed (dev md2, block 16, size 4096)
ReiserFS: md2: warning: sh-2021: reiserfs_fill_super: can not find reiserfs on md2




*sigh*

Making it again... but this time I will stop it and restart it before rebooting, to see if it comes back together.


Code:
 mdadm --detail /dev/md2                                                                                                 Tue Jan 23 12:59:50 2007

/dev/md2:
        Version : 00.90.03
  Creation Time : Tue Jan 23 12:53:07 2007
     Raid Level : raid5
     Array Size : 234444288 (223.58 GiB 240.07 GB)
    Device Size : 78148096 (74.53 GiB 80.02 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Jan 23 12:53:07 2007
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 6% complete

           UUID : 2db141f2:e2acc70c:8b2a3fed:48ee2ef6
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       4       8       49        3      spare rebuilding   /dev/sdd1


Last edited by Muddy on Tue Jan 23, 2007 5:47 pm; edited 1 time in total

user123
n00b

Joined: 28 Jun 2005
Posts: 14

PostPosted: Tue Jan 23, 2007 5:39 pm

1) Does it work if you skip the whole mdadm.conf part and assemble it manually on boot?

2) Are you sure you don't have a hardware error?
Faulty disks, faulty memory, a faulty or not fully plugged in PCI IDE controller or something?

You can test memory by emerging memtest86, adding it to grub.conf, and then booting it from grub.
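
A sketch of the grub.conf entry, assuming memtest86 ends up as /boot/memtest86/memtest.bin and /boot is the first partition (adjust the paths to your layout):

Code:
title memtest86
root (hd0,0)
kernel /memtest86/memtest.bin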

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Tue Jan 23, 2007 5:51 pm

Like I said, I have 2x raid1 groups under LVM and they're flawless.
The only thing I can think of is that maybe the sil3114 PCI SATA card or one of the drives is goofed... however, I've run dd to zero the disks so many times now without error that I'd guess it would have shown itself by now.
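
(The zero pass I've been running on each drive is roughly this, for what it's worth; it runs to the end of the disk every time without complaint:)

Code:
dd if=/dev/zero of=/dev/sda bs=1M    # and likewise for sdb, sdc, sdd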

user123
n00b

Joined: 28 Jun 2005
Posts: 14

PostPosted: Tue Jan 23, 2007 5:58 pm

Weird... and you created your LVM volume like this (assuming md2 is your raid5 array)?

Code:
pvcreate /dev/md2
vgcreate vgraid5 /dev/md2
...

...and not:

Code:
pvcreate /dev/hda2   (assuming /dev/hda2 is part of the raid5 array "md2")
pvcreate /dev/hdc2   (assuming /dev/hdc2 is part of the raid5 array "md2")
pvcreate /dev/hde2   (assuming /dev/hde2 is part of the raid5 array "md2")

Because in the latter case I could understand your corruption problems...

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Tue Jan 23, 2007 6:00 pm

Well, the lvm2 group is on md0 and md1, putting the two together.
Aside from that, yeah, similar to what you typed.
md2, the raid5 group that is killing me, is/was going to be a stand-alone volume, separate from the lvm2 stuff.

The more I think about it, the more I'm thinking udev is screwing me.
The UUID for the raid5 group changes each time I make it; should that be happening?

user123
n00b

Joined: 28 Jun 2005
Posts: 14

PostPosted: Tue Jan 23, 2007 6:22 pm

Hmm... very odd. :)

What if you do something like this:

Code:
# Zap old superblocks to minimize the risk of md getting confused
for part in sda1 sdb1 sdc1 sdd1; do mdadm --zero-superblock /dev/$part ; done

# Zap the partition tables and re-read them
for drive in sda sdb sdc sdd; do
    dd if=/dev/zero of=/dev/$drive count=1
    blockdev --rereadpt /dev/$drive
done

# Try to create the array, this time on whole disks rather than partitions, and with
# the superblock stored 4K into the drive instead (-e 1.2) -- just to make sure the
# problem isn't a read/write problem at the end of the disk or something
mdadm --create /dev/md2 -l5 -n4 -e 1.2 /dev/sdb /dev/sda /dev/sdd /dev/sdc

# Stop the array
mdadm -S /dev/md2

# Try assembling it again
mdadm -A /dev/md2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Create reiserfs on /dev/md2
mkreiserfs -q /dev/md2

# Stop the array
mdadm -S /dev/md2

# Reboot and see if the filesystem is still there after manually reassembling the device
reboot
# ...and then when the system comes up again:
mdadm -A /dev/md2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

Don't forget to remove the corresponding lines from /etc/mdadm.conf.

user123
n00b

Joined: 28 Jun 2005
Posts: 14

PostPosted: Tue Jan 23, 2007 6:32 pm

Also..

I think you said it never worked when you rebooted, but it did work if you manually stopped the array and re-assembled it.

Doesn't that indicate that something in either the shutdown or boot scripts messes with those devices?

How about flushing the buffers with the sync command, waiting a few seconds, doing a hard poweroff, booting into single-user mode, and then trying to manually reassemble the array to see if it gives any errors?

At least that would rule out the shutdown and init scripts.
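
Something along these lines, assuming the array is on the whole disks as created above (adjust the device list if you go back to partitions):

Code:
sync; sleep 5       # flush buffers, then hit the power switch
# power back on, append "single" to the kernel line at the grub prompt, then:
mdadm -A /dev/md2 /dev/sda /dev/sdb /dev/sdc /dev/sdd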

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Tue Jan 23, 2007 6:48 pm

I've not been able to re-assemble after a boot.
Waiting on it to finish the current sync, then trying some of what you said.
Will update when I know more.

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Tue Jan 23, 2007 8:13 pm

lmao, oh man, I've got some screwed-up stuff.

Finished building the array on /dev/sd[abcd]1 and then ran mdadm -S /dev/md2.

Then I ran the mdadm -A /dev/md2 command and it failed, but for kicks I just arrowed up and hit enter a bunch of times, and then the freak show started:
every few tries it would actually assemble with only one drive (a random one), and once it assembled with two... but never more.

I'm at a total loss.

8O

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Wed Jan 24, 2007 12:32 am

Cleaned all four drives, with no partitions at all this time; I ended up having to use the --force option on creation:


Code:


mdadm --create --force --verbose -e 1.2 /dev/md2 --level=5 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd


but it's building

Code:
# mdadm --detail /dev/md2
/dev/md2:
        Version : 01.02.03
  Creation Time : Tue Jan 23 19:44:01 2007
     Raid Level : raid5
     Array Size : 234451968 (223.59 GiB 240.08 GB)
    Device Size : 156301312 (74.53 GiB 80.03 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Jan 23 19:44:01 2007
          State : clean, resyncing
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

 Rebuild Status : 0% complete

           Name : 2
           UUID : 64ca03e9:56bd6c9e:1b2cf477:9fe18e19
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Wed Jan 24, 2007 2:53 am

No go, same crap.

Here is the state after the sync finished, before I did the mdadm -S on /dev/md2:

Code:
# mdadm --detail /dev/md2
/dev/md2:
        Version : 01.02.03
  Creation Time : Tue Jan 23 19:44:01 2007
     Raid Level : raid5
     Array Size : 234451968 (223.59 GiB 240.08 GB)
    Device Size : 156301312 (74.53 GiB 80.03 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Tue Jan 23 21:50:38 2007
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           Name : 2
           UUID : 64ca03e9:56bd6c9e:1b2cf477:9fe18e19
         Events : 2

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd
mustang ~ # cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sdd[3] sdc[2] sdb[1] sda[0]
      234451968 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
     
md1 : active raid1 hdc1[1] hdb1[0]
      30018112 blocks [2/2] [UU]
     
md0 : active raid1 hdg1[1] hde1[0]
      117218176 blocks [2/2] [UU]
     
unused devices: <none>


and the output from /var/log/messages

Code:

Jan 23 19:44:01 mustang RAID5 conf printout:
Jan 23 19:44:01 mustang --- rd:4 wd:4
Jan 23 19:44:01 mustang disk 0, o:1, dev:sda
Jan 23 19:44:01 mustang disk 1, o:1, dev:sdb
Jan 23 19:44:01 mustang disk 2, o:1, dev:sdc
Jan 23 19:44:01 mustang disk 3, o:1, dev:sdd
Jan 23 19:44:01 mustang md: resync of RAID array md2
Jan 23 19:44:01 mustang md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Jan 23 19:44:01 mustang md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
Jan 23 19:44:01 mustang md: using 128k window, over a total of 78150656 blocks.
Jan 23 19:44:01 mustang mdadm: NewArray event detected on md device /dev/md2
Jan 23 20:10:02 mustang mdadm: Rebuild20 event detected on md device /dev/md2
Jan 23 20:35:03 mustang mdadm: Rebuild40 event detected on md device /dev/md2
Jan 23 21:00:04 mustang mdadm: Rebuild60 event detected on md device /dev/md2
Jan 23 21:26:04 mustang mdadm: Rebuild80 event detected on md device /dev/md2
Jan 23 21:50:38 mustang md: md2: resync done.
Jan 23 21:50:38 mustang RAID5 conf printout:
Jan 23 21:50:38 mustang --- rd:4 wd:4
Jan 23 21:50:38 mustang disk 0, o:1, dev:sda
Jan 23 21:50:38 mustang disk 1, o:1, dev:sdb
Jan 23 21:50:38 mustang disk 2, o:1, dev:sdc
Jan 23 21:50:38 mustang disk 3, o:1, dev:sdd
Jan 23 21:50:38 mustang mdadm: RebuildFinished event detected on md device /dev/md2
mustang ~ #



Then I did the mdadm -S command

Code:
 # mdadm -S /dev/md2
mdadm: stopped /dev/md2


and tried to restart

Code:
 mdadm -A /dev/md2 /dev/sda /dev/sdb /dev/sdc /dev/sdd
mdadm: failed to add /dev/sdb to /dev/md2: Invalid argument
mdadm: failed to add /dev/sdc to /dev/md2: Invalid argument
mdadm: failed to add /dev/sdd to /dev/md2: Invalid argument
mdadm: failed to add /dev/sda to /dev/md2: Invalid argument
mdadm: /dev/md2 assembled from 0 drives - not enough to start the array.


from the log when I tried the restart

Code:
Jan 23 22:00:37 mustang md: md2 stopped.
Jan 23 22:00:37 mustang md: unbind<sdd>
Jan 23 22:00:37 mustang md: export_rdev(sdd)
Jan 23 22:00:37 mustang md: unbind<sdc>
Jan 23 22:00:37 mustang md: export_rdev(sdc)
Jan 23 22:00:37 mustang md: unbind<sdb>
Jan 23 22:00:37 mustang md: export_rdev(sdb)
Jan 23 22:00:37 mustang md: unbind<sda>
Jan 23 22:00:37 mustang md: export_rdev(sda)
Jan 23 22:00:37 mustang mdadm: DeviceDisappeared event detected on md device /dev/md2
Jan 23 22:00:56 mustang md: md2 stopped.
Jan 23 22:00:57 mustang md: invalid superblock checksum on sdb
Jan 23 22:00:57 mustang md: sdb has invalid sb, not importing!
Jan 23 22:00:57 mustang md: md_import_device returned -22
Jan 23 22:00:57 mustang md: invalid superblock checksum on sdc
Jan 23 22:00:57 mustang md: sdc has invalid sb, not importing!
Jan 23 22:00:57 mustang md: md_import_device returned -22
Jan 23 22:00:57 mustang md: invalid superblock checksum on sdd
Jan 23 22:00:57 mustang md: sdd has invalid sb, not importing!
Jan 23 22:00:57 mustang md: md_import_device returned -22
Jan 23 22:00:57 mustang md: invalid superblock checksum on sda
Jan 23 22:00:57 mustang md: sda has invalid sb, not importing!
Jan 23 22:00:57 mustang md: md_import_device returned -22
Jan 23 22:01:28 mustang md: md2 stopped.


any ideas?
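
(One sanity check I might try next: dump the superblock area right after the resync finishes and again after stopping the array, to see whether what's on disk actually changes; with 1.2 metadata the superblock sits 4K from the start of each device.)

Code:
mdadm -E /dev/sda
dd if=/dev/sda bs=4096 skip=1 count=1 2>/dev/null | md5sum   # repeat for sdb, sdc, sdd, before and after stopping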

drescherjm
Advocate

Joined: 05 Jun 2004
Posts: 2790
Location: Pittsburgh, PA, USA

PostPosted: Wed Jan 24, 2007 6:15 am

Do you have a recent version of mdadm installed?
_________________
John

My gentoo overlay
Instructions for overlay

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Wed Jan 24, 2007 3:43 pm

drescherjm wrote:
Do you have a recent version of mdadm installed?


mdadm-2.5.2

drescherjm
Advocate

Joined: 05 Jun 2004
Posts: 2790
Location: Pittsburgh, PA, USA

PostPosted: Wed Jan 24, 2007 6:03 pm

I thought that might be the problem, as a lot of updates/changes have occurred in Linux software raid over the last couple of kernels.

At home I am using that same version of mdadm with a 2.6.18 kernel, but that is raid1. At work we have many systems using a few different kernels and software raid 1, 5 and 6, but most of the systems still have sys-fs/mdadm-1.12.0 (I just checked), which is no longer in portage and which I assume has no support for raid5 reshaping.
_________________
John

My gentoo overlay
Instructions for overlay

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Wed Jan 24, 2007 6:10 pm

Yeah, I've been pulling out what's left of my hair for a bit now.
My best friend suggested genkernel, so I'm giving that a shot to see if it helps.

Muddy
Tux's lil' helper

Joined: 02 Jan 2003
Posts: 144
Location: U.S.

PostPosted: Mon Jan 29, 2007 9:35 pm

genkernel did not do anything different; I will do some other (outside Gentoo) trials and report back.

Laitr Keiows
Bodhisattva

Joined: 04 Jul 2005
Posts: 891
Location: Kobe, Japan

PostPosted: Tue Jan 30, 2007 8:13 pm

Muddy wrote:
and tried to restart

Code:
 mdadm -A /dev/md2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

any ideas?

What if you try to create the array again, with --create?
Page 1 of 2