lutel Tux's lil' helper
Joined: 19 Oct 2003 Posts: 110 Location: Pomroczna
Posted: Sun Jun 15, 2014 6:15 pm Post subject: trying to setup software RAID1 makes me crazy |
I've been convinced to use software RAID instead of fakeraid, and I have lost hours trying to boot the machine from software RAID (with both lilo and grub2). Here is my setup:
Code: |
# cat /etc/fstab
/dev/md0 /boot ext4 noauto,noatime 1 2
/dev/md1 / ext4 noatime 0 1
LABEL=var /var ext4 noatime 0 1
LABEL=lib /var/lib ext4 noatime 0 1
LABEL=log /var/log ext4 noatime 0 1
LABEL=data /w ext4 noatime 0 1
/dev/cdrom /mnt/cdrom auto noauto,ro 0 0
/dev/fd0 /mnt/floppy auto noauto 0 0
# parted /dev/sda (same for sdb, sdc)
GNU Parted 3.1
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p
Model: ATA WDC WD2000FYYZ-0 (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 1049kB 67.1MB 66.1MB ext4 boot bios_grub
2 67.1MB 15.7GB 15.7GB root raid
3 15.7GB 26.2GB 10.5GB tmp raid
4 26.2GB 36.7GB 10.5GB var raid
5 36.7GB 68.2GB 31.5GB lib raid
6 68.2GB 78.6GB 10.5GB log raid
7 78.6GB 2000GB 1922GB data raid
#mdadm --detail --scan
ARRAY /dev/md5 metadata=1.2 spares=1 name=livecd:5 UUID=98c62d6f:e552d93c:83f3fe37:63ca1585
ARRAY /dev/md4 metadata=1.2 spares=1 name=livecd:4 UUID=3a851e88:66872e12:d298bd0c:f2890ff7
ARRAY /dev/md3 metadata=1.2 spares=1 name=livecd:3 UUID=b1c712f8:f6aad7a9:342482f2:1fb24ff6
ARRAY /dev/md2 metadata=1.2 spares=1 name=livecd:2 UUID=fee31ef1:42726f23:514232bc:9c4bc4d1
ARRAY /dev/md1 metadata=1.2 spares=1 name=livecd:1 UUID=a724e247:4fff2180:57cd302f:42d0b052
ARRAY /dev/md0 metadata=0.90 spares=1 UUID=25a0a1e0:188a36da:cb201669:f728008a
#cat /etc/lilo.conf
compact
lba32
boot = /dev/md0
raid-extra-boot=mbr-only
menu-scheme = Wb
prompt
timeout = 50
delay = 0
root = /dev/md1
image = /boot/kernel-genkernel-x86_64-3.14.5-hardened-r2
label = 1
append="rootdelay=3,domdadm,real_root=a724e247-4fff2180-57cd302f-42d0b052"
initrd=/boot/initramfs-genkernel-x86_64-3.14.5-hardened-r2
image = /boot/kernel-genkernel-x86_64-3.14.5-hardened-r2
label = 2
initrd=/boot/initramfs-genkernel-x86_64-3.14.5-hardened-r2 |
After booting from the live DVD I always have to rebuild md0, as /dev/md0 is split into two arrays. Below is the output before the fix:
Code: | md126 : active raid1 sda1[0]
64448 blocks [2/1] [U_]
md127 : active raid1 sdb1[1] sdc1[0]
64448 blocks [2/2] [UU]
md1 : active raid1 sda2[0] sdc2[2](S) sdb2[1]
15286144 blocks super 1.2 [2/2] [UU]
md2 : active raid1 sda3[0] sdc3[2](S) sdb3[1]
10231680 blocks super 1.2 [2/2] [UU]
md3 : active raid1 sda4[0] sdc4[2](S) sdb4[1]
10231680 blocks super 1.2 [2/2] [UU]
md4 : active raid1 sda5[0] sdc5[2](S) sdb5[1]
30703488 blocks super 1.2 [2/2] [UU]
md5 : active raid1 sda6[0] sdc6[2](S) sdb6[1]
10231680 blocks super 1.2 [2/2] [UU]
cat .config | egrep "(_MD_|AHCI|TMPFS|EXT4)"
CONFIG_DEVTMPFS=y
# CONFIG_DEVTMPFS_MOUNT is not set
CONFIG_SATA_AHCI=y
# CONFIG_SATA_AHCI_PLATFORM is not set
# CONFIG_SATA_ACARD_AHCI is not set
CONFIG_MD_AUTODETECT=y
# CONFIG_MD_LINEAR is not set
# CONFIG_MD_RAID0 is not set
CONFIG_MD_RAID1=y
CONFIG_MD_RAID10=y
CONFIG_MD_RAID456=y
CONFIG_MD_MULTIPATH=m
CONFIG_MD_FAULTY=m
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
# CONFIG_EXT4_DEBUG is not set
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_TMPFS_XATTR=y
|
Here is my "fix" to get /dev/md0 back (although empty):
Code: | mdadm --stop /dev/md126
mdadm --stop /dev/md127
mdadm --create -e 0.9 /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1 --spare-devices=1 /dev/sdc1
mkfs.ext4 -L boot -m 0 -O ^huge_file /dev/md0
mdadm --detail --scan > /mnt/gentoo/etc/mdadm.conf
|
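If it helps, the result of a recreate like this can be sanity-checked with read-only commands before rebooting (device names as in the listing above):

```shell
# Verify the new superblock format and array health; nothing here writes.
mdadm --examine /dev/sda1 | grep -i version   # should now report 0.90
mdadm --detail /dev/md0                       # expect 2 active mirrors + 1 spare
cat /proc/mdstat                              # watch the initial resync finish
```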
Genkernel... Eventually I'd like to have just the kernel, without an initrd, and compile in all the necessary drivers. I tried a couple of times, always ending in a kernel panic: the system is unable to find root.
genkernel --mdadm --mdadm-config=/etc/mdadm.conf --install all
I tried grub2, but the machine reboots constantly with no messages visible. After booting with lilo, the system is not able to find root. When I enter the shell in the initrd, there are no /dev/md... entries. Here is the error message I get:
Code: | Determining root device...
!! Block device 901 is not a valid root device...
!! Could not find the root block device in . |
Please help, I'm out of ideas about what I'm doing wrong. I've tried grub2 and lilo, and I still can't boot. |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54300 Location: 56N 3W
Posted: Sun Jun 15, 2014 8:33 pm Post subject: |
lutel,
Code: | Could not find the root block device in . | should contain a list of block devices the kernel can see.
The list is empty; this probably means your low-level chipset driver is set to =M or off.
Post the output of lspci and put your kernel .config onto a pastebin please. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
lutel Tux's lil' helper
Joined: 19 Oct 2003 Posts: 110 Location: Pomroczna
Posted: Mon Jun 16, 2014 6:24 am Post subject: |
NeddySeagoon,
Thanks for pointing this out; here are my lspci and .config: http://pastebin.com/8Pafaarg
I don't know which driver I should compile in for this platform; I thought that with an initrd it would work even as a module.
But the best would be if it could be set up without an initrd at all. |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54300 Location: 56N 3W
Posted: Mon Jun 16, 2014 4:07 pm Post subject: |
lutel,
An initrd is just a cut-down root filesystem in a file.
They can work with everything as modules ... it all depends on how the initrd is made.
I roll my own initrd, then I have a chance to fix them when they don't work.
Your kernel looks like it should have got further than it did without an initrd to help it along.
Code: | # CONFIG_SATA_AHCI_PLATFORM is not set | May be needed.
I have two different motherboards with identical AMD chip sets. One needs that option, one does not.
Code: | # CONFIG_DEVTMPFS_MOUNT is not set | may be needed too as it allows the kernel to populate the /dev filesystem with no outside help.
How did you configure your kernel?
It's not a genkernel kernel.
I have not looked at your raid and filesystem settings in any detail but you have Code: | CONFIG_MD_AUTODETECT=y |
Be warned that that only works with raid sets that have metadata version 0.90. The default for mdadm is 1.2. If you told mdadm to use 0.90 metadata, kernel auto assembly will work for you.
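For what it's worth, the metadata version of each existing array can be checked read-only, e.g.:

```shell
# 0.90 here means kernel autodetect can assemble it; 1.2 needs mdadm/an initrd
mdadm --detail /dev/md0 | grep -i version
mdadm --detail /dev/md1 | grep -i version
```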
If not, you need an initrd to assemble your raid if you have root on raid. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
lutel Tux's lil' helper
Joined: 19 Oct 2003 Posts: 110 Location: Pomroczna
Posted: Mon Jun 16, 2014 7:35 pm Post subject: |
Hi NeddySeagoon,
Thanks, indeed I have 0.90 for /boot and 1.2 for /, so as you said the only option to assemble is to use an initrd. So here is what I did; I ran
genkernel --mdadm --mdadm-config=/etc/mdadm.conf --install --menuconfig all
enabled CONFIG_SATA_AHCI_PLATFORM and CONFIG_DEVTMPFS_MOUNT (my new .config: http://pastebin.com/T6QVHsgK), then ran lilo and tried to boot: still the same...
But when I enter the initramfs shell, I see only /dev/sdx0-6 and no /dev/md...; it seems the low-level drivers correctly detect my disks, but the raid is not assembled by mdadm.
As you said, 0.90 is required for auto-assembly. If I use 0.90 metadata for all partitions, will they be auto-assembled so that an initrd won't be required? If so, what's the disadvantage of having 0.90 instead of 1.2? |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54300 Location: 56N 3W
Posted: Mon Jun 16, 2014 7:59 pm Post subject: |
lutel,
Raid 0.90 metadata only supports up to 28 drives. That's not a problem for most people.
Swapping from 1.2 to 0.90 metadata will destroy the data on the raid but with care it can be done.
I'm not a genkernel user, so I can't help there.
You say you got the same output, but if /dev listed your block devices, the error Code: | !! Could not find the root block device in . | should also list all of your block devices. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
lutel Tux's lil' helper
Joined: 19 Oct 2003 Posts: 110 Location: Pomroczna
Posted: Mon Jun 16, 2014 8:18 pm Post subject: |
I'm getting closer, I think; I still get the error "!! Could not find the root block device in ."
But then I can enter the shell, and when I run mdadm -As, the array is built correctly!
So the drivers are there, but the initrd is still not building the array. I don't know why... |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54300 Location: 56N 3W
Posted: Mon Jun 16, 2014 8:31 pm Post subject: |
lutel,
In the rescue shell, you should be able to find and read the init script that the initrd tries to run.
If you execute the commands from the mdadm -A to the end of the script, in sequence, your system should boot.
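For reference, the tail end of a typical root-on-raid init script amounts to something like the following (the mount point and init path are generic assumptions, not necessarily genkernel's exact layout):

```shell
# Finishing the boot by hand from the rescue shell, approximately what the
# init script would have done; /newroot and /sbin/init are assumptions.
mdadm -As                              # assemble arrays from /etc/mdadm.conf
mount -o ro /dev/md1 /newroot          # mount the real root read-only
exec switch_root /newroot /sbin/init   # hand control to the real init
```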
I far prefer a hand-rolled initrd to a script-built one that I don't understand and can't fix. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
lutel Tux's lil' helper
Joined: 19 Oct 2003 Posts: 110 Location: Pomroczna
Posted: Mon Jun 16, 2014 8:43 pm Post subject: |
Finally I found out what I did wrong: I was passing the wrong parameters to the kernel...
Instead of
Code: | append="domdadm real_root=/dev/md1" |
I had been using
Code: | append="domdadm,real_root=/dev/md1" |
Such a stupid mistake...
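The failure mode makes sense: the kernel command line is split on whitespace, so the comma-joined form arrives at the init script as one token that matches nothing. A rough self-contained illustration (the parse function and variable names are mine, not genkernel's):

```shell
# Simulate how an init script scans the kernel command line (illustrative only;
# the real parser lives inside the initramfs init script).
parse() {
    root=""
    mdadm_enabled=no
    for arg in $1; do                 # word-splitting on whitespace
        case "${arg}" in
            domdadm)      mdadm_enabled=yes ;;
            real_root=*)  root="${arg#real_root=}" ;;
        esac
    done
}

parse 'domdadm,real_root=/dev/md1'    # comma-joined: one unknown token
echo "comma: mdadm=${mdadm_enabled} root='${root}'"

parse 'domdadm real_root=/dev/md1'    # space-separated: both recognized
echo "space: mdadm=${mdadm_enabled} root='${root}'"
```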
But anyway, I'll now try to reinstall everything on 0.90 metadata; I'd like to get rid of the initrd.
Thank you for your help so far. |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54300 Location: 56N 3W
Posted: Mon Jun 16, 2014 8:57 pm Post subject: |
lutel,
Don't reinstall. Move the install over.
Do this from a liveCD.
You have a working raid1.
Fail one partition of /, so your raid is degraded.
Remove the failed partition from the raid set.
Using the failed partition, make a new raid1 set with a missing drive (metadata 0.90).
Make a filesystem on this raid set.
You now have two degraded raid sets: one with your install on it, one empty.
Copy your install over.
Boot the new degraded raid.
Add the partition from the original install to the new raid set.
Look in /proc/mdstat to see it sync.
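The steps above might look like this for / (md1 on sda2+sdb2), run from the liveCD. The new array number (md10) and the mount points are assumptions; double-check every device name against your own layout before running anything:

```shell
# Migrate / from a 1.2-metadata raid1 to a fresh 0.90-metadata raid1.
mdadm /dev/md1 --fail /dev/sdb2 --remove /dev/sdb2   # degrade the old array
mdadm --create /dev/md10 -e 0.90 --level=1 --raid-devices=2 /dev/sdb2 missing
mkfs.ext4 /dev/md10                                  # filesystem on the new set
mkdir -p /mnt/old /mnt/new
mount /dev/md1 /mnt/old && mount /dev/md10 /mnt/new
cp -a /mnt/old/. /mnt/new/                           # copy the install over
umount /mnt/old /mnt/new
mdadm --stop /dev/md1                                # retire the old array
mdadm /dev/md10 --add /dev/sda2                      # re-add the old half
cat /proc/mdstat                                     # watch it resync
```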
You get to practice replacing a failed drive in a raid set too. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
lutel Tux's lil' helper
Joined: 19 Oct 2003 Posts: 110 Location: Pomroczna
Posted: Tue Jun 17, 2014 10:31 am Post subject: |
Hi NeddySeagoon,
I'll give it a try as you said, as soon as I get access to it. I've got some questions though; what do you think...?
- Are there any performance drawbacks from going to 0.90 metadata?
- I made the partitions with parted with type "RAID partition", so I can't mount /dev/sd... to copy data; I guess the only way to replace the metadata with your guide is to create another array, so the array numbering will change?
- I have no access to check it, but can I use md=1,/dev/sda1,/dev/sdb1 and so on to assemble arrays with 1.2 metadata?
- With append md=..., is it possible to define a spare, something like md=1,/dev/sda1,/dev/sdb1,/dev/sdc1 (the last one becoming the spare)?
If a spare cannot be set during booting, is it possible to set it up afterwards? |
szatox Advocate
Joined: 27 Aug 2013 Posts: 3150
Posted: Tue Jun 17, 2014 1:14 pm Post subject: |
lutel, the major change in going to the old metadata format is where the metadata is stored. It will be at the end of the disk now, so its size is limited to a smaller value than in the newer format, but its presence seems to be completely ignored by the filesystem and the partition table, so you can have grub on a partition created on top of a raid with 0.90 metadata.
Also, AFAIR the kernel has direct support for auto-assembling 0.90 raid but not 1.2 metadata, which is why root on the newer raid format requires an initramfs.
Finally, NeddySeagoon never suggested that you mount /dev/sd*. There is no reason to do that; /dev/sd* don't even exist for you any more, as they are enslaved by the raid. If you have a raid1 made of 2 hard drives, you first have to remove one drive from the array (so you can finally use it directly). You consider this disk free: you don't care about any data on it, and you still have a working raid, even though you have just crippled it a bit. So, you use that scavenged drive you have just removed from the old raid to create the new raid, the one you will be using later. Yes, the raid number will change. I suppose you can fix it later if you need to, but it must have a different number right now, since the old raid is still active. Copy the files from the old raid to the new raid, then destroy the old raid, add the second drive to the new raid, and in (most likely) a few hours md will repair your raid. You can continue using your PC during that time, though. |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54300 Location: 56N 3W
Posted: Tue Jun 17, 2014 7:53 pm Post subject: |
lutel,
What szatox just said :)
Raid sets are supposed to work with a missing drive - except raid0
You are taking advantage of that. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
|
Back to top |
|
|
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
|