Gentoo Forums
mdadm: can I back out of a raid 5 grow? (0%) [solved]
russK
l33t

Joined: 27 Jun 2006
Posts: 665

PostPosted: Sat Oct 22, 2016 7:33 pm    Post subject: mdadm: can I back out of a raid 5 grow? (0%) [solved]

I need some help from any mdadm experts, since I was shooting from the hip and not paying much attention.

I have a RAID5 with 7 disks, and I started adding an 8th disk (the disks are not very big in the grand scheme, ~640GB). I was going to grow the raid, but as you will see it is not progressing; I think it is waiting for me to back up critical data.

If I can somehow abort this operation that is not progressing anyway, I might be happier reshaping to RAID6. This is what I did:

Code:
# mdadm --add /dev/md127 /dev/sdf1
mdadm: added /dev/sdf1
ksp ~ # cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear]
md127 : active raid5 sdf1[7](S) sdb1[0] sdc1[5] sdd1[1] sde1[3] sda1[4] sdg1[6] sdh1[2]
      3750775296 blocks level 5, 64k chunk, algorithm 2 [7/7] [UUUUUUU]


At this point sdf1 was spare. I think that was my opportunity to reshape as RAID6. I did this instead:
Code:
# mdadm --grow --raid-devices=8 /dev/md127
mdadm: Need to backup 2688K of critical section..
ksp ~ # cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear]
md127 : active raid5 sdf1[7] sdb1[0] sdc1[5] sdd1[1] sde1[3] sda1[4] sdg1[6] sdh1[2]
      3750775296 blocks super 0.91 level 5, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
      [>....................]  reshape =  0.0% (1/625129216) finish=3255881.3min speed=0K/sec


I saw the message about needing to back up 2688K of critical section, but I was thinking it was just informational, not as in "hey dummy, you should tell me where to back up the critical data before this can commence".

I was assuming that the sync / reshape would move along.
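
For reference, it turns out the grow can also be pointed at an explicit backup file, something along these lines (the path is just an example, not what I actually ran):

Code:
# mdadm --grow --raid-devices=8 --backup-file=/root/md127-grow.backup /dev/md127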

Then I did something that I might regret. Since I have LVM on the raid device, I did a resize:
Code:
# pvs
  PV             VG   Fmt  Attr PSize   PFree
  /dev/md127     vg5  lvm2 a--    3.49t  1.04t
# pvresize /dev/md127
  Physical volume "/dev/md127" changed
  1 physical volume(s) resized / 0 physical volume(s) not resized
# pvs
  PV             VG   Fmt  Attr PSize   PFree
  /dev/md127     vg5  lvm2 a--    3.49t  1.04t


The pvresize said it did something, but I don't think anything really changed. Perhaps the md device will not actually be larger until the reshape is complete. I suspect the reshape is waiting for me to back up the critical data; that's why it is hanging around at 0%.
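
A rough way to sanity-check that (field names may differ slightly across LVM versions) would be to compare the raw device size with what LVM reports for the PV:

Code:
# blockdev --getsize64 /dev/md127
# pvs -o pv_name,pv_size,dev_size /dev/md127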

Does anyone know if I can either abort the reshape or change it to reshape as RAID6 instead?

Thanks very much

russk


Last edited by russK on Mon Oct 24, 2016 3:48 am; edited 1 time in total
frostschutz
Advocate

Joined: 22 Feb 2005
Posts: 2977
Location: Germany

PostPosted: Sat Oct 22, 2016 8:01 pm

stop the array then mdadm --assemble --update=revert-reshape (or was it reshape-revert?)
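
Something like this, using the device list from your mdstat (adjust as needed):

Code:
# mdadm --stop /dev/md127
# mdadm --assemble --update=revert-reshape /dev/md127 /dev/sd{b,d,h,e,a,c,g,f}1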



Do you have any security shenanigans? apparmor, selinux, systemd? They sometimes tend to mess with mdadm reshapes. It's a recurring issue on the linux-raid mailing list. If in doubt, try kicking it off from a SystemRescueCD or similar environment. Also try with the latest kernel and the latest version of mdadm (even the git version), in case any bugs got fixed in the meantime.



Quote:
I saw the message about needing to back up 2688K of critical section, but I was thinking it was just informational, not as in "hey dummy, you should tell me where to back up the critical data before this can commence".


It IS informational. If it actually required a backup file, it would tell you so.
russK
l33t

Joined: 27 Jun 2006
Posts: 665

PostPosted: Sat Oct 22, 2016 9:25 pm

frostschutz wrote:
stop the array then mdadm --assemble --update=revert-reshape (or was it reshape-revert?)


Thanks, frostschutz.

I see in the announcement for the mdadm 3.3 release that it is "--assemble --update=revert-reshape":
https://lwn.net/Articles/565591/

frostschutz wrote:
Do you have any security shenanigans? apparmor, selinux, systemd? They sometimes tend to mess with mdadm reshapes. It's a recurring issue on the linux-raid mailing list. If in doubt, try kicking it off from a SystemRescueCD or similar environment. Also try with the latest kernel and the latest version of mdadm (even the git version), in case any bugs got fixed in the meantime.

Of apparmor, selinux, systemd, I have only systemd (don't laugh).
I have sys-fs/mdadm-3.3.1-r2 and now kernel 4.4.26, so I think these are up to date.

It has been hours and it has not progressed one little bit, which I like in this case: the release note says revert-reshape can be used to revert a reshape that "has just been started". Funny how the mdadm --help and the man page say nothing about revert-reshape.

I am going to try stopping the raid and doing the revert-reshape, but I have to unmount some filesystems which involves logging out and stopping some things.
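
Roughly the plan (the mount point below is a placeholder for whatever I have mounted off vg5):

Code:
# umount /path/to/filesystems/on/vg5    # placeholder mount points
# vgchange -an vg5                      # deactivate the VG sitting on md127
# mdadm --stop /dev/md127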

Thanks.
russK
l33t

Joined: 27 Jun 2006
Posts: 665

PostPosted: Sun Oct 23, 2016 9:13 pm

Err, umm, that was really a bad result. mdadm was not pleased about not having the backup file; it was insisting that I provide one, which I did not have. The array refuses to start now: it believes that 8 devices belong in the array, but it knows things are not right. It is correct in that I had added sdf1 initially as a spare 8th device and then grew the array from 7 to 8 devices.

My patient is now code blue on the gurney.

Question: is this the right way to proceed?
I believe my only potential saving grace is that the reshape never seemed to actually do anything in terms of moving data around. I suspect that if I can recreate the layout with 7 devices, I could have a working array again. I have evidence in my logs from days ago about the previous RAID conf with 7 devices. I plan on taking an image of each device to practice rebuilding the array with 7, not 8. I don't want to work with the actual devices until I know that I can successfully rebuild the array.
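
For the imaging I'm thinking of something like this per device (the destination path is just an example; each image needs ~640GB of space):

Code:
# dd if=/dev/sdb1 of=/backup/sdb1.img bs=1M status=progress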

This is what I have from the logs:
Code:
Oct 22 12:45:37 ksp kernel: md/raid:md127: raid level 5 active with 7 out of 7 devices, algorithm 2
Oct 22 12:45:37 ksp kernel: RAID conf printout:
Oct 22 12:45:37 ksp kernel:  --- level:5 rd:7 wd:7
Oct 22 12:45:37 ksp kernel:  disk 0, o:1, dev:sdb1
Oct 22 12:45:37 ksp kernel:  disk 1, o:1, dev:sdd1
Oct 22 12:45:37 ksp kernel:  disk 2, o:1, dev:sdh1
Oct 22 12:45:37 ksp kernel:  disk 3, o:1, dev:sde1
Oct 22 12:45:37 ksp kernel:  disk 4, o:1, dev:sda1
Oct 22 12:45:37 ksp kernel:  disk 5, o:1, dev:sdc1
Oct 22 12:45:37 ksp kernel:  disk 6, o:1, dev:sdg1
Oct 22 12:45:37 ksp kernel: md127: detected capacity change from 0 to 3840793903104


And then later when adding sdf1:
Code:
Oct 22 13:51:15 ksp kernel: RAID conf printout:
Oct 22 13:51:15 ksp kernel:  --- level:5 rd:8 wd:8
Oct 22 13:51:15 ksp kernel:  disk 0, o:1, dev:sdb1
Oct 22 13:51:15 ksp kernel:  disk 1, o:1, dev:sdd1
Oct 22 13:51:15 ksp kernel:  disk 2, o:1, dev:sdh1
Oct 22 13:51:15 ksp kernel:  disk 3, o:1, dev:sde1
Oct 22 13:51:15 ksp kernel:  disk 4, o:1, dev:sda1
Oct 22 13:51:15 ksp kernel:  disk 5, o:1, dev:sdc1
Oct 22 13:51:15 ksp kernel:  disk 6, o:1, dev:sdg1
Oct 22 13:51:15 ksp kernel:  disk 7, o:1, dev:sdf1
Oct 22 13:51:15 ksp kernel: md: reshape of RAID array md127
Oct 22 13:51:15 ksp kernel: md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
Oct 22 13:51:15 ksp kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for reshape.
Oct 22 13:51:15 ksp kernel: md: using 128k window, over a total of 625129216k.


I believe that RAID conf above is from when I had used this mdadm command as shown in my first post:
Code:
# mdadm --grow --raid-devices=8 /dev/md127
mdadm: Need to backup 2688K of critical section..
ksp ~ # cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear]
md127 : active raid5 sdf1[7] sdb1[0] sdc1[5] sdd1[1] sde1[3] sda1[4] sdg1[6] sdh1[2]
      3750775296 blocks super 0.91 level 5, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
      [>....................]  reshape =  0.0% (1/625129216) finish=3255881.3min speed=0K/sec



Thanks for any pointers.
frostschutz
Advocate

Joined: 22 Feb 2005
Posts: 2977
Location: Germany

PostPosted: Sun Oct 23, 2016 9:41 pm

mdadm --examine /dev/sd* ?

Quote:
I plan on taking an image of each device to practice rebuilding the array with 7, not 8.


Imaging the disks is not wrong, but if they're not broken, an overlay would suffice.

https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file
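
The overlay setup from that page boils down to roughly this (sizes, names and chunk size are illustrative; the /dev/overlay/... paths in the command further down just stand for whatever you end up calling the maps):

Code:

for d in sdb1 sdd1 sdh1 sde1 sda1 sdc1 sdg1; do
    truncate -s 4G /tmp/overlay-$d                # sparse copy-on-write file
    loop=$(losetup -f --show /tmp/overlay-$d)     # back it with a loop device
    size=$(blockdev --getsz /dev/$d)              # real partition size in 512-byte sectors
    dmsetup create overlay-$d --table "0 $size snapshot /dev/$d $loop P 8"
done
# writes now land in /dev/mapper/overlay-*; the real partitions stay untouched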

Something like this could work.

Code:

mdadm --create /dev/md42 --assume-clean --metadata=0.90 --level=5 --raid-devices=7 --chunk=64 --layout=ls /dev/overlay/sd{b,d,h,e,a,c,g}1


Recreating is chancy business: you have to get all the settings right (mdadm defaults change over time and also depend on device size and on any grows you performed), so the command is a lot longer than when initially creating a new raid without data. For 1.2 metadata you'd also have to specify a data offset; that is missing here since you're still using archaic 0.90 metadata.
russK
l33t

Joined: 27 Jun 2006
Posts: 665

PostPosted: Sun Oct 23, 2016 10:49 pm

Thanks for the overlay tip; I'm studying it now. In the meantime:

frostschutz wrote:
mdadm --examine /dev/sd* ?

Code:
# mdadm --examine /dev/sd{b,d,h,e,a,c,g,f}1
/dev/sdb1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 20d9c312:4b4bbcfb:5f1eb124:da2485ac
  Creation Time : Wed Sep 23 22:17:22 2009
     Raid Level : raid5
  Used Dev Size : 625129216 (596.17 GiB 640.13 GB)
     Array Size : 3750775296 (3577.02 GiB 3840.79 GB)
   Raid Devices : 7
  Total Devices : 8
Preferred Minor : 127

  Reshape pos'n : 0
  Delta Devices : -1 (8->7)

    Update Time : Sun Oct 23 00:35:02 2016
          State : clean
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1897b1fb - correct
         Events : 16510576

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       17        0      active sync   /dev/sdb1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8      113        2      active sync   /dev/sdh1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8        1        4      active sync   /dev/sda1
   5     5       8       33        5      active sync   /dev/sdc1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8       81        7      active sync   /dev/sdf1
/dev/sdd1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 20d9c312:4b4bbcfb:5f1eb124:da2485ac
  Creation Time : Wed Sep 23 22:17:22 2009
     Raid Level : raid5
  Used Dev Size : 625129216 (596.17 GiB 640.13 GB)
     Array Size : 3750775296 (3577.02 GiB 3840.79 GB)
   Raid Devices : 7
  Total Devices : 8
Preferred Minor : 127

  Reshape pos'n : 0
  Delta Devices : -1 (8->7)

    Update Time : Sun Oct 23 00:35:02 2016
          State : active
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1897b21c - correct
         Events : 16510576

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       49        1      active sync   /dev/sdd1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8      113        2      active sync   /dev/sdh1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8        1        4      active sync   /dev/sda1
   5     5       8       33        5      active sync   /dev/sdc1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8       81        7      active sync   /dev/sdf1
/dev/sdh1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 20d9c312:4b4bbcfb:5f1eb124:da2485ac
  Creation Time : Wed Sep 23 22:17:22 2009
     Raid Level : raid5
  Used Dev Size : 625129216 (596.17 GiB 640.13 GB)
     Array Size : 3750775296 (3577.02 GiB 3840.79 GB)
   Raid Devices : 7
  Total Devices : 8
Preferred Minor : 127

  Reshape pos'n : 0
  Delta Devices : -1 (8->7)

    Update Time : Sun Oct 23 00:35:02 2016
          State : active
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1897b25e - correct
         Events : 16510576

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     2       8      113        2      active sync   /dev/sdh1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8      113        2      active sync   /dev/sdh1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8        1        4      active sync   /dev/sda1
   5     5       8       33        5      active sync   /dev/sdc1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8       81        7      active sync   /dev/sdf1
/dev/sde1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 20d9c312:4b4bbcfb:5f1eb124:da2485ac
  Creation Time : Wed Sep 23 22:17:22 2009
     Raid Level : raid5
  Used Dev Size : 625129216 (596.17 GiB 640.13 GB)
     Array Size : 3750775296 (3577.02 GiB 3840.79 GB)
   Raid Devices : 7
  Total Devices : 8
Preferred Minor : 127

  Reshape pos'n : 0
  Delta Devices : -1 (8->7)

    Update Time : Sun Oct 23 00:35:02 2016
          State : active
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1897b230 - correct
         Events : 16510576

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       65        3      active sync   /dev/sde1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8      113        2      active sync   /dev/sdh1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8        1        4      active sync   /dev/sda1
   5     5       8       33        5      active sync   /dev/sdc1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8       81        7      active sync   /dev/sdf1
/dev/sda1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 20d9c312:4b4bbcfb:5f1eb124:da2485ac
  Creation Time : Wed Sep 23 22:17:22 2009
     Raid Level : raid5
  Used Dev Size : 625129216 (596.17 GiB 640.13 GB)
     Array Size : 3750775296 (3577.02 GiB 3840.79 GB)
   Raid Devices : 7
  Total Devices : 8
Preferred Minor : 127

  Reshape pos'n : 0
  Delta Devices : -1 (8->7)

    Update Time : Sun Oct 23 00:35:02 2016
          State : active
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1897b1f2 - correct
         Events : 16510576

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8        1        4      active sync   /dev/sda1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8      113        2      active sync   /dev/sdh1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8        1        4      active sync   /dev/sda1
   5     5       8       33        5      active sync   /dev/sdc1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8       81        7      active sync   /dev/sdf1
/dev/sdc1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 20d9c312:4b4bbcfb:5f1eb124:da2485ac
  Creation Time : Wed Sep 23 22:17:22 2009
     Raid Level : raid5
  Used Dev Size : 625129216 (596.17 GiB 640.13 GB)
     Array Size : 3750775296 (3577.02 GiB 3840.79 GB)
   Raid Devices : 7
  Total Devices : 8
Preferred Minor : 127

  Reshape pos'n : 0
  Delta Devices : -1 (8->7)

    Update Time : Sun Oct 23 00:35:02 2016
          State : active
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1897b214 - correct
         Events : 16510576

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       33        5      active sync   /dev/sdc1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8      113        2      active sync   /dev/sdh1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8        1        4      active sync   /dev/sda1
   5     5       8       33        5      active sync   /dev/sdc1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8       81        7      active sync   /dev/sdf1
/dev/sdg1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 20d9c312:4b4bbcfb:5f1eb124:da2485ac
  Creation Time : Wed Sep 23 22:17:22 2009
     Raid Level : raid5
  Used Dev Size : 625129216 (596.17 GiB 640.13 GB)
     Array Size : 3750775296 (3577.02 GiB 3840.79 GB)
   Raid Devices : 7
  Total Devices : 8
Preferred Minor : 127

  Reshape pos'n : 0
  Delta Devices : -1 (8->7)

    Update Time : Sun Oct 23 00:35:02 2016
          State : active
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1897b256 - correct
         Events : 16510576

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     6       8       97        6      active sync   /dev/sdg1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8      113        2      active sync   /dev/sdh1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8        1        4      active sync   /dev/sda1
   5     5       8       33        5      active sync   /dev/sdc1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8       81        7      active sync   /dev/sdf1
/dev/sdf1:
          Magic : a92b4efc
        Version : 0.91.00
           UUID : 20d9c312:4b4bbcfb:5f1eb124:da2485ac
  Creation Time : Wed Sep 23 22:17:22 2009
     Raid Level : raid5
  Used Dev Size : 625129216 (596.17 GiB 640.13 GB)
     Array Size : 3750775296 (3577.02 GiB 3840.79 GB)
   Raid Devices : 7
  Total Devices : 8
Preferred Minor : 127

  Reshape pos'n : 0
  Delta Devices : -1 (8->7)

    Update Time : Sun Oct 23 00:35:02 2016
          State : active
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 1897b248 - correct
         Events : 16510576

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     7       8       81        7      active sync   /dev/sdf1

   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       8      113        2      active sync   /dev/sdh1
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8        1        4      active sync   /dev/sda1
   5     5       8       33        5      active sync   /dev/sdc1
   6     6       8       97        6      active sync   /dev/sdg1
   7     7       8       81        7      active sync   /dev/sdf1
russK
l33t

Joined: 27 Jun 2006
Posts: 665

PostPosted: Mon Oct 24, 2016 12:01 am

I performed this test scenario with encouraging results, using some spare (and very fast) storage to re-enact what happened with the larger array:

Code:
# for l in b d h e a c g f ; do lvcreate -n xd${l}1 -L 4G vg1 ; done

Code:
# mdadm --create /dev/md3 --metadata=0.90 --level=5 --raid-devices=7 --chunk=64 --layout=ls /dev/mapper/vg1-xd{b,d,h,e,a,c,g}1

Code:
# mdadm --add /dev/md3 /dev/mapper/vg1-xdf1

Code:
# mdadm --grow --raid-devices=8 /dev/md3

At this point, the 8-device test array was doing the same thing as the larger original array: it claimed it was reshaping, printed the same message about needing to back up the critical section, and did not actually progress.

So I created an LVM volume group on this array and a logical volume, formatted a filesystem on it, mounted it, and then created a file, just to know whether this could be done without destroying data.
Code:
#  pvcreate /dev/md3
# vgcreate vg3 /dev/md3
# lvcreate -n testfs -L 3G vg3
# mkfs -t ext4 /dev/mapper/vg3-testfs
# mount /dev/mapper/vg3-testfs /mnt/tmp
# date > /mnt/tmp/somefile.txt


Then I stopped the test array and did the '--assemble --update=revert-reshape' and it failed to run.
Code:
# mdadm --assemble --update=revert-reshape /dev/md3 /dev/mapper/vg1-xd{b,d,h,e,a,c,g,f}1
mdadm: Failed to restore critical section for reshape, sorry.
       Possibly you needed to specify the --backup-file
ksp ~ # cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear]
md3 : inactive dm-15[6](S) dm-16[7](S) dm-12[3](S) dm-13[4](S) dm-14[5](S) dm-11[2](S) dm-10[1](S) dm-9[0](S)
      33553920 blocks super 0.91
       
unused devices: <none>


Then I stopped it again and did the re-create like this:
Code:
# mdadm --stop /dev/md3
mdadm: stopped /dev/md3
ksp ~ # mdadm --create /dev/md3 --assume-clean --metadata=0.90 --level=5 --raid-devices=7 --chunk=64 --layout=ls /dev/mapper/vg1-xd{b,d,h,e,a,c,g}1
mdadm: /dev/mapper/vg1-xdb1 appears to be part of a raid array:
       level=raid5 devices=8 ctime=Sun Oct 23 19:03:27 2016
mdadm: /dev/mapper/vg1-xdd1 appears to be part of a raid array:
       level=raid5 devices=8 ctime=Sun Oct 23 19:03:27 2016
mdadm: /dev/mapper/vg1-xdh1 appears to be part of a raid array:
       level=raid5 devices=8 ctime=Sun Oct 23 19:03:27 2016
mdadm: /dev/mapper/vg1-xde1 appears to be part of a raid array:
       level=raid5 devices=8 ctime=Sun Oct 23 19:03:27 2016
mdadm: /dev/mapper/vg1-xda1 appears to be part of a raid array:
       level=raid5 devices=8 ctime=Sun Oct 23 19:03:27 2016
mdadm: /dev/mapper/vg1-xdc1 appears to be part of a raid array:
       level=raid5 devices=8 ctime=Sun Oct 23 19:03:27 2016
mdadm: /dev/mapper/vg1-xdg1 appears to be part of a raid array:
       level=raid5 devices=8 ctime=Sun Oct 23 19:03:27 2016
Continue creating array? y
mdadm: array /dev/md3 started.
ksp ~ # cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear]
md3 : active raid5 dm-15[6] dm-14[5] dm-13[4] dm-12[3] dm-11[2] dm-10[1] dm-9[0]
      25165440 blocks level 5, 64k chunk, algorithm 2 [7/7] [UUUUUUU]
     
unused devices: <none>
ksp ~ # pvscan
  PV /dev/md3         VG vg3   lvm2 [24.00 GiB / 21.00 GiB free]
  PV /dev/nvme0n1p3   VG vg1   lvm2 [225.00 GiB / 44.00 GiB free]
  Total: 2 [248.99 GiB] / in use: 2 [248.99 GiB] / in no VG: 0 [0   ]
ksp ~ # mount /dev/mapper/vg3-testfs /mnt/tmp
ksp ~ # cat /mnt/tmp/somefile.txt
Sun Oct 23 19:21:48 EDT 2016


I think this is very encouraging. I'm looking at doing this with the original array with overlays.
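
Assuming the overlay run checks out, the plan for the real thing is the same re-create against the actual partitions (same settings, just the real device names):

Code:
# mdadm --stop /dev/md127
# mdadm --create /dev/md127 --assume-clean --metadata=0.90 --level=5 --raid-devices=7 --chunk=64 --layout=ls /dev/sd{b,d,h,e,a,c,g}1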
russK
l33t

Joined: 27 Jun 2006
Posts: 665

PostPosted: Mon Oct 24, 2016 3:47 am

I'm relieved and happy to report that the re-creation worked on the overlays and then again for real. The array is humming along with 7 disks as if nothing ever happened. I'm very lucky to have dodged the bullet that I fired at my own foot. Don't try this at home, kids.


Thanks
frostschutz
Advocate

Joined: 22 Feb 2005
Posts: 2977
Location: Germany

PostPosted: Mon Oct 24, 2016 11:22 am

Good. If you want to try growing it again, use the latest kernel and the newest mdadm (v3.4 or, better yet, the current git build).

Also, this isn't so simple, but you should consider moving to 1.2 metadata arrays when you get the chance. 0.90 has lots of limitations and issues.
russK
l33t

Joined: 27 Jun 2006
Posts: 665

PostPosted: Fri Nov 04, 2016 9:08 pm

I had an interesting development on this front that I thought I would post in case it helps anyone in the future.

I decided to go ahead with the reshape to RAID6 after making sure I had good backups and had gotten to a good point in terms of the current usage of the array.

So I did the reshape like this:
Code:
mdadm --add /dev/md127 /dev/sdf1
mdadm --grow /dev/md127 --level=6 --backup-file=/root/backup-md127


The reshape was proceeding, as I could see in /proc/mdstat (as opposed to the original grow without the backup-file argument). It was going to take a couple of days, and it got to somewhere around 90% after about 2.5 days. At that point I needed to reboot for unrelated reasons. I thought to myself, "no problem, it will resume reshaping automatically."
Nope. After the reboot the array just sat there, inactive:
Code:
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear]
md127 : inactive sdb1[0](S) sdc1[5](S) sda1[4](S) sdf1[7](S) sdd1[1](S) sdh1[2](S) sdg1[6](S) sde1[3](S)
      5232087872 blocks super 0.91


So I carefully researched a little and did another stop and an assemble like this:
Code:
# mdadm --stop /dev/md127
# mdadm --assemble --backup-file=/root/backup-md127 /dev/md127 /dev/sd{b,d,h,e,a,c,g,f}1
mdadm: restoring critical section
mdadm: /dev/md127 has been started with 8 drives.
ksp ~ # cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid0] [raid1] [raid10] [linear]
md127 : active raid6 sdb1[0] sdf1[7] sdg1[6] sdc1[5] sda1[4] sde1[3] sdh1[2] sdd1[1]
      3750775296 blocks super 0.91 level 6, 64k chunk, algorithm 18 [8/7] [UUUUUUU_]
      [==================>..]  reshape = 92.9% (581083136/625129216) finish=229406.6min speed=0K/sec
      bitmap: 0/5 pages [0KB], 65536KB chunk


This seemed promising, but notice the speed there; as I watched, I could see that nothing was happening. I was suspicious, so I started using iostat like this:
Code:
# iostat -x 1 /dev/sd{b,d,h,e,a,c,g,f}
Linux 4.4.26-gentoo (ksp)    11/04/2016    _x86_64_   (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.89    0.04    0.77    0.25    0.00   97.05

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.28     0.00    0.30    0.05     3.94     3.00    39.23     0.00    6.44    5.88    9.65   4.44   0.16
sdb               0.29     0.00    0.28    0.05     3.03     3.00    36.58     0.00    9.89   11.40    1.85   6.38   0.21
sdd               0.24     0.00    0.32    0.05     3.97     3.00    37.03     0.00   11.03   12.46    2.22   2.92   0.11
sde               0.26     0.00    0.28    0.05     3.73     3.00    40.91     0.00    3.60    4.02    1.39   2.24   0.07
sdf               0.30     0.00    0.22    0.05     2.85     3.00    43.68     0.00    2.14    2.41    1.03   1.92   0.05
sdg               0.29     0.00    0.24    0.05     2.89     3.00    40.88     0.00    8.53   10.02    1.85   2.51   0.07
sdh               0.30     0.00    0.34    0.05     5.02     3.00    40.57     0.00    2.63    2.86    1.10   1.50   0.06
sdc               0.26     0.00    0.25    0.05     2.82     3.00    38.79     0.00    8.04    9.42    1.50   6.38   0.19


So, best I could figure, the reshape was in fact not doing anything. I googled around and found that someone with a slow reshape had done a command like this, which I then executed:
Code:
# echo max > /sys/block/md127/md/sync_max

(I was not worried about the I/O bandwidth since nothing else was using SATA in this box.)
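
For anyone hitting the same stall, these are the md sysfs entries involved (just a reference; they exist for any md array):

Code:
# cat /sys/block/md127/md/sync_action      # should say "reshape" while reshaping
# cat /sys/block/md127/md/sync_max         # a finite sector count here caps progress; "max" lifts the cap
# cat /sys/block/md127/md/sync_completed   # current position of the reshape in sectors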

At that point the reshape did resume in earnest; it was quite entertaining to do this:
Code:
# watch -n.1 iostat -x  /dev/sd{b,d,h,e,a,c,g,f}


Hopefully someday someone will find this useful, in the spirit of my favorite Linus Torvalds quote:
Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)

Regards