john.newman Tux's lil' helper
Joined: 17 Oct 2009 Posts: 85
Posted: Thu Nov 21, 2013 2:00 am Post subject: Machine hangs on S3 resume only with raid10 array active. ? |
Hello all,
This is something that has been bugging me for a while and I'd like to revisit it. I have been using software raid (mdadm) for the past couple of years, and I currently have the following setup:
Code: | ~ $ cat /proc/mdstat
Personalities : [raid1] [raid10] [multipath]
md5 : active raid10 sde[0] sdd[5] sdf[6] sdc[4]
5860530176 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
md3 : active raid1 sda3[0] sdb3[1]
4194240 blocks [2/2] [UU]
md2 : active raid1 sda2[0] sdb2[1]
245732608 blocks [2/2] [UU]
md1 : active raid1 sda1[0] sdb1[1]
131008 blocks [2/2] [UU] |
So md{1,2,3} are raid1 arrays on partitions of a pair of SSDs, providing /boot, /, and swap, while md5 is a raid10 across four hard drives providing /dev/vg1/lv{a bunch of stuff}.
When I use S3 suspend to ram, I have to run:
Code: | # /etc/init.d/{several services using handles on md5} stop
# vgchange -a n
# mdadm --manage --stop /dev/md5
# pm-suspend --quirk-none |
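As a rough sketch, the sequence above could be wrapped in a small script. The `SERVICES` list is a placeholder for whichever init scripts hold md5 open (they are not named here), and the `DRY_RUN` switch and the function names are assumptions added for illustration. With the default `DRY_RUN=1` the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: stop everything that holds the array open, then suspend.
# SERVICES is a placeholder; list the init scripts that use md5 there.
# DRY_RUN defaults to 1, so commands are printed unless DRY_RUN=0 is set.
SERVICES="${SERVICES:-}"
MD_DEV="${MD_DEV:-/dev/md5}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

suspend_with_array_stopped() {
    # Stop each service first; abort if one refuses to stop.
    for svc in $SERVICES; do
        run "/etc/init.d/$svc" stop || return 1
    done
    # Deactivate the volume groups, then stop the array itself.
    run vgchange -a n || return 1
    run mdadm --manage --stop "$MD_DEV" || return 1
    # Only suspend once the array is confirmed stopped.
    run pm-suspend --quirk-none
}

suspend_with_array_stopped
```

Aborting on the first failed step is a deliberate choice here: if a service or the array cannot be stopped, the machine does not suspend with md5 still active.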
And everything comes back perfectly fine. If I do not stop the raid array, the system POSTs and appears alive, but before long it hangs and I have to use the power button. Neither SSH nor USB input works...
I've looked around but I can't find any recent kernel bugs in the raid stack. Personally I can't see how it would be an issue with the raid itself; it may be something else, perhaps a side effect of the raid. Does anybody have any ideas as to what could be going on? I'm sure there's a kernel trace, but I don't know how to capture it when I can't see it.
I can try to pare it down a bit further, but the problem with a trial-and-error approach here is that it runs the risk of data loss, and many times the raid10 ends up having to resync, which takes about 6 hours...
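Since each failed attempt can leave the raid10 resyncing, one way to avoid starting another test too early is to check /proc/mdstat first. This is a minimal sketch, assuming the usual mdstat layout (a `[U_]`-style member map and a `resync =` or `recovery =` progress line); the function name `md_busy_arrays` is made up for illustration.

```shell
#!/bin/sh
# Print the name of every md array that is degraded or still syncing.
# Reads /proc/mdstat by default; another file can be passed for testing.
md_busy_arrays() {
    awk '
        # Remember which array the following status lines belong to.
        /^md[0-9]+ :/ { name = $1; reported = 0 }
        # A "_" in the member map (e.g. [U_]) marks a missing device;
        # "resync =" or "recovery =" marks a sync in progress or pending.
        (/\[[U_]*_[U_]*\]/ || /(resync|recovery) *=/) && name != "" && !reported { print name; reported = 1 }
    ' "${1:-/proc/mdstat}"
}

# Only report when the file exists (i.e. on a Linux box with md loaded).
if [ -r /proc/mdstat ]; then
    busy="$(md_busy_arrays)"
    if [ -n "$busy" ]; then
        echo "arrays not clean:"
        echo "$busy"
    else
        echo "all arrays clean"
    fi
fi
```

If this prints any array name, it is probably safer to wait for the sync to finish before trying another suspend cycle.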
Thanks |