Gentoo Forums
Lvm dm-cache initialization from initramfs
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Sun Nov 07, 2021 1:49 am    Post subject: Lvm dm-cache initialization from initramfs

Well, I shot myself in the foot by not testing this earlier, though as far as I can tell I'm not sure it would have mattered.

My homebrew initramfs fails to initialize dm-cache volumes.

After a crash and hard reboot, the initramfs would not detect volumes that were cached. lvdisplay would report that the volumes were not ready, or something to that effect. lvchange -a y VG0 would give me an error saying dm_cache_smq could not be modprobed.
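
A minimal sketch for checking whether an initramfs module tree contains the dm-cache pieces at all. The module list in the usage line is an assumption about what a typical kernel needs, not something verified on this machine, and missing_modules is a made-up helper name. Anything built into the kernel has no modules.dep entry, so it will be reported as missing even though it is fine.

```sh
#!/bin/sh
# Print every module from the argument list that has no entry in a modules.dep file.
# Hypothetical helper; the first argument is the path to modules.dep.
missing_modules() {
    dep_file=$1
    shift
    for mod in "$@"; do
        # Module file names may use '-' where the module name uses '_'.
        pattern=$(printf '%s' "$mod" | sed 's/[-_]/[-_]/g')
        # An entry looks like "kernel/drivers/md/dm-cache-smq.ko: deps...",
        # possibly with a compression suffix such as .xz, .gz or .zst.
        if ! grep -Eq "(^|/)${pattern}\.ko(\.[a-z0-9]+)?:" "$dep_file"; then
            printf '%s\n' "$mod"
        fi
    done
}
```

Run against the unpacked initramfs, for example: missing_modules lib/modules/KERNELVERSION/modules.dep dm_cache dm_cache_smq dm_persistent_data dm_bio_prison dm_bufio (KERNELVERSION being whatever kernel the initramfs carries). An empty result only says the files are listed, not that they load.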

So I grabbed another kernel that I had. Actually it's another Gentoo install on a hard drive that will boot on this machine. Using that disk, it easily pulled up all the LVs on the disk, cache enabled and all. Unfortunately it's going through some sort of LVM sync - like an mdadm sync but much, much slower... in fact the percentage done seems to keep going down. In the output of

lvs -a

the Cpy%Sync column starts near 99% and goes down to 92% after a while... and I mean a while - like hours.

However I can mount these with the other root so the data is not lost, just that my initramfs cannot mount these volumes!

So I know that this disk's kernel works, so I tried copying the kernel and modules and using those for my initramfs. As some of the drivers are built as modules, I had to let the initramfs fail, drop to a shell, and manually modprobe/insmod them. I got to the point where the initramfs shell was able to find and mount ONE of the three logical volumes on that array, and it enabled the cache for it too. It still needed to do that Cpy%Sync that keeps going down. Note that I had to run "lvm vgmknodes" a few times to get the metadata and cache volumes to show up, but just that one LV came up.

Still no sign of the other two logical volumes however. One of those two is critical - the root volume - if I got that one working, then I could just get things started again and let the sync finish in its own merry time, but nope...

So I find it weird, why was only one volume good enough to show up and be usable? I'm thinking it's some sort of race condition, but what?

Next course of action: Unfortunately this is on the clock. This is my main mail server and I can't have it down much longer. I've been trying to let it finish whatever it needs to do with the Cpy%Sync, but I'm not sure if it's actually completing or not. If it does complete, I could just disable the lvm cache and I think I could recover from there, but for now, whatever I do, I have to wait for the sync to finish.

But there are other drastic measures I could take... what if I just disconnect the cache device and force removal? I don't know what it's doing right now; it can't be doing writeback, as far as I know. Supposedly if all the cache data is redundant I should be able to simply sever the drive and use the main volumes as-is.

Any suggestions as to a course of action at this point to hasten recovery time? The main "machine" I need to get back up is actually a virtual machine inside one of the logical volumes. I suppose one option is to copy the VM images to another machine, but it's still a lot of work...

(While it is time critical, nobody's getting fired. It's just that it screwed up my home network infrastructure real bad.)

---

Update

Looks like CacheDirtyBlocks is some huge number. I think there are over a million or so total dirty blocks and it's slowly writing them back at a rate of about 40 per second...

I hope this is an LVM allocation unit and not a 4K or even 512-byte block. One allocation unit is 64-96K, so that would be about 2.5-3.8MB/sec, which is horrid even for spinning-rust 2TB SATA disks...
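
The arithmetic above can be sketched as a small shell function. It assumes the dirty-block count is in units of the 64-96K allocation unit mentioned above (not confirmed here), and drain_estimate is a made-up helper name. The figures of one million blocks and 40 blocks per second are the rough numbers from this post.

```sh
#!/bin/sh
# drain_estimate DIRTY_BLOCKS BLOCKS_PER_SEC BLOCK_KIB
# Prints the implied writeback rate and a rough time to flush everything.
# Integer arithmetic only, so results are rounded down.
drain_estimate() {
    dirty=$1
    rate=$2
    block_kib=$3
    kib_per_sec=$((rate * block_kib))
    secs=$((dirty / rate))
    printf '%d KiB/s, about %dh %dm to drain\n' \
        "$kib_per_sec" "$((secs / 3600))" "$((secs % 3600 / 60))"
}

drain_estimate 1000000 40 64   # prints "2560 KiB/s, about 6h 56m to drain"
drain_estimate 1000000 40 96   # prints "3840 KiB/s, about 6h 56m to drain"
```

The time to drain depends only on the block count and the per-second rate, so the block size changes the MB/sec figure but not the wait.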

Still have one more volume to flush after these. I was hoping to serialize them, but it doesn't look like head locality thrash affects things too much.

---

Update

I let the caches drain overnight and then removed the caches. Then restarted the machine.
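
The exact commands used are not given above. One way the drain-and-remove step could be done, going by lvmcache(7), is sketched below. VG0 is the volume group named earlier; the LV name "root" in the usage note is a placeholder, and run and uncache_lv are made-up helper names. By default the script only prints the commands (DRY_RUN=1) so nothing is touched until that is changed deliberately.

```sh
#!/bin/sh
# Sketch only. With DRY_RUN=1 (the default here) commands are printed,
# not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# uncache_lv VG/LV
uncache_lv() {
    vg_lv=$1
    # Show how many dirty cache blocks are still waiting for writeback.
    run lvs -a -o lv_name,cache_dirty_blocks "$vg_lv"
    # Switch to writethrough so no new dirty blocks pile up in the cache.
    run lvchange --cachemode writethrough "$vg_lv"
    # Flush whatever is left, then detach and delete the cache.
    run lvconvert --uncache "$vg_lv"
}
```

Calling uncache_lv VG0/root prints the three commands for that LV; with DRY_RUN=0 it would run them, and the last step can take as long as the drain itself.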

... it booted up fine!

So it looks like my initramfs does not set up caches properly or is missing something to allow this to happen seamlessly, but I don't know what. It does handle root on LVM (without cache) over RAID5 properly.

Now the question is... what's missing... or anyone know of some other initramfs that does handle cached lvm properly? Anyone using root on cached lvm (on RAID, but that I think I can handle separately...)?
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
Zucca
Moderator


Joined: 14 Jun 2007
Posts: 3896
Location: Rasi, Finland

Posted: Mon Nov 08, 2021 10:41 am

I would assume genkernel's initramfs can handle lvm cache as dm-cache is loaded by default.

Btw, have you explicitly set up your raid5 under your VG via mdraid, or have you created your LV(s) with lvm and told it to make it raid5?
_________________
..: Zucca :..

My gentoo installs:
init=/sbin/openrc-init
-systemd -logind -elogind seatd

Quote:
I am NaN! I am a man!
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Mon Nov 08, 2021 4:45 pm

Legacy install, the raid5 is mdraid and the mdraid is used for the pv/vgs.

Probably have to grab that and see what makes it tick, though now I'm sort of gun-shy of experimenting with it again, for fear it will crash again and take over a day to recover.

---

I think I got all the kernel modules. My fear is that I'm missing some tool and incantation in the initramfs. Right now the suspect is that I'm using an incomplete udev implementation - a homemade ad hoc solution much like how Neddy does things (for the most part). After taking a quick look at the linuxrc script, it appears that the genkernel initramfs uses the full udev suite. I sure hope I do not need to go that route :(
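
A quick way to rule out a missing userspace tool from the initramfs shell is sketched below. missing_tools is a made-up helper name, and the tool list in the usage note is a guess: LVM can call cache_check (from thin-provisioning-tools) when activating cached LVs, so it is one candidate, but whether it is the missing piece here is not established.

```sh
#!/bin/sh
# Print every tool from the argument list that cannot be found on PATH.
# Hypothetical helper; works in busybox sh as well as bash.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, missing_tools lvm dmsetup cache_check run inside the initramfs lists whichever of those three are absent. It does not test the udev side at all, so a clean result still leaves the udev suspicion open.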

---

I ended up doing cache experiments once more... damn the cache makes things run so much faster, spoiled by the cache.
Cut a new initramfs with a few more device mapper tools and hopefully this time there will be enough tools to set up lvm-dmcache...

Famous last words...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?