Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Sun Apr 14, 2024 4:17 pm Post subject: kexec ignores me! |
|
|
Reading the wiki and the kernel docs, I think I'd like to use "kexec" to speed up rebooting my system when I'm playing with my initramfs "init" script.
So I installed kexec-tools and edited /etc/kexec.conf (the ellipses are not in the file, I just removed some boring lines!):
Code: | # Kernel image pathname, relative from /boot.
KNAME="vmlinuz.new"
...
# --reuse-cmdline
# Use the current boot command line
...
KEXEC_OPT_ARGS="--reuse-cmdline" |
and /etc/conf.d/kexec:
Code: | # Load kexec kernel image into memory during shutdown instead of bootup
# (default: yes)
LOAD_DURING_SHUTDOWN="yes"
...
# Kernel image pathname, relative from BOOTPART.
...
KNAME="vmlinuz.new"
...
# Do not try to mount /boot
DONT_MOUNT_BOOT="yes" |
and started the kexec service.
Then I tried rebooting with
which from what I've read in the forums, and in the wiki, should automagically reboot using kexec. My rc.log file shows:
Code: | rc shutdown logging started at Sun Apr 14 16:47:09 2024
local | * Stopping local ...
...
alsasound | * Storing ALSA Mixer Levels ...
kexec | * Using kernel image /boot/vmlinuz.new for kexec ... [ ok ]
...
swap | * Deactivating swap devices ...
...
localmount | * Unmounting /home ...
[ ok ]
udev | * Stopping udev ...
[ ok ]
rc shutdown logging stopped at Sun Apr 14 16:47:10 2024 |
So it looks like the kexec service was invoked OK and didn't throw any errors.
But, of course, or I wouldn't be writing this, I got a common-or-garden reboot through BIOS and GRUB, with time for a sip of coffee or two, rather than the lightning-fast reboot I was hoping for.
Yes, my kernel has:
Code: | CONFIG_KEXEC_CORE=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
# CONFIG_KEXEC_SIG is not set
|
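For anyone wanting to repeat this check, here is a minimal sketch. It assumes the kernel configuration is available as a plain-text file; the function name check_kexec_config is made up for illustration and is not part of kexec-tools.

```shell
#!/bin/sh
# check_kexec_config FILE
# Hypothetical helper: prints, for each kexec-related option, whether it is
# set to "y" in the given plain-text kernel config file.
check_kexec_config() {
    cfg="$1"
    for opt in CONFIG_KEXEC_CORE CONFIG_KEXEC CONFIG_KEXEC_FILE; do
        if grep -q "^${opt}=y\$" "$cfg"; then
            echo "${opt}: enabled"
        else
            echo "${opt}: not enabled"
        fi
    done
}
```

It would be called as something like check_kexec_config /usr/src/linux/.config (that path is an assumption). Reading the running kernel's config from /proc/config.gz instead only works if the kernel was built with CONFIG_IKCONFIG_PROC.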
_________________ Greybeard |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Sun Apr 14, 2024 6:16 pm Post subject: |
|
|
Don't you need to use ? |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Sun Apr 14, 2024 7:02 pm Post subject: |
|
|
Not according to the comments in the forums. "reboot -k" says you can't do it in this runlevel, for all runlevels other than 6, which you normally reach via the shutdown command (or one of the many aliases involved here). You can do "reboot -kf", but that chops the system off at the knees, leaving open files and all sorts of nastiness.
<edit>I know the wiki says that, but it doesn't work, and there's a discussion in this forum article.
What's supposed to happen is that, with the kexec-tools package installed and a kexec-enabled kernel, under OpenRC, you start the "kexec" service and then do a normal shutdown. That service runs during shutdown, and a hook in OpenRC (and possibly in the sysv-init stuff, I'm trying unsuccessfully to wade my way through the twisty-turny maze of code here) is supposed to do the actual kexec thing once the system is nicely in bed and asleep - i.e. runlevel 6.
One possibility that's occurred to me is that it might be my use of rc_parallel="YES" in /etc/rc.conf - maybe the hook can't tell if kexec is required. I need to make a fairly trivial test... _________________ Greybeard |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Sun Apr 14, 2024 8:13 pm Post subject: |
|
|
So, what about your /etc/inittab?
Does level 6 reboot have the -k option?
Code: | ...
si::sysinit:/sbin/openrc sysinit
# Further system initialization, brings up the boot runlevel.
rc::bootwait:/sbin/openrc boot
l0u:0:wait:/sbin/telinit u
l0:0:wait:/sbin/openrc shutdown
l0s:0:wait:/sbin/halt.sh
l1:1:wait:/sbin/openrc single
l2:2:wait:/sbin/openrc nonetwork
l3:3:wait:/sbin/openrc default
l4:4:wait:/sbin/openrc default
l5:5:wait:/sbin/openrc default
l6u:6:wait:/sbin/telinit u
l6:6:wait:/sbin/openrc reboot
l6r:6:wait:/sbin/reboot -dkn
#z6:6:respawn:/sbin/sulogin
# new-style single-user
su0:S:wait:/sbin/openrc single
...
|
|
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Mon Apr 15, 2024 9:30 am Post subject: |
|
|
Yup, same list. _________________ Greybeard |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Mon Apr 15, 2024 11:38 am Post subject: |
|
|
Could it be that the kernel failed to load into memory before the reboot?
Is your vmlinuz.new on a partition that has already been unmounted by the time kexec tries to load it into memory? |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Mon Apr 15, 2024 2:22 pm Post subject: |
|
|
OK, after some tests, I find rather usefully that issuing:
does what I hoped (get into runlevel 6) rather than what I expected (either shutdown or reboot before I could do anything).
So at this point we've a command line shell running, everything stopped, only rootfs mounted and that read-only. So actually, a safe place to play.
Code: | kexec -l /boot/vmlinuz.new --reuse-cmdline |
and variants to that effect work. (Adding the parameter "-d" gives a load of incomprehensible information to show it loaded something.)
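A less noisy way to confirm that an image is staged is the kernel's own flag in sysfs, which reads 1 once a kexec image has been loaded. A minimal sketch follows; the function name is hypothetical, and the optional path argument exists only so the check can be tried against an ordinary file.

```shell
#!/bin/sh
# kexec_is_loaded [FLAG_FILE]
# Succeeds (exit status 0) if the flag file is readable and contains "1".
# Defaults to /sys/kernel/kexec_loaded, which the kernel sets to 1
# after a successful "kexec -l".
kexec_is_loaded() {
    flag_file="${1:-/sys/kernel/kexec_loaded}"
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}
```

Something like kexec_is_loaded && echo "image staged" would do after the load step. Note that this only shows the load succeeded; it says nothing about whether the jump into the new kernel will work.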
So, there's a kernel loaded ready to reboot into. However:
does indeed reboot, but damnably into BIOS and GRUB. So something's not right with my kernel.
Same for:
Code: | kexec -f /boot/vmlinuz.new --reuse-cmdline |
As an aside,
still says
Quote: | ERROR: using -k at this runlevel requires also -f
(You probably want instead to reboot normally and let your reboot
script, usually /etc/init.d/reboot, specify -k) |
So &deity; alone knows which runlevel it requires. None of the ones I can find! _________________ Greybeard |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Mon Apr 15, 2024 2:58 pm Post subject: |
|
|
I think that only switches the OpenRC system to the "soft" runlevel; init (PID 1) itself did not switch.
I assume your "reboot" came from the sysvinit package. You can fool "reboot" with: Code: | INIT_VERSION=1
RUNLEVEL=6
export INIT_VERSION RUNLEVEL
reboot -k
|
|
Hu Administrator
Joined: 06 Mar 2007 Posts: 23037
|
Posted: Mon Apr 15, 2024 3:10 pm Post subject: |
|
|
At the risk of diverting the thread, I want to note that for some initramfs testing, a throw-away qemu virtual machine with no attached network and no usable installed system can be a decent alternative. It can bring up the kernel and initramfs, run through the initramfs logic, and if you provide a dummy disk with appropriate partitions/LVM/LUKS (depending on what the initramfs wants), work through to where the initramfs tries to transfer control to the installed system. (It will likely fail at that point, but if you can reach the end of the initramfs, that may suffice to prove this is one you want to use for the real system.) This approach has the advantage of being very quick to iterate since it does not halt the original working system, so you can correct problems and rebuild readily. It has the disadvantage that you need to build some extra infrastructure to satisfy the initramfs's expectations about the environment. |
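A rough sketch of that kind of throw-away VM invocation follows. The helper name, the memory size and the serial-console kernel command line are assumptions for illustration, not something taken from the post above; the function only prints the command, so it can be inspected before anything is run.

```shell
#!/bin/sh
# qemu_initramfs_cmd KERNEL INITRAMFS
# Hypothetical helper: prints a qemu command line that boots the given kernel
# and initramfs with no network and the console on the terminal.
qemu_initramfs_cmd() {
    kernel="$1"
    initramfs="$2"
    printf 'qemu-system-x86_64 -m 1024 -nographic -nic none -kernel %s -initrd %s -append "console=ttyS0"\n' \
        "$kernel" "$initramfs"
}
```

For example, qemu_initramfs_cmd /boot/vmlinuz.new /boot/initramfs.new prints the command for those two files (the initramfs path is made up). A dummy disk for the initramfs to find would have to be added separately, for instance with a -drive option pointing at an image file.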
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Mon Apr 15, 2024 5:17 pm Post subject: |
|
|
Been reading the kexec kernel archives. There seems to be a regression in 6.7.6 on some AMD machines that introduces the observed behaviour. I'll pursue this avenue if poss. _________________ Greybeard |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Mon Apr 15, 2024 7:00 pm Post subject: |
|
|
Goverp wrote: | Been reading the kexec kernel archives. There seems to be a regression in 6.7.6 on some AMD machines that introduces the observed behaviour. I'll pursue this avenue if poss. |
Would Hu's suggestion of using a VM for testing make this easier to debug? If the VM emulates a non-AMD CPU (but still x86/x86-64), would that help give a definitive answer as to where the problem lies? |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Mon Apr 15, 2024 7:13 pm Post subject: |
|
|
Goverp wrote: | Been reading the kexec kernel archives. There seems to be a regression in 6.7.6 on some AMD machines that introduces the observed behaviour. I'll pursue this avenue if poss. | Another thought: I saw a suggestion in the kexec kernel archives to use early printk. Do you think it would help? |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Tue Apr 16, 2024 9:41 am Post subject: |
|
|
I'd have to install the various software layers for a VM. Early printk is probably the wrong end, I think the "kexec -c" call is failing (there's no output in syslog), and that's at the end of the reboot process, not the start of the next kernel.
The trouble is I've already spent a couple of days, say 8 hours, on this, to shave perhaps 20 seconds off my reboot time, so it already takes 1,440 reboots to recover the outlay. If I continue to work on this, it's for fame and glory, not profit!
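The break-even figure is easy to check. A quick sketch using the numbers from the paragraph above (8 hours spent, 20 seconds saved per reboot):

```shell
#!/bin/sh
# Seconds already spent, divided by seconds saved per reboot.
hours_spent=8
seconds_saved=20
breakeven=$(( hours_spent * 3600 / seconds_saved ))
echo "$breakeven"   # prints 1440
```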
<edit>Typo: I meant "kexec -e", not "kexec -c" above. _________________ Greybeard
Last edited by Goverp on Thu Apr 18, 2024 6:55 pm; edited 1 time in total |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3890 Location: Rasi, Finland
|
Posted: Thu Apr 18, 2024 4:34 pm Post subject: |
|
|
I guess I could test if this works with an openrc-init-based system. _________________ ..: Zucca :..
My gentoo installs: | init=/sbin/openrc-init
-systemd -logind -elogind seatd |
Quote: | I am NaN! I am a man! |
|
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Thu Apr 18, 2024 6:53 pm Post subject: |
|
|
Zucca wrote: | I guess I could test if this works with an openrc-init-based system. |
Thanks for the offer, but I doubt the init system is implicated. The "kexec -l" call works, and its debug output shows a kernel being loaded into storage. But an explicit "kexec -e" call from runlevel 6 reboots to BIOS not Linux for me. Of course, there's no output from that, or rather, it gets thrown away because in runlevel 6 there are no r/w resources!
I might try it with an older kernel to see if it is indeed the regression on AMD processors identified in the kexec mailing list. Been a bit too busy to try it over the last few days. _________________ Greybeard |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Fri Apr 19, 2024 7:46 pm Post subject: |
|
|
Goverp,
Have you tried booting directly from /boot/vmlinuz.new (I mean without kexec)? I don't recall whether we ruled out a plain bad kernel build.
Does your system boot with UEFI?
It just occurred to me that if your /boot/vmlinuz.new is an EFI stub kernel, that may be the reason we are not able to kexec boot. I don't have evidence yet (still researching the code), but I believe kexec does not support the PE32 header, so it will not understand how to boot into an EFI stub kernel. |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Fri Apr 19, 2024 10:21 pm Post subject: |
|
|
pingtoo wrote: |
Have you tried booting directly from /boot/vmlinuz.new (I mean without kexec)? I don't recall whether we ruled out a plain bad kernel build.
|
All the time - it's what I'm using at this very second.
Quote: |
Does your system boot with UEFI?
|
Yes
Quote: |
It just occurred to me that if your /boot/vmlinuz.new is an EFI stub kernel, that may be the reason we are not able to kexec boot. I don't have evidence yet (still researching the code), but I believe kexec does not support the PE32 header, so it will not understand how to boot into an EFI stub kernel. |
I'll give it a try without the EFI stub - it has one so that in extremis when GRUB breaks (as updates sometimes break it) I can boot with rEFInd or my BIOS's tools. I only added EFI support relatively recently when GRUB 2.06 AFAIR broke my system. (I should have left GRUB well alone, after all, it worked, what more is needed?) _________________ Greybeard |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Sat Apr 20, 2024 9:28 am Post subject: |
|
|
OK, tried the same kernel without EFIstub, still no joy. Again, kexec -l claims to load it OK, but kexec -e reboots to BIOS. I also tried with an older kernel, 6.6.8 or thereabouts (I'm on a different machine just now), though that one didn't have kexec support. It's not clear to me if kexec support is needed in the rebooting kernel or the rebooted kernel or both ...
Anyway, more debugging required. _________________ Greybeard |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Sat Apr 20, 2024 2:52 pm Post subject: |
|
|
Goverp wrote: | OK, tried the same kernel without EFIstub, still no joy. Again, kexec -l claims to load it OK, but kexec -e reboots to BIOS. I also tried with an older kernel, 6.6.8 or thereabouts (I'm on a different machine just now), though that one didn't have kexec support. It's not clear to me if kexec support is needed in the rebooting kernel or the rebooted kernel or both ...
Anyway, more debugging required. | Well, I had high hopes that this was it. From the source code, I am about 90% sure that kexec does not support booting an EFI binary, but I am reading the Linux source on the GitHub master branch; I don't know whether Gentoo has patched it to support booting EFI with kexec.
If we are going to debug this, I will need some detailed information so I can compare it with the kernel source code, see which steps were executed, and work out whether any configuration is missing.
So if you can share your .config for both kernels, that would be great. Also the load output with debug, as in , and please also run Code: | file /boot/*<kernel-image-name>* | so we can be sure of each kernel image's file format.
If possible, please also share dmesg after kexec -l ...; there should be something showing how the running kernel reacted to the load syscall.
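Since the image name may be a symbolic link, it can help to resolve it before describing it. A small sketch, assuming a readlink that supports -f; the helper name describe_kernel_image is made up for illustration.

```shell
#!/bin/sh
# describe_kernel_image PATH
# Hypothetical helper: resolves symlinks, then prints the real path and,
# if the "file" utility is installed, its description of the image.
describe_kernel_image() {
    real=$(readlink -f "$1") || return 1
    echo "resolved: $real"
    if command -v file >/dev/null 2>&1; then
        file "$real"
    fi
}
```

Running describe_kernel_image /boot/vmlinuz.new followed by dmesg | grep -i kexec would gather most of what is asked for here, though the dmesg filter may well print nothing if the kernel logged nothing about the load.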
P.S. Let's name the two kernels. I will call the currently running kernel "A" and the one to be booted "B".
As far as I can tell, only the "A" kernel requires kexec support; however, if you wish to use "B" (i.e. once you have kexec'd into "B") to kexec into "A" (or "C") as well, you should have in "B" |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Sun Apr 21, 2024 12:01 am Post subject: |
|
|
After further study of the kernel code, I must correct my earlier statement about kexec not supporting EFI binaries.
kexec actually does support EFI.
The only thing I have yet to find out is how it works. All I can tell is that it recognizes an EFI binary in bzImage format; I have not yet found out how it finds the kernel entry point in the bzImage, or whether it calls an EFI service to perform the reboot.
My apologies for any confusion I caused.
Goverp, I am sorry my suggestion led to extra work and time for you.
Goverp, if you are still willing to work with me, please continue with what I suggested in my post on debugging. I understand if you feel less confident in me and prefer not to. |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Sun Apr 21, 2024 2:33 pm Post subject: |
|
|
Pingtoo,
No problem, I'm grateful for any help! I'll get the answers for you later - it's a good day for gardening today, so I'll be out in the sun for now. _________________ Greybeard |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Mon Apr 22, 2024 3:43 pm Post subject: |
|
|
OK, here's my config, as a pastebin,
and here's the output from Code: | kexec -l /boot/vmlinuz.new --reuse-cmdline |
vmlinuz.new is a symbolic link, and file says of it:
Quote: | /boot/vmlinuz-6.8.7-git.new: Linux kernel x86 boot executable bzImage, version 6.8.7-git (packager@ryzen) #213 SMP Mon Apr 22 15:38:48 BST 2024, RO-rootFS, swap_dev 0XD, Normal VGA |
To sprinkle a little confusion over this, the kernel is compiled with clang and lto-thin and KCFLAGS="-march=native", but I built a clean kernel with pure gcc and no fancy KCFLAGS, still exactly the same result - kexec -e thinks for a bit, then reboots into BIOS and thence GRUB.
In all cases so far, A and B kernels are the same - i.e. the target of "kexec -l" is the kernel under which I'm running. I couldn't get any dmesg output from kexec -e, though I'll try again, and there's nothing in dmesg from kexec, or indeed anything from the reboot, though that's not surprising as nothing is writeable in runlevel 6. _________________ Greybeard |
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Mon Apr 22, 2024 4:03 pm Post subject: |
|
|
I tried a minor experiment, running "kexec -de" in single-user mode ("openrc single"). That causes a reboot rather than issuing a warning message saying "don't do that in this run mode", but (a) it was still to BIOS and GRUB, and (b) still nothing in syslog, though of course that's one of the services that wouldn't be running, even though my rootfs would still have been writeable. _________________ Greybeard |
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1433 Location: Richmond Hill, Canada
|
Posted: Mon Apr 22, 2024 6:55 pm Post subject: |
|
|
Goverp wrote: | OK, here's my config, as a pastebin,
and here's the output from Code: | kexec -l /boot/vmlinuz.new --reuse-cmdline |
vmlinuz.new is a symbolic link, and file says of it:
Quote: | /boot/vmlinuz-6.8.7-git.new: Linux kernel x86 boot executable bzImage, version 6.8.7-git (packager@ryzen) #213 SMP Mon Apr 22 15:38:48 BST 2024, RO-rootFS, swap_dev 0XD, Normal VGA |
To sprinkle a little confusion over this, the kernel is compiled with clang and lto-thin and KCFLAGS="-march=native", but I built a clean kernel with pure gcc and no fancy KCFLAGS, still exactly the same result - kexec -e thinks for a bit, then reboots into BIOS and thence GRUB.
In all cases so far, A and B kernels are the same - i.e. the target of "kexec -l" is the kernel under which I'm running. I couldn't get any dmesg output from kexec -e, though I'll try again, and there's nothing in dmesg from kexec, or indeed anything from the reboot, though that's not surprising as nothing is writeable in runlevel 6. |
My guess: your kernel seems to be compressed with zstd, but kexec-tools may only support gzip or lzma, so that may be the reason. Could you try rebuilding the kernel with gzip or lzma compression to see if this is the cause?
TL;DR: conflicting information between the kexec -l debug log and the .config.
From the .config you have CONFIG_KERNEL_ZSTD=y; however, the debug log shows "Try gzip decompression." followed by correct information about the kernel image, which from the source code's point of view seems impossible, because the code calls "slurp_decompress_file()" to decompress and load the file into memory. "slurp_decompress_file()" first tries zlib to check the file, then tries lzma to read it. From the log, it seems to me that zlib successfully opened the file, read some bytes and verified that they match the expected signature, and therefore continued reading to the end of the file and decompressed it.
The most confusing part for me is that the log shows a whole bunch of Code: | sym: sha256_starts info: 12 other: 00 shndx: 1 value: 11e0 size: 1f
sym: sha256_starts value: 82f2f81e0 addr: 82f2f7015
R_X86_64_64
... | That seems to indicate the content was successfully decompressed and read in, which from my reading of the code should not happen. There may be something I have missed.
Anyway, I hope you have the opportunity to try gzip (or lzma) kernel compression and use it with kexec -d -l ... to see if that makes any difference. |
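To see at a glance which compression a given build uses, the .config can be queried directly. A minimal sketch, assuming a plain-text config file; the helper name kernel_compression is hypothetical.

```shell
#!/bin/sh
# kernel_compression FILE
# Hypothetical helper: prints which CONFIG_KERNEL_<method> compression option
# is set to "y" in a plain-text kernel config, or "unknown" if none is.
kernel_compression() {
    cfg="$1"
    for method in GZIP BZIP2 LZMA XZ LZO LZ4 ZSTD; do
        if grep -q "^CONFIG_KERNEL_${method}=y\$" "$cfg"; then
            echo "$method"
            return 0
        fi
    done
    echo "unknown"
    return 1
}
```

Run against the config discussed here, it should print ZSTD; after a rebuild with gzip selected it should print GZIP, which makes it easy to confirm that the test kernel really was built the way intended.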
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2200
|
Posted: Tue Apr 23, 2024 11:22 am Post subject: |
|
|
pingtoo wrote: | ...
My guess: your kernel seems to be compressed with zstd, but kexec-tools may only support gzip or lzma, so that may be the reason. Could you try rebuilding the kernel with gzip or lzma compression to see if this is the cause?
...
That seems to indicate the content was successfully decompressed and read in, which from my reading of the code should not happen. There may be something I have missed.
... |
Good suggestion, but when I tried with Gzip, no change.
I'm not too surprised; I didn't think BIOS or EFI or GRUB decompressed the kernel, so I googled a bit and found the following in the Wikipedia article on the vmlinux files:
Quote: | Traditionally, when creating a bootable kernel image, the kernel is also compressed using gzip, or, since Linux 2.6.30,[3] using LZMA or bzip2, which requires a very small decompression stub to be included in the resulting image. The stub decompresses the kernel code, on some systems printing dots to the console to indicate progress, and then continues the boot process.
...
The bzImage file is in a specific format. It contains concatenated bootsect.o + setup.o + misc.o + piggy.o.[8] piggy.o contains the gzipped vmlinux file in its data section. The script extract-vmlinux found under scripts/ in the kernel sources decompresses a kernel image. Some distributions (e.g. Red Hat and clones) may come with a kernel-debuginfo RPM that contains the vmlinux file for the matching kernel RPM, and it typically gets installed under /usr/lib/debug/lib/modules/`uname -r`/vmlinux or /usr/lib/debug/lib64/modules/`uname -r`/vmlinux. |
which means that the image loaded by GRUB, the EFI stub, or kexec -l contains the above parts, and I guess setup.o decompresses piggy.o, not kexec.
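Compressed payloads can be recognised by their leading signature bytes, and as far as I know scripts/extract-vmlinux works on the same idea by searching the image for such signatures. The sketch below only looks at the start of a standalone file, so it would not find a payload embedded further inside a bzImage; the helper name payload_magic is made up for illustration.

```shell
#!/bin/sh
# payload_magic FILE
# Hypothetical helper: names the compression format suggested by the first
# bytes of FILE (gzip: 1f 8b, zstd: 28 b5 2f fd), or prints "unknown".
payload_magic() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \t\n')
    case "$magic" in
        1f8b*)    echo "gzip" ;;
        28b52ffd) echo "zstd" ;;
        *)        echo "unknown" ;;
    esac
}
```

Pointed at a vmlinux extracted and recompressed by hand, this would show which decompressor the boot stub needs; pointed at the bzImage itself it will most likely say "unknown", because the image starts with the boot sector and setup code rather than the compressed data.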
I had a look again on the kexec mailing list archives; there's no follow-up as yet on the problem with AMD hardware and kexec reported last month. I'll try contacting the author, who has taken the discussion off-line to avoid polluting the kernel mailing lists with discussions of how to perform git bisect... _________________ Greybeard |