nagmat84 Apprentice
Joined: 27 Mar 2007 Posts: 272
Posted: Sun Nov 17, 2024 10:29 am Post subject: Kernel doesn't load on Lenovo X1 Carbon 12th Gen (no output)
I have a Lenovo X1 Carbon 12th Gen, and the Linux kernel hangs immediately after loading. I probably missed an important kernel config option, but I have no clue which one.
Configuration
Boot manager: rEFInd (rEFInd does load)
Kernel: 6.11.8 with EFI stub support
Partition layout: nvme0n1p1 (EFI System), ..., nvme0n1p4 (Linux Swap), nvme0n1p5 (Linux Root), nvme0n1p6 (Linux Home)
Output on console after invoking Gentoo from rEFInd
Code: | Starting vmlinuz-6.11.8-gentoo.efi
Using load options 'udev.log_priority=5 emergency initrd=\EFI\Gentoo\initramfs-6.11.8.img'
EFI stub: Loaded initrd from command line option |
After that: nothing (no disk activity, no flashing keyboard lights, nada, ...)
The kernel does not use any modules; everything is built in statically. The initramfs does not contain anything critical; it only contains the firmware for the Bluetooth adapter, the Wi-Fi adapter and the Intel microcode.
Some (not all) potentially relevant kernel settings:
Code: | CONFIG_FB=y
CONFIG_FB_EFI=y
CONFIG_FB_SIMPLE=y
CONFIG_DRM=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_I915=y
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
CONFIG_BLK_DEV_NVME=y
CONFIG_NVME_MULTIPATH=y
CONFIG_NVME_HWMON=y |
Link to complete kernel config: https://pastebin.com/L7XDLUVf
Note that the recent Gentoo Minimal Install Image does actually boot in UEFI mode, so it is probably something wrong with my kernel. What am I missing?

pietinger Moderator
Joined: 17 Oct 2006 Posts: 5149 Location: Bavaria
Posted: Sun Nov 17, 2024 11:02 am
I can see from your kernel configuration that you have some experience with it. Everything is configured for the console and also for accessing the hard disk. But it could be that you have a special PCI controller. Boot again with the MinimalCD and do a “lspci -nnk” and check if you need the kernel module “vmd” (which is very often used in Intel machines). If so, this is exactly what is missing (# CONFIG_VMD is not set). (See also: https://wiki.gentoo.org/wiki/User:Pietinger/Experimental/Manual_Configuring_Current_Kernel#Accessing_the_Root_Partition )
Also missing:
Code: | # CONFIG_FB_EFI is not set
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_X86_INTEL_LPSS is not set |
Without this option, these options (which you have already enabled) will never work:
Code: | CONFIG_MFD_INTEL_LPSS=y
CONFIG_MFD_INTEL_LPSS_PCI=y |
Maybe you also need (depends on your machine):
Code: | # CONFIG_EFI_HANDOVER_PROTOCOL is not set |
_________________ https://wiki.gentoo.org/wiki/User:Pietinger |

nagmat84 Apprentice
Joined: 27 Mar 2007 Posts: 272
Posted: Sun Nov 17, 2024 11:13 am
I assume I do not need the kernel module "vmd". lspci from the installation media does not list it. Here is the pastebin for lspci: https://pastebin.com/Y0JuUngQ

pietinger Moderator
Joined: 17 Oct 2006 Posts: 5149 Location: Bavaria
Posted: Sun Nov 17, 2024 11:23 am
Uii ... a brand new MeteorLake ... please boot our LiveCD (not the MinimalCD, because of the graphics driver), check which firmware files are needed with "dmesg | grep firmware", and put them into CONFIG_EXTRA_FIRMWARE. Does this help (together with my already mentioned changes)?

nagmat84 Apprentice
Joined: 27 Mar 2007 Posts: 272
Posted: Sun Nov 17, 2024 11:44 am
I enabled CONFIG_VMD and CONFIG_EFI_HANDOVER_PROTOCOL to give it a try. No luck with that. (But I did expect that.)
I will try the Live Image. I am downloading ... However, please note that the Minimal Installation CD is able to boot and uses i915. So I doubt it is a firmware problem, but I will try anyway.
The minimal installation CD loaded the following FW files:
- regulatory.db
- regulatory.db.p7s
- iwlwifi-ma-b0-gf-a0-83.ucode
- iwlwifi-ma-b0-gf-a0.pnvm
- iwl-debug-yoyo.bin
- intel/ibt-0180-0041.sfi
- intel/ibt-0180-0041.ddc
- intel/sof-ipc4/mtl/sof-mtl.ri
I have already put those FW files into my initramfs, except for iwl-debug-yoyo.bin, because I wasn't able to find out which Gentoo package provides that one. Note that the minimal installation CD is able to use FB on i915 without any graphics FW. Hence, I expect the Live Image to show nothing else.

pietinger Moderator
Joined: 17 Oct 2006 Posts: 5149 Location: Bavaria
Posted: Sun Nov 17, 2024 12:08 pm
nagmat84 wrote: | [...] Note that the minimal installation CD is able to use FB on i915 without any graphics FW. Hence, I expect the Live Image to show nothing else. |
Yes ... but you have built your i915 statically into the kernel (and then the firmware files must also already be in your kernel; your initramfs would be too late). The LiveCD has i915 as a module and can therefore load the firmware files from /lib/firmware (because all modules which are NOT statically configured are loaded/initialized AFTER the kernel has access to the root partition).
P.S.: The option CONFIG_X86_INTEL_LPSS is the most important one (because it initializes some chipsets).
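Following the advice above, firmware for statically built drivers is embedded through CONFIG_EXTRA_FIRMWARE. Below is a minimal Python sketch of how the two .config lines could be assembled. It assumes the firmware names reported earlier in this thread (without iwl-debug-yoyo.bin, which could not be located) and the usual directory /lib/firmware. The helper names are made up for illustration, and the list contains no i915 files because none were reported here.

```python
from pathlib import Path

# Firmware names as reported earlier in this thread (assumption: they
# live under /lib/firmware with exactly these relative paths).
FIRMWARE = [
    "regulatory.db",
    "regulatory.db.p7s",
    "iwlwifi-ma-b0-gf-a0-83.ucode",
    "iwlwifi-ma-b0-gf-a0.pnvm",
    "intel/ibt-0180-0041.sfi",
    "intel/ibt-0180-0041.ddc",
    "intel/sof-ipc4/mtl/sof-mtl.ri",
]


def extra_firmware_lines(names, firmware_dir="/lib/firmware"):
    """Return the two .config lines that embed the given firmware files."""
    joined = " ".join(names)
    return [
        f'CONFIG_EXTRA_FIRMWARE="{joined}"',
        f'CONFIG_EXTRA_FIRMWARE_DIR="{firmware_dir}"',
    ]


def missing_files(names, firmware_dir="/lib/firmware"):
    """Return the names that do not exist as files under firmware_dir."""
    root = Path(firmware_dir)
    return [name for name in names if not (root / name).is_file()]
```

Printing `extra_firmware_lines(FIRMWARE)` gives the lines to compare against "make menuconfig"; `missing_files(FIRMWARE)` is worth checking first, since every listed file has to exist when the kernel is built.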

nagmat84 Apprentice
Joined: 27 Mar 2007 Posts: 272
Posted: Sun Nov 17, 2024 1:43 pm
I tried two other things, both without success:
I compiled my kernel with i915 as a module, in case it was the culprit and was being loaded too early. There is no difference; the kernel hangs directly on or after loading the initramfs.
I booted from the Live CD. The module intel_vpu of the Live CD tried to load some additional firmware: "vpu_37xx.bin", "mtl_vpu.bin" and "intel/vpu/vpu_37xx_v0.0.bin". However, this failed as the FW was not available. But this does not seem to be related to graphics.

pietinger Moderator
Joined: 17 Oct 2006 Posts: 5149 Location: Bavaria
Posted: Sun Nov 17, 2024 4:24 pm
nagmat84 wrote: | [...] There is no difference, the kernel hangs directly on or after loading the initramfs. |
Hmmm ... problem is I have no experience with refind ... and maybe we must check the init-script of the initramfs ...
nagmat84 wrote: | I booted from the Live CD. The module intel_vpu [...] But this does not seem to be related to graphics. |
Yes, it is not related to graphics. VPU is the (old) module name of Intel's NPU (this is a CPU-integrated inference accelerator) ->
Code: | Device Drivers --->
[*] Compute Acceleration Framework --->
<*> Intel NPU (Neural Processing Unit) |

nagmat84 Apprentice
Joined: 27 Mar 2007 Posts: 272
Posted: Sun Nov 17, 2024 4:54 pm
pietinger wrote: |
Hmmm ... problem is I have no experience with refind ... and maybe we must check the init-script of the initramfs ...
|
I don't believe that the problem is related to refind. The problem remains the same when I try to boot the EFI stub directly from firmware. I have used refind several times now and never had those kinds of problems.
I installed a distribution kernel and can boot that one from refind with a bit more (but not full) success:
- iwlwifi returns the error: "0000:00:14.3: Not valid error log pointer 0x002798A8 for RT uCode"
- A kernel bug with: "BUG kernel NULL pointer dereference; #PF: supervisor write access in kernel mode; #PF: error_code(0x0002) - not-present page". Immediately above those lines is a warning from "skl_hda_dsp_generic" which says "ASoC: Parent card not yet available, widget card binding deferred". However, I do not know if the kernel bug is related to that preceding line or if they are unrelated.
- Immediately after that, there is an Oops from udev-worker with "RIP: 0010:sof_ipc4_update_cpc_from_manifest+0x13a/0x5160 [snd_sof]"
- Finally, SDDM (my display manager) fails to start with the error "no display found"
But at least I am able to get a console.
pietinger wrote: |
Yes, it is not related to graphics. VPU is the (old) modul name of Intels NPU (this is a CPU-integrated inference accelerator) ->
Code: | Device Drivers --->
[*] Compute Acceleration Framework --->
<*> Intel NPU (Neural Processing Unit) |
|
Fine, but the dmesg output still attributes it to the DRM (Direct Rendering Manager). Is it possible that the failure to load the VPU firmware also prevents the graphics card from functioning?

pietinger Moderator
Joined: 17 Oct 2006 Posts: 5149 Location: Bavaria
Posted: Sun Nov 17, 2024 10:23 pm
nagmat84 wrote: | [...] Is it possible that the failure to load the VPU firmware also prevents the graphics card from functioning? |
AFAIK: No. I know some settings without any VPU (NPU) module.
But now that you have a console, we can inspect more. I would like to see your current kernel .config and the complete output of "dmesg", and I would like to ask how you created your kernel .config (make oldconfig from a previous one, completely new ... or edited an old one?).

nagmat84 Apprentice
Joined: 27 Mar 2007 Posts: 272
Posted: Sun Nov 17, 2024 10:59 pm
This is not a direct answer to the previous post, because I was busy trying different things, but I made some progress.
- gentoo-sources-6.11.8 with my custom configuration does not boot at all. This is the variant which this thread has been about originally and whose config is in the pastebin above.
- gentoo-kernel-6.6.58 (stable) with the configuration that is shipped with the ebuild produces the kernel null pointer error and the Oops which I wrote about in my previous posting. I don't get a graphical login, because sddm cannot find a usable display, but at least I get a console.
- gentoo-kernel-6.11.8 (testing) with the configuration that is shipped with the ebuild simply works. There are no errors in dmesg, no firmware problems and I get a graphical output.
Note: Refind invokes all three kernels in the exact same way, and all three kernels use initramfs images which include the exact same set of files. I assume that proves that the problem actually lies within the kernel and is not a problem due to refind or some missing firmware files.
I guess one can assume that 6.6.58 might just be too old for my hardware. The null pointer dereference within the kernel indicates a kernel bug. So let's forget that kernel version. I only emerged it accidentally, because I forgot to unmask ~amd64 for gentoo-kernel.
So the interesting question is what is the crucial difference between gentoo-sources-6.11.8 with my custom configuration and gentoo-kernel-6.11.8 with the distribution-provided configuration that makes the first one unbootable and the latter one working smoothly?
Quote: | I would like to see your current kernel .config and the complete output of "dmesg" and I would like to ask how you have created your kernel .config? |
My custom kernel config for gentoo-sources-6.11.8 which does not work is in the pastebin. The working config for gentoo-kernel-6.11.8 is the distribution provided configuration.
I have been using an older generation of the same laptop model, i.e. a Lenovo X1 Carbon 3rd gen, which has already been running kernel 6.11.8. I took that configuration as a starting point for my new laptop, the Lenovo X1 Carbon 12th gen, enabled the options for the new devices (USB4/Thunderbolt, NVMe, PCIE, different sound card, different Bluetooth adapter, touch input, ...) and disabled the options for the non-existing devices (i.e. different camera model, etc.)
I took the kernel config of my previous laptop as a starting point, because it was already tuned for an Intel-only, low-power system.

pietinger Moderator
Joined: 17 Oct 2006 Posts: 5149 Location: Bavaria
Posted: Mon Nov 18, 2024 1:37 am
nagmat84 wrote: | This is not a direct answer to the previous post, because I was busy trying different things, but I made some progress.
- gentoo-sources-6.11.8 with my custom configuration does not boot at all. This is the variant which this thread has been about originally and whose config is in the pastebin above.
- gentoo-kernel-6.6.58 (stable) with the configuration that is shipped with the ebuild produces the kernel null pointer error and the Oops which I wrote about in my previous posting. I don't get a graphical login, because sddm cannot find a usable display, but at least I get a console.
- gentoo-kernel-6.11.8 (testing) with the configuration that is shipped with the ebuild simply works. There are no errors in dmesg, no firmware problems and I get a graphical output.
[...]
So the interesting question is what is the crucial difference between gentoo-sources-6.11.8 with my custom configuration and gentoo-kernel-6.11.8 with the distribution-provided configuration that makes the first one unbootable and the latter one working smoothly? |
I see these main differences:
1. Our Gentoo dist-kernel does not have these command-line parameters:
Code: | CONFIG_CMDLINE="rd.vconsole.font=eurlatgr rd.vconsole.keymap=de-latin1 rd.locale.LANG=de_DE.UTF-8 init=/lib/systemd/systemd random.trust_cpu=on" |
(perhaps irrelevant)
2. Not these:
Code: | # CONFIG_X86_X2APIC is not set
# CONFIG_X86_INTEL_LPSS is not set
CONFIG_NR_CPUS=16 |
(perhaps irrelevant)
3. And our gentoo dist-kernel supports every file compression of the initramfs, while you have only:
Code: | CONFIG_INITRAMFS_SOURCE=""
# CONFIG_RD_GZIP is not set
# CONFIG_RD_BZIP2 is not set
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
CONFIG_RD_LZ4=y |
... so, are you sure your (used) initramfs is compressed with LZ4 (and not gzip) ?
BTW: I would disable CONFIG_SYSFB_SIMPLEFB=y and enable: # CONFIG_I2C_HID_ACPI is not set
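The comparison above was done by reading both configs side by side. A rough Python sketch of the same idea follows, assuming two plain-text .config files; the function names are made up for illustration, and options missing from a file are treated as "n", which glosses over options hidden by unmet dependencies.

```python
import re

# "CONFIG_FOO=value" lines and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name -> value; explicitly unset options map to "n"."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(working, broken):
    """Return sorted (option, working_value, broken_value) for all differences."""
    a, b = parse_config(working), parse_config(broken)
    result = []
    for name in sorted(a.keys() | b.keys()):
        value_a, value_b = a.get(name, "n"), b.get(name, "n")
        if value_a != value_b:
            result.append((name, value_a, value_b))
    return result
```

Feeding it the text of the dist-kernel config and the custom config yields one tuple per differing option. The kernel tree also ships scripts/diffconfig, which serves the same purpose and is probably the better tool for real use.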

dmpogo Advocate
Joined: 02 Sep 2004 Posts: 3439 Location: Canada
Posted: Mon Nov 18, 2024 5:14 am
nagmat84 wrote: | This is not a direct answer to the previous post, because I was busy trying different things, but I made some progress.
- gentoo-sources-6.11.8 with my custom configuration does not boot at all. This is the variant which this thread has been about originally and whose config is in the pastebin above.
- gentoo-kernel-6.6.58 (stable) with the configuration that is shipped with the ebuild produces the kernel null pointer error and the Oops which I wrote about in my previous posting. I don't get a graphical login, because sddm cannot find a usable display, but at least I get a console.
- gentoo-kernel-6.11.8 (testing) with the configuration that is shipped with the ebuild simply works. There are no errors in dmesg, no firmware problems and I get a graphical output.
|
I have the same machine, and it is still in the box. My understanding was that Intel Core Ultra support in 6.6 is still iffy. As far as I can see, Intel Meteor Lake graphics were declared stable only in 6.7, and work on them was still continuing in 6.9.

nagmat84 Apprentice
Joined: 27 Mar 2007 Posts: 272
Posted: Mon Nov 18, 2024 7:11 am
Yes, I am absolutely sure that the initramfs is compressed with LZ4. I manually unpacked all three initramfs images with cpio and LZ4 to check if and how they differ. (They don't.) And there should be an early kernel error if the kernel wasn't able to decompress the initramfs, shouldn't there?
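The compression check described above can also be done by looking at the first bytes of the image. The following Python sketch uses the well-known magic numbers of the common formats; the function names are made up for illustration. Two caveats, both stated as assumptions: an image with prepended early microcode starts with an uncompressed cpio archive, so the compressed part begins further in, and as far as I know the kernel's LZ4 decompressor expects the legacy frame format rather than the modern one.

```python
# (magic bytes, description) pairs for the usual initramfs formats.
MAGICS = [
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\x5d\x00", "lzma"),
    (b"\xfd\x37\x7a\x58\x5a\x00", "xz"),
    (b"\x89LZO", "lzo"),
    (b"\x02\x21\x4c\x18", "lz4 (legacy frame)"),
    (b"\x04\x22\x4d\x18", "lz4 (modern frame)"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"070701", "cpio (uncompressed newc)"),
]


def detect_compression(head):
    """Identify the format from the leading bytes of an image."""
    for magic, name in MAGICS:
        if head.startswith(magic):
            return name
    return "unknown"


def detect_file(path):
    """Read the first 16 bytes of a file and identify its format."""
    with open(path, "rb") as handle:
        return detect_compression(handle.read(16))
```

Calling `detect_file()` on the image that refind passes via initrd= shows what the kernel will meet first. A result of "cpio (uncompressed newc)" only means that an uncompressed archive comes first, not that the rest is uncompressed.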
I don't believe the embedded kernel command line is the problem. It's only for localization.
We have to wait until next weekend until I can do more tests. I don't have the laptop with me during the workweek.
One other difference between the distribution kernel and the custom kernel is the set of IIO-related config options (IIO = industrial input output?). What are those for? Can this be the culprit?
Last edited by nagmat84 on Mon Nov 18, 2024 4:58 pm; edited 1 time in total

pietinger Moderator
Joined: 17 Oct 2006 Posts: 5149 Location: Bavaria
Posted: Mon Nov 18, 2024 3:54 pm
nagmat84 wrote: | [...] One other difference between the distribution kernel and the custom kernel are the IIO-related config options (IIO = industrial input output?). For what are those? Can this be the culprit? |
I don't think so. But you can check it: boot with our dist-kernel and check the loaded modules with "lsmod". If they do not appear there, they are definitely not necessary.

nagmat84 Apprentice
Joined: 27 Mar 2007 Posts: 272
Posted: Tue Nov 19, 2024 9:57 pm
pietinger wrote: | BTW: I would disable CONFIG_SYSFB_SIMPLEFB=y and enable: # CONFIG_I2C_HID_ACPI is not set |
I will enable CONFIG_I2C_HID_ACPI as soon as I get back to the laptop. Your advice regarding CONFIG_SYSFB_SIMPLEFB puzzles me. But this should be discussed in a separate thread: https://forums.gentoo.org/viewtopic-t-1171808.html

nagmat84 Apprentice
Joined: 27 Mar 2007 Posts: 272
Posted: Sun Nov 24, 2024 6:41 pm
I am lost and have hit a dead end. I took the distribution-provided kernel configuration for 6.11.8 from /etc/portage/savedconfig, copied that configuration to the kernel sources for 6.12.1, ran "make oldconfig" and compiled gentoo-sources-6.12.1 myself. I ran "make all && make modules_install && make install" and everything was fine. The kernel booted. So rEFInd and my Dracut configuration are definitely fine, too.
Then, I took my custom kernel config and ran "make oldconfig". Then I incorporated all the tips from here (i.e. enabled CONFIG_I2C_HID_ACPI, disabled CONFIG_SYSFB_SIMPLEFB, etc.). Additionally, I inspected lspci, usb-devices and lsmod to see what else is loaded by the kernel with the distribution-provided configuration. I ran Code: | egrep --recursive --include='Makefile' -i -e 'CONFIG_.* <paste-module-name-here>\.o' . | to figure out which configuration options my custom kernel might be missing and enabled all of those, too. Additionally, I also built all firmware files into the kernel just as a safety measure. The result: I get an unbootable kernel which does not even print a single line on the console. WTF?!?!?
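The egrep lookup above maps one module name at a time to its config option. A small Python sketch of the same lookup follows, assuming the usual `obj-$(CONFIG_FOO) += name.o` lines in the kernel Makefiles; the function names are made up for illustration. It treats "_" and "-" as interchangeable, because lsmod prints underscores while object names often use hyphens. Like the egrep, it only finds the line that names the module object itself.

```python
import re
from pathlib import Path


def _pattern_for(module):
    """Build a regex for 'obj-$(CONFIG_X) += <module>.o' with -/_ folded."""
    parts = re.split(r"[-_]", module)
    stem = "[-_]".join(re.escape(part) for part in parts)
    return re.compile(
        r"obj-\$\((CONFIG_\w+)\)\s*[+:]?=.*(?<![\w-])" + stem + r"\.o(?!\w)"
    )


def options_in_makefile(text, module):
    """Return the sorted CONFIG_ options that build <module>.o in this text."""
    return sorted(set(_pattern_for(module).findall(text)))


def options_for_module(kernel_tree, module):
    """Search every Makefile below kernel_tree for the module's option."""
    found = set()
    for makefile in Path(kernel_tree).rglob("Makefile"):
        text = makefile.read_text(errors="ignore")
        found.update(options_in_makefile(text, module))
    return sorted(found)
```

Running `options_for_module()` over the kernel source directory for each name from lsmod gives a list of candidate options to compare against the custom .config.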
Here are some pastebins:
- Working self-compiled gentoo-sources-6.12.1 created from distribution-provided gentoo-kernel-6.11.8
- Non-booting custom, self-compiled gentoo-sources-6.12.1: Kernel config

pietinger Moderator
Joined: 17 Oct 2006 Posts: 5149 Location: Bavaria
Posted: Sun Nov 24, 2024 8:00 pm
These MUST be activated:
Code: | # CONFIG_X86_X2APIC is not set
# CONFIG_X86_INTEL_LPSS is not set |
(X86_INTEL_LPSS I already mentioned as very important ...)
To be on the safe side, I recommend (just do it):
Code: | 1.
# CONFIG_CPUSETS_V1 is not set
# CONFIG_EFI_HANDOVER_PROTOCOL is not set
2.
CONFIG_CMDLINE_BOOL=y
CONFIG_X86_INTEL_TSX_MODE_AUTO=y
CONFIG_X86_ACPI_CPUFREQ=y
3.
CONFIG_MODULE_SIG=y
4.
# CONFIG_AGP is not set
2.
CONFIG_DRM_XE=m
1.
# CONFIG_FB_VESA is not set |
1. Enable it
2. Disable it
3. Disable it - I want to be on the safe side (when everything works, we can switch back ONE option at a time)
4. Enable it, and then enable Intel
Do all changes in "make menuconfig" - NOT in an editor.

dmpogo Advocate
Joined: 02 Sep 2004 Posts: 3439 Location: Canada
Posted: Sun Nov 24, 2024 9:53 pm
pietinger wrote: | To be on a safe side, I recommend (just do it):
[...]
4. # CONFIG_AGP is not set
[...]
4. Enable it and enable then Intel |
This is what has interested me for a while: why enable AGP on a machine with no AGP bus - what is it needed for? I actually do have it enabled on my Intel laptop, but not on my Nvidia machines.

pietinger Moderator
Joined: 17 Oct 2006 Posts: 5149 Location: Bavaria
Posted: Mon Nov 25, 2024 12:31 am
dmpogo wrote: | This is what interested me for a while, why enabling AGP on the machine with no AGP bus - what is it needed for ? |
Yes, usually it is only necessary for (very) old systems ... but it would just be too embarrassing for me if I didn't list something and in the end it turned out to be necessary (I've been in IT too long and have experienced too many “impossible” things).