jfo n00b
Joined: 07 Sep 2015 Posts: 60
|
Posted: Fri Mar 29, 2024 9:25 pm Post subject: [solved] DELL Precision 5820/NVIDIA can't boot EFI stub kernel |
|
|
Hello -
I have been trying to install Gentoo on a brand new DELL Precision 5820. So far all my attempts have only led to frustration.
I have years of experience with Gentoo. I own another machine (ASRock X99 WS MB + NVIDIA RTX 3060) and I never experienced
issues with UEFI and booting on it (the firmware is from 2013!). I am hoping someone can help me figure out what might be wrong
or come up with helpful suggestions, because I am at my wits' end.
I want to build a kernel that does not rely on an initrd. My understanding is that a ramdisk should not be needed as long as all the drivers
required for booting (e.g. block devices, filesystems, etc.) are compiled into the kernel.
No matter what I do, the system hangs immediately and I never see any output. The screen is black. There seems to be a brief burst of disk activity,
but I have no way of knowing for sure whether the boot has started. Initially I suspected a framebuffer issue... but with basically the same configuration
I have no issue on my older machine.
Here is a summary of what I did:
- Disabled Secure Boot and CSM
- Read and carefully followed the recommended settings in the Gentoo Handbook, as well as those for the EFI stub and nvidia-drivers.
- Made sure that only the EFI framebuffer and the simple framebuffer are enabled
- The NVIDIA card in that machine is identified as TU117 (Turing). I emerged nvidia-drivers-470.x
- All block device drivers, filesystems, etc. are compiled in
- Intel microcode is compiled in
- Checked and rechecked /etc/fstab, and I believe it is correct.
/ is on /dev/nvme0n1p4 and the EFI system partition (ESP) is on /dev/nvme0n1p1.
Running mount -a in the chroot environment returns no error.
- The ESP has the correct type (correctly seen by gparted)
- I used efibootmgr to make a boot entry:
Code: |
efibootmgr -c -d /dev/nvme0n1p1 -p 1 -L "Gentoo" -l "\efi\Gentoo\vmlinuz-6.8.1-gentoo.efi" -u "root=/dev/nvme0n1p4 ro"
|
- I verified that the entry is correct using
- When I tried to boot, I got a blank screen and a two-line error message from the EFI firmware. The message said: [Firmware Bug] ... (I forgot the exact wording)
- After a bit of research I found this: https://bbs.archlinux.org/viewtopic.php?pid=1753169#p1753169
Quote: |
The UEFI implementation of the Dell XPS models is basically broken; any arguments you set on a bootentry (efibootmgr's -u/--unicode) will be ignored.
As a result you will have a boot entry for it but no data will ever be passed to the loader so it will never know your initrd or any other settings.
|
The quote is from 2017 (!), but it seems that this is still true. It is hard to believe that this has not been fixed in 2024. Given how common DELL systems are,
it is also surprising that nothing in the Gentoo documentation or wiki warns people about this specific issue with DELL.
- I saw that in recent kernels it is possible to compile in kernel options; maybe this was introduced specifically to deal with broken UEFI implementations.
So I compiled in a built-in command line as follows (I tried both the device name, i.e. /dev/nvme0n1p4, and the corresponding PARTUUID; it made no difference):
Code: |
[*] Built-in kernel command line
(root=PARTUUID=fa240661-0993-4186-8e98-a6e5273455dd ro)
|
and enabled
Code: |
[*] Built-in command line overrides boot loader arguments
|
My understanding is that when this is compiled in, the argument string -u "root=/dev/nvme0n1p4 ro" no longer needs to be passed to efibootmgr.
I verified on my old machine that if the kernel options are compiled in, things work as expected, i.e. the -u argument is no longer needed.
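One way to double-check that the built-in command line really ended up in the configuration is to grep the .config before rebuilding. The following is only a minimal sketch: the function name check_builtin_cmdline and the default path /usr/src/linux/.config are assumptions for illustration, not something from the original post.

```shell
# Hypothetical helper: report whether a kernel .config carries a built-in
# command line. The default path below is an assumption; pass your own.
# Usage: check_builtin_cmdline [path-to-.config]
check_builtin_cmdline() {
    config="${1:-/usr/src/linux/.config}"

    # CONFIG_CMDLINE_BOOL=y switches the built-in command line on.
    if ! grep -q '^CONFIG_CMDLINE_BOOL=y$' "$config"; then
        echo "CONFIG_CMDLINE_BOOL is not enabled"
        return 1
    fi

    # Print the string itself so the root= value can be inspected.
    grep '^CONFIG_CMDLINE=' "$config" || {
        echo "CONFIG_CMDLINE is missing"
        return 1
    }

    # With CONFIG_CMDLINE_OVERRIDE=y the kernel ignores boot loader arguments;
    # without it, boot loader arguments are appended to the built-in string.
    if grep -q '^CONFIG_CMDLINE_OVERRIDE=y$' "$config"; then
        echo "boot loader arguments are ignored (override enabled)"
    else
        echo "boot loader arguments are appended to the built-in string"
    fi
}
```

Run against a configuration with both options set, such as the excerpt further down in this post, it should print the CONFIG_CMDLINE line followed by the "override enabled" message.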
With the new DELL, the UEFI error message is gone, but I get a completely black screen.
- I tried using rEFInd, and the results are the same. In any event, it is not clear to me what mechanism rEFInd uses to pass kernel options,
i.e. is it any different from using -u with efibootmgr? (If not, the options would also be ignored.)
I tried rEFInd with and without compiled-in options and there is no difference.
Just in case I missed something, here are some relevant pieces of my kernel .config:
Code: |
CONFIG_X86_16BIT=y
CONFIG_X86_ESPFIX64=y
CONFIG_X86_VSYSCALL_EMULATION=y
CONFIG_X86_IOPL_IOPERM=y
CONFIG_MICROCODE=y
# CONFIG_MICROCODE_LATE_LOADING is not set
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y
CONFIG_X86_5LEVEL=y
CONFIG_X86_DIRECT_GBPAGES=y
# CONFIG_X86_CPA_STATISTICS is not set
# CONFIG_AMD_MEM_ENCRYPT is not set
CONFIG_NUMA=y
CONFIG_AMD_NUMA=y
CONFIG_X86_64_ACPI_NUMA=y
# CONFIG_NUMA_EMU is not set
CONFIG_NODES_SHIFT=6
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_PROC_KCORE_TEXT=y
CONFIG_ILLEGAL_POINTER_VALUE=0xdead000000000000
# CONFIG_X86_PMEM_LEGACY is not set
CONFIG_X86_CHECK_BIOS_CORRUPTION=y
CONFIG_X86_BOOTPARAM_MEMORY_CORRUPTION_CHECK=y
CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1
CONFIG_X86_PAT=y
CONFIG_ARCH_USES_PG_UNCACHED=y
CONFIG_X86_UMIP=y
CONFIG_CC_HAS_IBT=y
CONFIG_X86_CET=y
CONFIG_X86_KERNEL_IBT=y
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS=y
CONFIG_X86_INTEL_TSX_MODE_OFF=y
# CONFIG_X86_INTEL_TSX_MODE_ON is not set
# CONFIG_X86_INTEL_TSX_MODE_AUTO is not set
# CONFIG_X86_USER_SHADOW_STACK is not set
CONFIG_EFI=y
CONFIG_EFI_STUB=y
# CONFIG_EFI_HANDOVER_PROTOCOL is not set
# CONFIG_EFI_MIXED is not set
# CONFIG_EFI_FAKE_MEMMAP is not set
CONFIG_EFI_RUNTIME_MAP=y
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_ARCH_SUPPORTS_KEXEC=y
CONFIG_ARCH_SUPPORTS_KEXEC_FILE=y
CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY=y
CONFIG_ARCH_SUPPORTS_KEXEC_SIG=y
CONFIG_ARCH_SUPPORTS_KEXEC_SIG_FORCE=y
CONFIG_ARCH_SUPPORTS_KEXEC_BZIMAGE_VERIFY_SIG=y
CONFIG_ARCH_SUPPORTS_KEXEC_JUMP=y
CONFIG_ARCH_SUPPORTS_CRASH_DUMP=y
CONFIG_ARCH_SUPPORTS_CRASH_HOTPLUG=y
CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_RANDOMIZE_BASE=y
CONFIG_X86_NEED_RELOCS=y
CONFIG_PHYSICAL_ALIGN=0x200000
CONFIG_DYNAMIC_MEMORY_LAYOUT=y
CONFIG_RANDOMIZE_MEMORY=y
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0
# CONFIG_ADDRESS_MASKING is not set
CONFIG_HOTPLUG_CPU=y
# CONFIG_COMPAT_VDSO is not set
CONFIG_LEGACY_VSYSCALL_XONLY=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set
CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE="root=PARTUUID=fa240661-0993-4186-8e98-a6e5273455dd ro"
CONFIG_CMDLINE_OVERRIDE=y
CONFIG_MODIFY_LDT_SYSCALL=y
# CONFIG_STRICT_SIGALTSTACK_SIZE is not set
CONFIG_HAVE_LIVEPATCH=y
# end of Processor type and features
#
# Frame buffer Devices
#
CONFIG_FB=y
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ARC is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
# CONFIG_FB_UVESA is not set
# CONFIG_FB_VESA is not set
CONFIG_FB_EFI=y
# CONFIG_FB_N411 is not set
# CONFIG_FB_HGA is not set
# CONFIG_FB_OPENCORES is not set
# CONFIG_FB_S1D13XXX is not set
# CONFIG_FB_NVIDIA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I740 is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_S3 is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VT8623 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_ARK is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_CARMINE is not set
# CONFIG_FB_SMSCUFX is not set
# CONFIG_FB_IBM_GXT4500 is not set
# CONFIG_FB_VIRTUAL is not set
# CONFIG_FB_METRONOME is not set
# CONFIG_FB_MB862XX is not set
CONFIG_FB_SIMPLE=y
# CONFIG_FB_SM712 is not set
CONFIG_FB_CORE=y
CONFIG_FB_NOTIFY=y
# CONFIG_FIRMWARE_EDID is not set
# CONFIG_FB_DEVICE is not set
CONFIG_FB_CFB_FILLRECT=y
CONFIG_FB_CFB_COPYAREA=y
CONFIG_FB_CFB_IMAGEBLIT=y
CONFIG_FB_SYS_FILLRECT=y
CONFIG_FB_SYS_COPYAREA=y
CONFIG_FB_SYS_IMAGEBLIT=y
# CONFIG_FB_FOREIGN_ENDIAN is not set
CONFIG_FB_SYSMEM_FOPS=y
CONFIG_FB_DEFERRED_IO=y
CONFIG_FB_IOMEM_FOPS=y
CONFIG_FB_IOMEM_HELPERS=y
CONFIG_FB_SYSMEM_HELPERS=y
CONFIG_FB_SYSMEM_HELPERS_DEFERRED=y
# CONFIG_FB_MODE_HELPERS is not set
# CONFIG_FB_TILEBLITTING is not set
# end of Frame buffer Devices
|
Last edited by jfo on Tue Apr 02, 2024 1:24 am; edited 1 time in total |
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5118 Location: Bavaria
|
Posted: Fri Mar 29, 2024 10:45 pm Post subject: |
|
|
P.S.:
jfo wrote: | [...] My understanding is that when this is compiled in, the argument string -u "root=/dev/nvme0n1p4 ro" is no longer needed by efibootmgr.
I verified on my old machine that if the kernel options are compiled in, things work as expected i.e. the -u argument is no longer needed.
With the new DELL, the UEFI error message is gone - but I get a completely black screen.
- I tried using rEFInd, and the results are the same. In any event, it is not clear to me what mechanism reFIND uses to pass kernel options.
i.e. is it any different than using -u with efibootmgr ? (in which case the options would also be ignored)
I tried rEFInd with and without compiled options and there is no difference. |
If you used a wrong root= kernel command line parameter (and there were no other problems), you would get a kernel panic ... so I don't think it is the command line parameter. I guess it is a kernel configuration problem (now, after resolving the DELL parameter problem). _________________ https://wiki.gentoo.org/wiki/User:Pietinger |
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5118 Location: Bavaria
|
Posted: Sat Mar 30, 2024 11:54 am Post subject: |
|
|
jfo wrote: | (1) CONFIG_FRAMEBUFFER_CONSOLE is not set. |
This is the cause of your black screen.
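To make this easier to catch next time: CONFIG_FB_EFI only provides the framebuffer device, while CONFIG_FRAMEBUFFER_CONSOLE is what draws the text console on top of it. A minimal sketch of a pre-reboot check follows; the function name check_fbcon and the default .config path are assumptions for illustration, and the option list covers only what is discussed in this thread.

```shell
# Hypothetical helper: list the framebuffer options from this thread that
# are not built into the kernel. The default path below is an assumption.
# Usage: check_fbcon [path-to-.config]
check_fbcon() {
    config="${1:-/usr/src/linux/.config}"
    missing=0

    for opt in CONFIG_FB_EFI CONFIG_FRAMEBUFFER_CONSOLE; do
        # Only "=y" counts here: without an initrd the option must be built in.
        if grep -q "^${opt}=y\$" "$config"; then
            echo "ok:      $opt"
        else
            echo "missing: $opt"
            missing=1
        fi
    done

    # Non-zero exit status if any option is missing.
    return "$missing"
}
```

Because the exit status is non-zero when an option is missing, the function can gate an install step, e.g. check_fbcon /usr/src/linux/.config && make install.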
You have an Intel CPU (CONFIG_MCORE2=y), and I noticed these settings in your .config:
Code: |
# CONFIG_X86_CPU_RESCTRL is not set
# CONFIG_X86_INTEL_LPSS is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
# CONFIG_INTEL_IDLE is not set
# CONFIG_TRANSPARENT_HUGEPAGE is not set
# CONFIG_LRU_GEN is not set
# CONFIG_USB_UAS is not set
# CONFIG_INTEL_IOMMU_SVM is not set
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
# CONFIG_IRQ_REMAP is not set
|
Maybe you want to read my complete article: https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_Configuring_Kernel_Version_6.6 |
|
jfo n00b
Joined: 07 Sep 2015 Posts: 60
|
Posted: Sat Mar 30, 2024 1:37 pm Post subject: |
|
|
Thank you again. I will certainly take a look at your article.
Unfortunately I do not have access to the computer at the moment. I will report back later to confirm whether or not
CONFIG_FRAMEBUFFER_CONSOLE was the culprit, although it does look like I missed this essential setting.
As for the rest of the configuration, there was no attempt to optimize anything. My goal was to get the machine to boot first.
I certainly wasted far too much time researching the quirks of the DELL UEFI.
In the meantime I found that this reference provides a clear and useful explanation of the distinction between the framebuffer console and the framebuffer driver:
https://docs.kernel.org/fb/fbcon.html
Last edited by jfo on Sat Mar 30, 2024 4:15 pm; edited 1 time in total |
|
jfo n00b
Joined: 07 Sep 2015 Posts: 60
|
Posted: Tue Apr 02, 2024 1:23 am Post subject: |
|
|
pietinger:
I confirm that enabling console framebuffer did the trick. Thank you again. |
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5118 Location: Bavaria
|
Posted: Tue Apr 02, 2024 9:26 am Post subject: |
|
|
jfo wrote: | I confirm that enabling console framebuffer did the trick. Thank you again. |
Thank you for the feedback, and you are very welcome!
Have fun with Gentoo! |
|