AECFXI n00b
Joined: 28 Nov 2018 Posts: 33
|
Posted: Thu Aug 19, 2021 12:22 pm Post subject: NVMe drive not booting kernel from (u)EFI |
|
|
I'm migrating my Gentoo installation from an older SATA-connected SSD to a newer NVME M.2 connected SSD. The NVMe root partition is an F2FS filesystem type. I am also migrating from GRUB2 BIOS to GRUB UEFI or rEFInd, or an EFI stub. The original intent was rEFInd specifically.
I have NVME enabled in the kernel, as well as F2FS, the EFI system partition and the steps for EFI stub support. My kernel is 5.10.52.
I have the new drive partitioned appropriately, with a FAT32 (appropriate per rEFInd) partition /dev/nvme0n1p1 that has the boot and efi partition labels enabled. I copied my gentoo install over from the old SSD onto /dev/nvme0n1p3 with a cp -a, and updated the fstab to reflect the new partitions.
Booting into the "SystemRescue" live environment, my new system root directory /dev/nvme0n1p3 is detected and I can enter into it from the live environment. Although KDE isn't working, it seems to be an otherwise functional gentoo system and I installed GRUB UEFI from within it.
I've been able to identify both rEFInd and the GRUB UEFI in my bios configuration and I've been able to enter both the rEFInd and grub's UEFI environments. They both correctly locate my kernel and I can navigate options, etc.
However neither rEFInd nor GRUB UEFI can boot the kernel and enter the /dev/nvme0n1p3 root directory. In GRUB UEFI it hangs at "Loading Linux 5.10.52-gentoo...". In rEFInd, it hangs after a nearly identical prompt including the text "using load options", and shows my /dev/nvme0n1p3 address as the intended root directory as well as the kernel image being used.
I am at a loss for options here. This is the behavior I would expect from my kernel if I didn't have NVME or F2FS support enabled, but support is enabled. Clearly there is no problem with the motherboard's bios detecting the UEFI and EFI boot information on the NVME drive and starting up those environments. Clearly the problem isn't a unique rEFInd or GRUB UEFI issue as both environments demonstrate identical behavior. And clearly my gentoo install on /dev/nvme0n1p3 is reasonably operational as I can enter a working gentoo system on that partition from the SystemRescue live system. F2FS shows every indication of being a reasonably supported filesystem, and this is exactly the use case I would expect for it: having the base system on a high speed NAND drive running F2FS.
Let me know what logs or other information I can upload to help get support.
Last edited by AECFXI on Fri Aug 20, 2021 12:55 am; edited 1 time in total |
|
|
|
Jaglover Watchman
Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
|
|
|
AECFXI n00b
Joined: 28 Nov 2018 Posts: 33
|
Posted: Thu Aug 19, 2021 2:25 pm Post subject: |
|
|
Jaglover wrote: | Quote: | However neither rEFInd nor GRUB UEFI can boot the kernel and enter the /dev/nvme0n1p3 root directory. |
Have you tried putting your kernel to ESP partition? |
My partition layout has /dev/nvme0n1p1 as the partition with the boot and EFI flags, and it is mounted at /boot. Beyond that there are just swap and the root filesystem.
The kernel is located in /boot/EFI/Gentoo for use in rEFInd and just plain /boot for GRUB. In rEFInd, I have a refind_linux.conf file that I wrote to replicate the one displayed in the rEFInd gentoo wiki. All working as expected with no errors being outputted, just not entering the kernel/root partition. |
|
|
|
Jaglover Watchman
Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
Posted: Thu Aug 19, 2021 2:41 pm Post subject: |
|
|
I wonder what you mean by "entering root". Bootloader locates the kernel and attempts to load it. From your problem description it seems it hangs while trying to load the kernel, therefore it seems to me like there is some trouble reading the kernel image. You sure this kernel is good? And the filesystem is not corrupted in ESP? Once the kernel is loaded it runs, initializes the hardware and finally attempts to access root, but yours never gets that far. _________________ My Gentoo installation notes.
Please learn how to denote units correctly! |
|
|
|
AECFXI n00b
Joined: 28 Nov 2018 Posts: 33
|
Posted: Thu Aug 19, 2021 3:07 pm Post subject: |
|
|
Jaglover wrote: | I wonder what you mean by "entering root". Bootloader locates the kernel and attempts to load it. From your problem description it seems it hangs while trying to load the kernel, therefore it seems to me like there is some trouble reading the kernel image. You sure this kernel is good? And the filesystem is not corrupted in ESP? Once the kernel is loaded it runs, initializes the hardware and finally attempts to access root, but yours never gets that far. |
Ah, ok, this helps me understand a little bit better the order of events that happen during this particular time in the boot process. The output during boot always flies by me too fast to read when everything is operating normally.
As this is simply a move of a working system from one drive to another, I am using the very same kernel that was running my Gentoo system on my previous SSD with an XFS root filesystem that ran fine. I used multiple methods to place the kernel in /boot/.., using cp to move kernel files off my old /boot partition into the rEFInd directory, and also having "make install" place fresh kernel images directly into the new /boot for use by UEFI. It's hard to conceive of the filesystem being corrupt when I just made it in fdisk minutes earlier. I will try some checking tools, but I will also probably nuke it for an ext2 boot partition to try a traditional grub bios boot approach.
The SystemRescue live environment identifies the root linux directory /dev/nvme0n1p3 and allows me to boot into it, but I wonder if it's just booting into that root directory with a rescue kernel? (EDIT: it is.)
Last edited by AECFXI on Fri Aug 20, 2021 1:11 am; edited 1 time in total |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4236 Location: Bavaria
|
Posted: Thu Aug 19, 2021 3:27 pm Post subject: NVMe drive not booting kernel and entering root from (u)EFI |
|
|
AECFXI wrote: | However neither rEFInd nor GRUB UEFI can boot the kernel and enter the /dev/nvme0n1p3 root directory. |
refind and grub2 don't need your root partition; they need access to your boot partition /dev/nvme0n1p1 ... this is a FAT32 ...
AECFXI wrote: | [...] In GRUB UEFI it hangs at "Loading Linux 5.10.52-gentoo...". In rEFInd, it hangs after a nearly identical prompt including the text "using load options", and shows my /dev/nvme0n1p3 address as the intended root directory as well as the kernel image being used.
[...] Clearly the problem isn't a unique rEFInd or GRUB UEFI issue as both environments demonstrate identical behavior. [...] |
Don't say this ...
I don't think it's the kernel, because if grub2 (or refind) were able to load (and start) the kernel AND IF there were a problem in the kernel, you would just get a "kernel panic". IMHO there is a problem with getting the kernel file - for both grub2 and refind ... maybe two wrong configurations ... (target filesystem!)? Here I can't help because I am using a stub kernel.
Last question: Do you use an initramfs for your kernel? |
|
|
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Thu Aug 19, 2021 4:48 pm Post subject: |
|
|
Please post your refind.conf and refind_linux.conf. IMO, placing kernels on FAT is a bad idea, but should be possible (they may have become corrupted).
You also need UEFI support in the kernels. I tripped up on this going from non-EFI to EFI.
On my system these files are:
/boot/efi/EFI/refind/refind.conf
and /boot/efi/EFI/refind/refind.conf
Use separate pastebins, please.
I'm thinking bad boot parameters.
EDIT:
I mount the EFI partition as /boot/efi
It's a good sign that refind attempts to load the kernel. That means it found it. |
|
|
|
AECFXI n00b
Joined: 28 Nov 2018 Posts: 33
|
Posted: Fri Aug 20, 2021 12:46 am Post subject: |
|
|
Re: corrupted vfat partition, fsck.vfat reveals:
- Differences between boot sector and its backup
- 'This is mostly harmless. Differences: (offset:original/backup)'
- copied original to backup (should have done backup to original oops)
- removed dirty bit
- used 'make install' to replace the kernel image in /boot
- Unchanged behavior
This appears consistent with how I'm unable to cleanly unmount the fat32 partition when the kernel fails to load, and forced to hard-reset the system.
I also used mkfs.vfat to create a fresh fat32 partition to redo the rEFInd boot method with no change in behavior.
I can understand the desire not to have the kernel on a vulnerable fat32 filesystem, but it was how I was instructed to create the EFI/UEFI partition and where I was instructed to put the kernel. It seems like rEFInd may support an ext4 (for example) /boot partition with an EFI partition mounted at /boot/EFI, and then rEFInd reads the kernel file written on /boot, but it's not clear from the wiki exactly how to achieve this, and I feel like I've reasonably ruled out FAT corruption as the cause of my present issues.
---
I re-attempted a direct EFI stub. I emptied the /boot directory except for the singular file EFI/Gentoo/bzImage-5.10.52.efi with EFI/Gentoo as the only folders.
I then installed an EFI entry with the command: efibootmgr -c -d /dev/nvme0n1 -p 1 -L "Gentoo" -l '\EFI\Gentoo\bzImage-5.10.52.efi'
The kernel configuration requirements were also fulfilled, and my built-in kernel command line is "root=/dev/nvme0n1p3".
When booting this way, my computer hangs on a bios flash screen I don't normally see with no kernel output.
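For anyone repeating the EFI stub attempt, here is a minimal sketch of how the loader path and the efibootmgr call fit together. The helper name esp_loader_path is made up for illustration; the disk, partition number, label and file name are the ones from this post. It only prints the command (a dry run), so nothing gets written to the firmware.

```shell
# Convert a path relative to the ESP root (forward slashes) into the
# backslash form passed to efibootmgr's -l (loader) option.
# esp_loader_path is a hypothetical helper, not part of efibootmgr.
esp_loader_path() {
    printf '%s\n' "$1" | tr '/' '\\'
}

loader=$(esp_loader_path /EFI/Gentoo/bzImage-5.10.52.efi)

# Dry run: print the command instead of executing it (the real call needs root).
printf 'efibootmgr -c -d /dev/nvme0n1 -p 1 -L Gentoo -l %s\n' "'$loader'"
```

After actually creating the entry, efibootmgr -v lists what the firmware stored, which is a quick way to confirm that the disk, partition and path ended up as intended.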
---
Re: pietinger:
I agree that it doesn't quite seem like a kernel problem in that I never get any output from the kernel, so it doesn't seem like the kernel actually gets executed because of this, but it should also be observed that GRUB UEFI and rEFInd both locate the kernel file and their final output reports the appropriate kernel version from the file I'd placed into the partition, linux 5.10.52, and GRUB/rEFInd's final report is that they are loading it.
---
Re: Tony0945
I would love for this to be a wrong boot parameters problem because that would be immediately fixable.
Atm I used 'make install' to drop the linux kernel into /boot which rEFInd is successfully locating and providing a boot option for. Moving the kernels and refind_linux.conf into EFI/Gentoo/ gives me a gentoo icon in the rEFInd menu but no other change in behavior.
/boot/refind_linux.conf (alongside my kernels)
https://pastebin.com/fpaTX9s3
/boot/EFI/refind/refind.conf
https://pastebin.com/0Vcrcmds
---
The last item of consideration didn't seem applicable at first, but given there is so little to work off here maybe it could be worth considering. I have a GA-X79-UD3 motherboard from 2014 that was not released with NVMe support, nor was an official update for NVMe support ever released. Instead, I'm running a community modded BIOS that has been patched to include NVMe support. It's been attested by other users as working, and I've so far been able to partition, format, read, write, and execute files on my NVMe drive. fdisk speed tests show 3 GB/s read/writes with the drive, which are the expected values. The EFI labeled partition on the NVMe drive populates within the BIOS menu as a boot option and these rEFInd and GRUB environments load up and work fine off my NVMe drive, it's just when I launch the kernel I get no activity and no output. This is my modded bios: https://www.win-raid.com/t2800f16-Bios-Driver-Modules-Versions-Question-For-GA-X-UD-Rev-F-modded.html#msg89146
---
Going to blow up the /boot partition again and try for a GRUB BIOS, will report my findings. |
|
|
|
AECFXI n00b
Joined: 28 Nov 2018 Posts: 33
|
Posted: Fri Aug 20, 2021 1:14 am Post subject: |
|
|
Also, I am not using an initramfs setup. My perception was that it was for a specific use case involving split system partitions which I don't have, but I can try setting up an initramfs if it would potentially give me a working system. |
|
|
|
AECFXI n00b
Joined: 28 Nov 2018 Posts: 33
|
Posted: Fri Aug 20, 2021 1:30 am Post subject: |
|
|
I GOT IT WORKING!
There was ONE SINGULAR CHANGE to my kernel that was required that made rEFInd (my most recent re-attempt at a bootloader) work perfectly.
Device Drivers -> Graphics Support -> Frame Buffer Devices ->
<*> Support for frame buffer devices --->
[*] EFI-based Framebuffer Support
CONFIG_FB_EFI was unset, and setting it made everything work in an instant. I believe this kernel configuration option should be added to the EFI System Partition gentoo wiki page as a kernel configuration requirement; I will make an edit later this evening. This configuration option is referenced in this thread: https://forums.gentoo.org/viewtopic-t-1096052-start-0.html but the thread's discussion and resolution was a little bit difficult for me to parse on my first read through.
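A quick way to catch this kind of thing before rebooting is to grep the kernel .config for the options this thread touched on. This is only a rough sketch: check_kconfig is a made-up helper, the /usr/src/linux/.config path is an assumption about where the sources live, and the option list is just the set discussed here, not an official requirements list.

```shell
# Report whether each named option is built in (=y) in a kernel .config file.
# Usage: check_kconfig FILE OPTION...
# Options set to =m or left unset are both reported as "missing", because
# without an initramfs they are not available at boot time.
check_kconfig() {
    config_file=$1
    shift
    for opt in "$@"; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "${opt}=y"
        else
            echo "${opt} missing"
        fi
    done
}

# Example call (path is an assumption; adjust to your kernel source tree):
# check_kconfig /usr/src/linux/.config CONFIG_EFI CONFIG_EFI_STUB \
#     CONFIG_FB CONFIG_FB_EFI CONFIG_BLK_DEV_NVME CONFIG_F2FS_FS
```

In my case the CONFIG_FB_EFI line would have come back as "missing" while everything else reported =y.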
Last edited by AECFXI on Fri Aug 20, 2021 1:46 am; edited 1 time in total |
|
|
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Fri Aug 20, 2021 1:30 am Post subject: |
|
|
First, why are you mounting root as read only?
Code: | "Default" "root=/dev/nvme0n1p3 rootfstype=f2fs ro" |
my refind_linux.conf
Code: | "Boot by PARTUUID" "root=PARTUUID=54fee329-ff75-4879-bdbb-93268b470f32 vga=0x365 net.ifnames=0 mitigations=off acpi_enforce_resources=lax "
"Boot by DEV NAME" "root=/dev/sda2 vga=0x365 net.ifnames=0 "
"Boot by CD/DVD " "root=/dev/sr0 vga=0x365 net.ifnames=0 " | You may not want all the options.
In refind.conf I have uncommented "fold_linux_kernels false" so that every kernel is shown, but you only have one anyway.
BTW, the default is still to boot the newest, which will be the leftmost on the screen.
So that looks good except for the "ro" |
|
|
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Fri Aug 20, 2021 1:36 am Post subject: |
|
|
Ah yes, the frame buffer! No, I don't think it's in any instructions. I just enabled everything EFI. Didn't think about that.
I did think "maybe the vga="
Also consider PARTUUID as in my example. But if there is only one bootable device, it is moot.
Another tip: If you want to make another kernel the default just use "touch". The actual build time isn't used, rather the file's timestamp (its modification time), which touch updates.
Welcome to Gentoo and refind! |
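The touch tip can be tried out safely in a scratch directory. This is just a sketch of the idea: newest_kernel is a made-up helper, the 5.13.0 file name is a hypothetical second kernel, and it assumes the boot manager orders kernels by file modification time in the same way ls -t does.

```shell
# Scratch directory standing in for the folder that holds the kernels.
demo_dir=$(mktemp -d)
touch -t 202108011200 "$demo_dir/vmlinuz-5.10.52-gentoo"
touch -t 202108101200 "$demo_dir/vmlinuz-5.13.0-gentoo"   # hypothetical newer kernel

# Print the file with the most recent modification time.
newest_kernel() {
    ls -t "$1" | head -n 1
}

newest_kernel "$demo_dir"                    # prints vmlinuz-5.13.0-gentoo
touch "$demo_dir/vmlinuz-5.10.52-gentoo"     # bump its timestamp to "now"
newest_kernel "$demo_dir"                    # prints vmlinuz-5.10.52-gentoo
```

On the real system the same touch would be run against the kernel file on the boot partition.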
|
|
|
AECFXI n00b
Joined: 28 Nov 2018 Posts: 33
|
Posted: Fri Aug 20, 2021 1:39 am Post subject: |
|
|
Tony0945 wrote: | First, why are you mounting root as read only? |
This was the sample configuration given at: https://wiki.gentoo.org/wiki/Refind#Linux_command_line_options
I couldn't use the refind-install or mkrlconf because those populated with information from the livecd I was using to enter my system, and it was rather unclear how to edit it for the NVME drive environment. So I had to write refind_linux.conf manually with that note in the wiki as my only reference.
If you didn't catch my comment above while you were writing a reply, I solved the problem: it was, of all things, a framebuffer issue and needed CONFIG_FB_EFI set in the kernel to work correctly. |
|
|
|
Jaglover Watchman
Joined: 29 May 2005 Posts: 8291 Location: Saint Amant, Acadiana
|
|
|
|
AECFXI n00b
Joined: 28 Nov 2018 Posts: 33
|
Posted: Fri Aug 20, 2021 2:14 am Post subject: |
|
|
Tony0945 wrote: | Ah yes, the frame buffer! No, I don't think it's in any instructions. I just enabled everything EFI. Didn't think about that.
I did think "maybe the vga="
Also consider PARTUUID as in my example. But if there is only one bootable device, it is moot.
Another tip: If you want to make another kernel the default just use "touch". The actual build time isn't used, rather the file's timestamp (its modification time), which touch updates.
Welcome to Gentoo and refind! |
Haha, thank you - I've actually been using Gentoo since 2006 and usually I can solve problems on my own, but I guess my brain could use a few extra logs on the fire this week. Definitely new to club EFI/UEFI though. This system in particular has been maintained continuously since 2014, which could perhaps be why the EFI FB ended up unset...
Thanks for the other feedback. I wasn't doing PARTUUID because I was stuck in a raw console session without copy paste and it was like my fifth time going through the steps to create a bootloader, so I was keeping it simple, haha. You taught me something new about what kernel image gets picked re: touch!
Jaglover wrote: | So your kernel was booting, but you were unable to see it and thought it hung. |
It explains everything!! |
|
|
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Fri Aug 20, 2021 2:23 am Post subject: |
|
|
2006. You've been around about as long as me. I was fooled by the "noob".
As I think about it, there's no reason for the kernel to be rw. The filesystem must be rw.
I've always made them rw.
Regarding PARTUUID. I had hoped that LABEL would work, but label works in /etc/fstab but not on kernel command line.
Yeah, our paths crossed while posting.
I spent three hours at the eye doctor today wearing a mask. My lungs and brain have not quite recovered. |
|
|
|
DONAHUE Watchman
Joined: 09 Dec 2006 Posts: 7651 Location: Goose Creek SC
|
Posted: Fri Aug 20, 2021 2:27 pm Post subject: |
|
|
kernel cmdline can use PARTLABEL or PARTUUID _________________ Defund the FCC. |
|
|
|
wjb l33t
Joined: 10 Jul 2005 Posts: 610 Location: Fife, Scotland
|
Posted: Fri Aug 20, 2021 11:30 pm Post subject: |
|
|
Maybe confirm what EFI thinks it has Code: |
efibootmgr --verbose |
Also, last year I cloned the old disk to an M2 and failed to disconnect the old disk. It was 3 weeks before I realised the old and new disks had the same ID as far as EFI was concerned, and it was very confused. |
|
|
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Sun Aug 22, 2021 1:23 pm Post subject: |
|
|
DONAHUE wrote: | kernel cmdline can use PARTLABEL or PARTUUID |
Code: | ~ # lsblk -o name,label,partlabel,mountpoint,size,uuid
NAME LABEL PARTLABEL MOUNTPOINT SIZE UUID
sda 465.8G
├─sda1 CT500MX_EFI EFI System /boot/efi 100M DDE0-6A03
└─sda2 CT500MX_PART2 Linux filesystem / 465.7G 677edc9c-7acb-4b5c-be64-4fd1c174cf24
sdb 1.8T
└─sdb1 SAGE_VIDEO /video 1.8T ac0b2bb8-4f93-4914-9cb0-82fa3db38bec
sdc 931.5G
├─sdc1 P1-1TB 232.8G 7a98bc6d-6346-4eef-96a1-04f2047df251
├─sdc2 P2-1TB 232.8G 7407dda3-d1eb-4c0a-94b1-a989f3139b89
└─sdc3 P3-1TB /home/tony/.VirtualBox 465.8G 8603e479-3f63-40de-9430-19223f7ab050
sdd 14.9G
└─sdd1 USB16CINDY 14.9G E3A7-1DBA
sr0 1024M
| So the PARTLABEL is two words - Linux filesystem. Might it work with quotes? |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4236 Location: Bavaria
|
Posted: Sun Aug 22, 2021 6:46 pm Post subject: |
|
|
Tony0945 wrote: | So the PARTLABEL is two words - Linux filesystem. Might it work with quotes? |
No, this does not work. Do you have GPT disks? Maybe not; and maybe this is the difference.
I have GPT and this is the output from my old system (but with SecureBoot and a linux stub kernel):
Code: | # lsblk -o name,fstype,label,partlabel,parttypename,mountpoint,size,uuid,partuuid
NAME FSTYPE LABEL PARTLABEL PARTTYPENAME MOUNTPOINT SIZE UUID PARTUUID
sda 465,8G
├─sda1 ext4 home2 BIOS boot 2M cc77ce37-d385-4c2c-a637-5d67faab769f ed670e23-b0e4-4517-8c06-86ad31e7323c
├─sda2 vfat boot EFI System 1021M C174-0599 e6cccc9a-40e0-4b59-b6d2-082596500077
└─sda3 ext4 root Linux filesystem / 464,8G c75f64b1-a1b1-4527-b996-4b4b9d24456c 99beb5b2-b529-40fb-b0bc-3250a5237491
sdb 1,8T
├─sdb1 swap swap Linux filesystem [SWAP] 15,6G 6e3c07dd-993f-4e44-aef5-e0c3a1a8a4b9 46c90106-e5b8-4d04-8804-17f08ea6065b
├─sdb2 ext4 portage Linux filesystem /usr/portage 62,5G 217903dd-1a62-488d-a6ca-c19551e42116 d0e3ea4f-1ffe-45f4-936e-c2f717915edf
└─sdb3 ext4 hd Linux filesystem /hd 1,7T 6c49fb72-923e-4ad2-a20a-f8354042b1cd 94275610-b7be-4e62-828c-c2f96d236635 |
|
|
|
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Sun Aug 22, 2021 10:59 pm Post subject: |
|
|
Yes, it is a gpt disk. My understanding is that you can't have an EFI booting system without gpt. |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4236 Location: Bavaria
|
Posted: Mon Aug 23, 2021 12:18 am Post subject: |
|
|
Tony0945 wrote: | [...] my understanding is that you can't have an EFI booting system without gpt. |
Yes, this is absolutely true.
Tony0945 wrote: | Yes, it is a gpt disk. |
I asked about GPT because I don't understand why your "lsblk" output shows wrong information: under "partlabel" you in reality got the partition type (parttypename), and I think your LABELs are in reality your PARTLABELs. Again, I don't understand your lsblk output, so I pasted mine for you. PARTLABEL is never two-worded. |
|
|
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Mon Aug 23, 2021 12:57 am Post subject: |
|
|
Code: | ~ $ equery b lsblk
* Searching for lsblk ...
sys-apps/util-linux-2.36.2-r1 (/bin/lsblk)
sys-apps/util-linux-2.36.2-r1 (/usr/share/bash-completion/completions/lsblk)
~ $ emerge -pv util-linux
These are the packages that would be merged, in order:
Calculating dependencies... done!
[ebuild R ] sys-apps/util-linux-2.36.2-r1::gentoo USE="cramfs logger ncurses readline (split-usr) suid tty-helpers (unicode) -audit -build -caps -cryptsetup -fdformat -hardlink -kill -magic -nls -pam -python (-selinux) -slang -static-libs -su -systemd -test -udev" ABI_X86="32 (64) (-x32)" PYTHON_TARGETS="python3_8 python3_9" 0 KiB
| Do I need different flags? This should be the latest stable; I synced and built today. |
|
|
|
Goverp Advocate
Joined: 07 Mar 2007 Posts: 2014
|
Posted: Mon Aug 23, 2021 7:55 am Post subject: |
|
|
pietinger wrote: | ...PARTLABEL is never two-worded. |
He means the value of the PARTLABEL field, such as "EFI system partition" - taken from my system.
The kernel documentation for command line parameters says:
Code: | param="spaces in here" |
is how to handle spaces _________________ Greybeard |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4236 Location: Bavaria
|
Posted: Mon Aug 23, 2021 9:34 am Post subject: |
|
|
Thank you very much, Goverp ... but this is not what I meant ...
Goverp wrote: | pietinger wrote: | ...PARTLABEL is never two-worded. |
He means the value of the PARTLABEL field, such as "EFI system partition" - taken from my system. |
... I don't believe that "Linux filesystem" is the value of his PARTLABEL; I think it is the value of his PARTTYPENAME (so he can't use it for the kernel command line "root=PARTLABEL=..."). |
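To sidestep the label question entirely, the root= argument can be built from the PARTUUID instead. The following is only a sketch: refind_entry is a made-up helper, the PARTUUID is the sample value from Tony0945's refind_linux.conf earlier in the thread, and "rootfstype=f2fs ro" is taken from AECFXI's entry.

```shell
# Print one refind_linux.conf line: a menu label plus the kernel options.
# $1 = menu label, $2 = PARTUUID of the root partition, $3 = extra options
# refind_entry is a hypothetical helper, not part of rEFInd.
refind_entry() {
    printf '"%s" "root=PARTUUID=%s %s"\n' "$1" "$2" "$3"
}

# On the real machine the value would come from blkid (needs root), e.g.:
#   partuuid=$(blkid -s PARTUUID -o value /dev/nvme0n1p3)
# and "lsblk -o name,partlabel,parttypename,partuuid" shows PARTLABEL and
# PARTTYPENAME side by side, which settles which column is which.
partuuid=54fee329-ff75-4879-bdbb-93268b470f32   # sample value from this thread

refind_entry "Boot by PARTUUID" "$partuuid" "rootfstype=f2fs ro"
```

The printed line can be pasted into refind_linux.conf next to the kernel; PARTUUID values contain no spaces, so no extra quoting is needed.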
|
|
|
|