Gentoo Forums
[SOLVED] NVMe Drives Not Detected In initramfs
E. Vertin
n00b
n00b


Joined: 29 Jul 2024
Posts: 7
Location: Hong Kong SAR

Posted: Tue Jul 30, 2024 5:46 am    Post subject: [SOLVED] NVMe Drives Not Detected In initramfs

I have a desktop PC that uses SATA SSDs, and I installed Gentoo Linux on one of them. The kernel and initramfs generated by genkernel work well without tinkering. The setup is btrfs as root on a SATA SSD.
On my laptop, an ASUS ProArt Studiobook 2023 (in case you have tips about its UEFI settings), the NVMe drives are not detected in the initramfs generated by genkernel, while dracut's initramfs works fine. I googled around but found no similar problems. The setup is btrfs as root on an NVMe SSD.

Firstly, I compiled the Linux 6.10.2 kernel via
Code:

genkernel all --menuconfig

and I left the NVMe support section at Gentoo's defaults, that is
Quote:

│ │ <M> NVM Express block device
│ │ [*] NVMe multipath support
│ │ [ ] NVMe verbose error reporting
│ │ [*] NVMe hardware monitoring
│ │ <M> NVM Express over Fabrics RDMA host driver
│ │ <M> NVM Express over Fabrics FC host driver
│ │ <M> NVM Express over Fabrics TCP host driver
│ │ [ ] NVM Express over Fabrics In-Band Authentication
│ │ <M> NVMe Target support
│ │ [*] NVMe Target Passthrough support
│ │ <M> NVMe loopback device support
│ │ <M> NVMe over Fabrics RDMA target support
│ │ <M> NVMe over Fabrics FC target driver
│ │ <M> NVMe over Fabrics FC Transport Loopback Test driver
│ │ <M> NVMe over Fabrics TCP target support
│ │ [ ] NVMe over Fabrics In-band Authentication support


(The settings above come from my desktop PC, which runs Linux 6.6.38. I first copied its config to my laptop to carry over my general settings, which did not touch this section.)

I believe NVMe support was built as modules. In the initramfs the btrfs module was loaded and udev started as well, but no NVMe block devices showed up when I ran
Code:

ls dev

to check.

So I booted with the dracut initramfs to use my laptop and ran
Code:

lsinitrd /boot/initramfs-<my initramfs>

to check whether the nvme modules are there, and modules named "nvme" and "nvme-core" do exist in
Quote:

lib/modules/<version>-gentoo-x86_64/kernel/drivers/nvme/host/

as .ko files.

Then I went back to the initramfs created by genkernel, got to the rescue shell and ran
Code:

lsmod | grep nvme

to check whether the modules were loaded, and obviously they were not.

To load modules in the rescue shell, I ran
Code:

modprobe nvme nvme-core

and checked the modules again: loaded.

Still, there are no NVMe block devices in /dev. When I plug in my USB drive, /dev/sda and /dev/sda1 are added to the device list. What should I do to get the NVMe block devices detected in the initramfs? :cry:


Secondly, I compiled the kernel with NVMe block device support built in, that is
Quote:

│ │ <*> NVM Express block device


When I boot into the initramfs, the NVMe block devices are still missing. "nvme" is not in the module list, and I can't load it as a module since it is built in.


What should I do to make it work? If you need more information, please let me know. :oops:
Any response will be appreciated. :)

---------- [Edited 01/08/2024] ----------
Quote:

Without CONFIG_VMD=y the kernel cannot access its NVMe devices. Another way is to disable VMD in your BIOS (it is only necessary if you use RAID).


I solved the issue by switching VMD support to built-in, since laptops rarely have a customisable UEFI BIOS that allows you to turn off VMD :P
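
For anyone landing here later, a minimal sketch for checking how VMD is configured before rebuilding. It assumes the kernel config is at /usr/src/linux/.config (pass another path as the second argument), and the function name kconfig_state is made up for illustration.

```shell
# kconfig_state SYMBOL [CONFIG_FILE]
# Prints "built-in", "module" or "not set" for one kernel config symbol.
# CONFIG_FILE defaults to /usr/src/linux/.config, which is an assumption
# about where the kernel sources live.
kconfig_state() {
    symbol=$1
    config=${2:-/usr/src/linux/.config}
    if grep -q "^${symbol}=y\$" "$config"; then
        echo "built-in"
    elif grep -q "^${symbol}=m\$" "$config"; then
        echo "module"
    else
        echo "not set"
    fi
}
```

Running "kconfig_state CONFIG_VMD" should print "built-in" after the change described above. "module" only works if the module also ends up in the initramfs and gets loaded there.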


Last edited by E. Vertin on Thu Aug 01, 2024 7:23 am; edited 1 time in total
jpsollie
Guru
Guru


Joined: 17 Aug 2013
Posts: 322

Posted: Wed Jul 31, 2024 4:03 am

when nvme is not detected, it's probably your PCI subsystem which needs fixing.
A few tips:
- use lspci -t to see where it should be
- check if the IOMMU default setting is set to passthrough
- use driver core verbose error reporting to get a dmesg which reports everything about PCI detection issues

most useful here would be complete dmesg
_________________
The power of Gentoo optimization (not overclocked): [img]https://www.passmark.com/baselines/V10/images/503714802842.png[/img]
E. Vertin
n00b
n00b


Joined: 29 Jul 2024
Posts: 7
Location: Hong Kong SAR

Posted: Wed Jul 31, 2024 8:26 am

jpsollie wrote:
when nvme is not detected, it's probably your PCI subsystem which needs fixing.
A few tips:
- use lspci -t to see where it should be
- check if the IOMMU default setting is set to passthrough
- use driver core verbose error reporting to get a dmesg which reports everything about PCI detection issues

most useful here would be complete dmesg


Thanks for the response :D

First of all, I booted into the distribution kernel, which uses dracut to generate the initramfs, and ran
Code:

lspci -t

I believe that this section refers to my NVMe SSDs :) , that is
Quote:

-[10000:e0]-+-01.0
            +-01.1-[e1]----00.0
            \-06.0-[e2]----00.0

and it is similar to
Quote:

10000:e1:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
10000:e2:00.0 Non-Volatile memory controller: Yangtze Memory Technologies Co.,Ltd ZHITAI TiPro7000 (rev 01)

by running
Code:

lspci

Then I booted into my own kernel and got this output:
Quote:

03:00.0 Class 0604: 8086:1136
00:1c.0 Class 0604: 8086:7a38
00:08.0 Class 0880: 8086:a74f
38:00.0 Class 0c03: 8086:1138
00:15.1 Class 0c80: 8086:7a4d
03:03.0 Class 0604: 8086:1136
00:1f.0 Class 0601: 8086:7a0c
02:00.0 Class 0604: 8086:1136
00:01.0 Class 0604: 8086:a70d
00:04.0 Class 1180: 8086:a71d
00:14.3 Class 0280: 8086:7a70
00:19.2 Class 0780: 8086:7a7e
00:16.0 Class 0780: 8086:7a68
01:00.0 Class 0300: 10de:2820
00:1e.2 Class 0c80: 8086:7a2a
6e:00.0 Class 0200: 10ec:8125
00:1f.5 Class 0c80: 8086:7a24
03:02.0 Class 0604: 8086:1136
00:19.0 Class 0c80: 8086:7a7c
00:1e.0 Class 0780: 8086:7a28
00:1f.3 Class 0401: 8086:7a50
00:00.0 Class 0600: 8086:a702
00:12.0 Class 0700: 8086:7a78
6d:00.0 Class ff00: 10ec:5261
00:15.0 Class 0c80: 8086:7a4c
00:1a.0 Class 0604: 8086:7a48
00:06.0 Class 0880: 8086:09ab
03:01.0 Class 0604: 8086:1136
00:1c.6 Class 0604: 8086:7a3e
00:0e.0 Class 0104: 8086:a77f
01:00.1 Class 0403: 10de:22bd
00:14.2 Class 0500: 8086:7a27
00:1c.4 Class 0604: 8086:7a3c
04:00.0 Class 0c03: 8086:1137
00:02.0 Class 0300: 8086:a788
00:14.0 Class 0c03: 8086:7a60
00:1f.4 Class 0c05: 8086:7a23
00:15.3 Class 0c80: 8086:7a4f
00:0a.0 Class 1180: 8086:a77d


Then, I tweaked my kernel configuration and compiled it again
Quote:

CONFIG_IOMMU_DEFAULT_PASSTHROUGH=y

Is that right, or did I misunderstand? :?

In the initramfs, I copied the output of dmesg to my USB drive (/dev/sda1), here it is
(Sorry for the OneDrive link, copying the log to my reply results in "error in posting")

https://1drv.ms/u/s!AoPcAPifQdWvgoxiGmQXRdp2YsHkdA?e=vfhrJ6

Looking forward to your reply :D
jpsollie
Guru
Guru


Joined: 17 Aug 2013
Posts: 322

Posted: Wed Jul 31, 2024 9:10 am

well, it is clear to me that the kernel does not detect your device on the pci bus.
Have you enabled intel non-transparent bridge support in your kernel config?
_________________
The power of Gentoo optimization (not overclocked): [img]https://www.passmark.com/baselines/V10/images/503714802842.png[/img]
E. Vertin
n00b
n00b


Joined: 29 Jul 2024
Posts: 7
Location: Hong Kong SAR

Posted: Wed Jul 31, 2024 9:20 am

jpsollie wrote:
well, it is clear to me that the kernel does not detect your device on the pci bus.
Have you enabled intel non-transparent bridge support in your kernel config?


Yes, I have enabled it, built as modules.
jpsollie
Guru
Guru


Joined: 17 Aug 2013
Posts: 322

Posted: Wed Jul 31, 2024 10:16 am

try to switch the parameter iommu=on to off, add 'pci=earlydump,pcie_scan_all pcie_aspm=off' to parameters,
and if it doesn't get better, change loglevel=3 to something more verbose ... 5 would be sufficient, I guess.
then provide us a new dmesg
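
Before collecting the new dmesg it can help to confirm that the edited parameters really reached the kernel. A small sketch, assuming /proc is mounted in the shell you are in; cmdline_has is a made-up helper name.

```shell
# cmdline_has TOKEN [CMDLINE_FILE]
# Succeeds if TOKEN appears as a whole word on the kernel command line.
# CMDLINE_FILE defaults to /proc/cmdline.
cmdline_has() {
    token=$1
    file=${2:-/proc/cmdline}
    # One parameter per line, then an exact fixed-string match.
    tr ' ' '\n' < "$file" | grep -qxF -- "$token"
}
```

For example, "cmdline_has loglevel=5 && echo ok" prints ok only when the bootloader entry was actually updated.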
_________________
The power of Gentoo optimization (not overclocked): [img]https://www.passmark.com/baselines/V10/images/503714802842.png[/img]
E. Vertin
n00b
n00b


Joined: 29 Jul 2024
Posts: 7
Location: Hong Kong SAR

Posted: Wed Jul 31, 2024 10:58 am

jpsollie wrote:
try to switch the parameter iommu=on to off, add 'pci=earlydump,pcie_scan_all pcie_aspm=off' to parameters,
and if it doesn't get better, change loglevel=3 to something more verbose ... 5 would be sufficient, I guess.
then provide us a new dmesg


I have edited my kernel command line, but the problem persists. Here is my new dmesg output.

https://1drv.ms/u/s!AoPcAPifQdWvgoxkNL72enIiLsSh5g?e=E5WJW8
jpsollie
Guru
Guru


Joined: 17 Aug 2013
Posts: 322

Posted: Wed Jul 31, 2024 2:15 pm

well, I honestly do not see why the PCIe ports of your nvme drives are not found.
Can you provide me a "lspci -tv" output of the kernel where it is detected and where it's not?
_________________
The power of Gentoo optimization (not overclocked): [img]https://www.passmark.com/baselines/V10/images/503714802842.png[/img]
E. Vertin
n00b
n00b


Joined: 29 Jul 2024
Posts: 7
Location: Hong Kong SAR

Posted: Wed Jul 31, 2024 3:35 pm

jpsollie wrote:
well, I honestly do not see why the PCIe ports of your nvme drives are not found.
Can you provide me a "lspci -tv" output of the kernel where it is detected and where it's not?


Okay, here's my output of the distribution kernel:
Quote:

-[0000:00]-+-00.0 Intel Corporation Device a702
           +-01.0-[01]--+-00.0 NVIDIA Corporation AD106M [GeForce RTX 4070 Max-Q / Mobile]
           |            \-00.1 NVIDIA Corporation AD106M High Definition Audio Controller
           +-02.0 Intel Corporation Raptor Lake-S UHD Graphics
           +-04.0 Intel Corporation Raptor Lake Dynamic Platform and Thermal Framework Processor Participant
           +-06.0 Intel Corporation RST VMD Managed Controller
           +-08.0 Intel Corporation GNA Scoring Accelerator module
           +-0a.0 Intel Corporation Raptor Lake Crashlog and Telemetry
           +-0e.0 Intel Corporation Volume Management Device NVMe RAID Controller Intel Corporation
           +-12.0 Intel Corporation Device 7a78
           +-14.0 Intel Corporation Raptor Lake USB 3.2 Gen 2x2 (20 Gb/s) XHCI Host Controller
           +-14.2 Intel Corporation Raptor Lake-S PCH Shared SRAM
           +-14.3 Intel Corporation Raptor Lake-S PCH CNVi WiFi
           +-15.0 Intel Corporation Raptor Lake Serial IO I2C Host Controller #0
           +-15.1 Intel Corporation Raptor Lake Serial IO I2C Host Controller #1
           +-15.3 Intel Corporation Device 7a4f
           +-16.0 Intel Corporation Raptor Lake CSME HECI #1
           +-19.0 Intel Corporation Device 7a7c
           +-19.2 Intel Corporation Device 7a7e
           +-1a.0-[02-6b]----00.0-[03-6b]--+-00.0-[04]----00.0 Intel Corporation Thunderbolt 4 NHI [Maple Ridge 4C 2020]
           |                               +-01.0-[05-37]--
           |                               +-02.0-[38]----00.0 Intel Corporation Thunderbolt 4 USB Controller [Maple Ridge 4C 2020]
           |                               \-03.0-[39-6b]--
           +-1c.0-[6c]--
           +-1c.4-[6d]----00.0 Realtek Semiconductor Co., Ltd. RTS5261 PCI Express Card Reader
           +-1c.6-[6e]----00.0 Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller
           +-1e.0 Intel Corporation Device 7a28
           +-1e.2 Intel Corporation Device 7a2a
           +-1f.0 Intel Corporation Device 7a0c
           +-1f.3 Intel Corporation Raptor Lake High Definition Audio Controller
           +-1f.4 Intel Corporation Raptor Lake-S PCH SMBus Controller
           \-1f.5 Intel Corporation Raptor Lake SPI (flash) Controller
-[10000:e0]-+-01.0 Intel Corporation RST VMD Managed Controller
            +-01.1-[e1]----00.0 Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
            \-06.0-[e2]----00.0 Yangtze Memory Technologies Co.,Ltd ZHITAI TiPro7000


The rescue shell doesn't seem to have this "-tv" option though.
pietinger
Moderator
Moderator


Joined: 17 Oct 2006
Posts: 5050
Location: Bavaria

Posted: Wed Jul 31, 2024 6:24 pm

This:
Code:
+-0e.0 Intel Corporation Volume Management Device NVMe RAID Controller Intel Corporation

is not in your kernel .config ... see here: https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_Configuring_Kernel_Version_6.6#Part_3_-_Must_Haves
Without CONFIG_VMD=y the kernel cannot access its NVMe devices. Another way is to disable VMD in your BIOS (it is only necessary if you use RAID).
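
The rescue shell's terse lspci output earlier in the thread already hints at this: 00:0e.0 shows up there as "Class 0104: 8086:a77f", the same slot as the VMD entry above. A rough sketch for spotting such a device when only that terse format is available. The helper name find_intel_raid_class is made up, and class 0104 is the generic RAID controller class, so other Intel RAID-mode controllers would match too. Treat a hit as a hint, not proof of VMD.

```shell
# find_intel_raid_class
# Reads terse lspci lines ("00:0e.0 Class 0104: 8086:a77f") on stdin and
# prints those with PCI class 0104 (RAID controller) and vendor 8086 (Intel).
find_intel_raid_class() {
    awk '$2 == "Class" && $3 == "0104:" && $4 ~ /^8086:/'
}
```

Usage in the rescue shell would be "lspci | find_intel_raid_class". If it prints a line while no NVMe controllers (class 0108) are listed, the drives are probably sitting behind that controller.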
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
E. Vertin
n00b
n00b


Joined: 29 Jul 2024
Posts: 7
Location: Hong Kong SAR

Posted: Thu Aug 01, 2024 7:18 am

pietinger wrote:
This:
Code:
+-0e.0 Intel Corporation Volume Management Device NVMe RAID Controller Intel Corporation

is not in your kernel .config ... see here: https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_Configuring_Kernel_Version_6.6#Part_3_-_Must_Haves
Without CONFIG_VMD=y the kernel cannot access its NVMe devices. Another way is to disable VMD in your BIOS (it is only necessary if you use RAID).


Thank you for your response. :)

I found that VMD was built as a module, and everything works now that I have switched it to built-in. Was the problem that the module was not added to the initramfs, or that it was not loaded automatically, so the kernel couldn't detect my drives?

As for the UEFI BIOS settings, ASUS doesn't offer an option to turn off Intel VMD. Maybe this is common for laptops...? :?

Anyway, problem solved. Many thanks. :D
Hu
Administrator
Administrator


Joined: 06 Mar 2007
Posts: 22598

Posted: Thu Aug 01, 2024 2:44 pm

The initramfs has access to what is built into the kernel (=y), and to modules (=m) that are copied into the initramfs. If you did not do either of those, then yes, the initramfs would not have access to VMD.
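
A quick way to test the second case is to look for the module file in the initramfs listing. This is only a sketch: initramfs_has_module is a made-up helper, and it assumes the listing tool (lsinitrd was used earlier in this thread) prints one file per line with the path at the end of the line.

```shell
# initramfs_has_module MODULE
# Reads a file listing on stdin and succeeds if MODULE.ko, optionally
# compressed as .gz, .xz or .zst, appears at the end of any line.
initramfs_has_module() {
    grep -Eq "/$1\.ko(\.(gz|xz|zst))?\$"
}
```

For example: lsinitrd /boot/initramfs-<my initramfs> | initramfs_has_module vmd && echo "vmd is in the image". Finding the file only shows that it was copied in; the module still has to be loaded during early boot.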
E. Vertin
n00b
n00b


Joined: 29 Jul 2024
Posts: 7
Location: Hong Kong SAR

Posted: Thu Aug 01, 2024 2:51 pm

Hu wrote:
The initramfs has access to what is built into the kernel (=y), and to modules (=m) that are copied into the initramfs. If you did not do either of those, then yes, the initramfs would not have access to VMD.


I guess it is because I didn't copy the module to the initramfs :oops: