Gentoo Forums
UBSAN: misaligned-access errors - iwlmvm doesn't load
MajorJalepino
n00b


Joined: 26 Jul 2022
Posts: 34

Posted: Tue May 14, 2024 11:22 am    Post subject: UBSAN: misaligned-access errors - iwlmvm doesn't load

Hi all,

I've been having trouble getting my wifi card detected on boot: the systemd service fails, and dmesg shows a series of UBSAN misaligned-access errors that keep repeating.

/usr/src/linux/.config:
http://0x0.st/XKgE.txt

DMESG:
http://0x0.st/XKgI.txt

As you can see, the UBSAN errors repeat and my kernel log buffer can't keep up (I set it lower in .config).
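
To see which reports dominate a flooded log, the repeats can be collapsed into counts. This is only a minimal sketch, assuming each report starts with the usual "UBSAN: <type> in <file>:<line>:<col>" header line; ubsan_summary is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: collapse repeated UBSAN report headers into counts.
# Reads kernel log text on stdin and prints "<count> UBSAN: ..." lines,
# most frequent first.
ubsan_summary() {
    sed -n 's/.*\(UBSAN: .*\)/\1/p' | sort | uniq -c | sort -rn
}

# Usage (needs root to read the kernel log on most systems):
#   sudo dmesg | ubsan_summary
```

If the early boot messages are pushed out of the ring buffer before they can be read, booting with a larger log_buf_len= kernel parameter (for example log_buf_len=4M) is one way to keep them.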

systemctl --type service
http://0x0.st/XKg0.txt

lspci -nnk
http://0x0.st/XKgk.txt

If I manually run "sudo modprobe iwlmvm" the wifi shows up in ifconfig:
http://0x0.st/XKgn.txt

iw wlan0 info
http://0x0.st/XKg5.txt

Other details:

I've compiled this kernel with LLVM=1

Code:
# uname -a
Linux R95900X 6.8.8-gentoo #2 SMP PREEMPT_DYNAMIC Tue May 14 20:40:02 AEST 2024 x86_64 AMD Ryzen 9 5900X 12-Core Processor AuthenticAMD GNU/Linux


I tried to solve this myself before asking. Given the dmesg errors, I think it may be a kernel issue that I cannot pin down. I'm on kernel 6.8.8, compiled with Clang for Gentoo user reasons. I configured it following Pietinger's guides, as per this wiki page: https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_Configuring_Kernel_Version_6.6#Part_3_-_Must_Haves

Thanks!
pietinger
Moderator


Joined: 17 Oct 2006
Posts: 5110
Location: Bavaria

Posted: Tue May 14, 2024 11:52 am

Kees Cook recommends not enabling every UBSAN option; see this chapter:
https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Kernel_Hardening_with_KSPP#Course_of_action

If this does not help, I would do two things:

1. Install our binary Gentoo dist-kernel and check whether the problem persists. If yes, there may be a hardware problem. If not:

2. Take the same .config - without every UBSAN option enabled - AND without CONFIG_MZEN3=y (best is to emerge gentoo-sources again without the use-flag "experimental") AND compile it with GCC. If there is no problem, I would stay on this kernel. If the problem is still there, then the .config may be at fault.
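
To check which UBSAN options are currently switched on, something like the following can be used. It is a minimal sketch, assuming the config lives at /usr/src/linux/.config; list_ubsan_enabled is a hypothetical helper name. The misaligned-access reports come from CONFIG_UBSAN_ALIGNMENT, so that is the option shown being disabled in the commented usage.

```shell
# Hypothetical helper: list the UBSAN options that are enabled (=y)
# in the kernel config file given as $1.
list_ubsan_enabled() {
    grep -E '^CONFIG_UBSAN[A-Z0-9_]*=y$' "$1" | cut -d= -f1
}

# Usage:
#   list_ubsan_enabled /usr/src/linux/.config
#
# Disabling one option with the kernel's own scripts/config tool,
# then refreshing the config (LLVM=1 only if you build with Clang):
#   cd /usr/src/linux
#   ./scripts/config --disable UBSAN_ALIGNMENT
#   make LLVM=1 olddefconfig
```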
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
MajorJalepino
n00b


Joined: 26 Jul 2022
Posts: 34

Posted: Tue May 14, 2024 12:35 pm

The man himself!

Thanks for your help, I'll report back soon.

I get the vibe that KSPP is not compatible with Wifi or Bluetooth; is this the case?
pietinger
Moderator


Joined: 17 Oct 2006
Posts: 5110
Location: Bavaria

Posted: Tue May 14, 2024 2:07 pm

MajorJalepino wrote:
I get the vibe that KSPP is not compatible with Wifi or Bluetooth; is this the case?

No. 8)

MajorJalepino wrote:
The man himself!

:lol:
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
MajorJalepino
n00b


Joined: 26 Jul 2022
Posts: 34

Posted: Tue May 14, 2024 9:28 pm

Alright, so I got rid of all the UBSAN messages first, which let me read the proper dmesg.

Then I tried starting iwd.service manually, which failed. The journalctl output for it told me I needed to enable these kernel options relating to the cryptographic API:
Code:

CRYPTO_AES
USER_API_HASH
CRYPTO_CMAC
AES_X86_X64


These were all already enabled, except the last one which doesn't seem to be an option in 6.8.8.

I then tried building the wifi modules into the kernel in case they were loading out of order. New dmesg:
https://bpa.st/raw/LRVQ

As you can see, there are some errors saying the microcode isn't available, which it absolutely is; see the outputs of:

modinfo iwlwifi | grep ty
https://bpa.st/raw/CA2Q

ls -l /lib/firmware/iwlwifi-ty*
https://bpa.st/raw/66CQ

I guess the next step this afternoon will be a genkernel build, and I'll peel back from there.
pietinger
Moderator


Joined: 17 Oct 2006
Posts: 5110
Location: Bavaria

Posted: Tue May 14, 2024 10:11 pm

Maybe you forgot this chapter? -> https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_kernel_configuration#Driver_needs_Firmware

(If you build a module that needs firmware statically into the kernel, then you must also list that firmware in:
CONFIG_EXTRA_FIRMWARE="amd-ucode/microcode_amd_fam17h.bin" )

It is best to build IWLWIFI (and all necessary modules) as <M>odule.
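
Before listing files in CONFIG_EXTRA_FIRMWARE it can help to confirm that each one really exists under the firmware directory. The following is a minimal sketch, assuming the firmware lives in /lib/firmware; missing_firmware is a hypothetical helper name.

```shell
# Hypothetical helper: print every firmware file that is NOT present.
# $1 = firmware directory, remaining arguments = paths relative to it.
missing_firmware() {
    dir=$1
    shift
    for fw in "$@"; do
        [ -f "$dir/$fw" ] || printf '%s\n' "$fw"
    done
}

# Usage (no output means every listed file was found):
#   missing_firmware /lib/firmware amd-ucode/microcode_amd_fam17h.bin
```

The check only matches exact file names, so a firmware file stored in compressed form would be reported as missing.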

I have found something that can cause strange problems, and I updated my wiki article (6.6) with this information a few days ago:

Some AMD CPUs behave strangely: even if the number of logical cores is 24, they need a higher limit ->
Code:
[    0.034739] smpboot: 32 Processors exceeds NR_CPUS limit of 24
[    0.034739] smpboot: Allowing 24 CPUs, 0 hotplug CPUs
...
[    2.065828] smpboot: CPU0: AMD Ryzen 9 5900X 12-Core Processor (family: 0x19, model: 0x21, stepping: 0x2)

=> You MUST change CONFIG_NR_CPUS to 32 ... otherwise very strange problems can occur.
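
The smpboot message above can be picked out of the kernel log automatically. This is a minimal sketch that assumes the message wording shown in the excerpt; nr_cpus_needed is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log text on stdin and print the
# processor count the kernel reported when it exceeded NR_CPUS.
# Prints nothing if no such message is present.
nr_cpus_needed() {
    sed -n 's/.*smpboot: \([0-9][0-9]*\) Processors exceeds NR_CPUS limit of [0-9][0-9]*.*/\1/p'
}

# Usage:
#   sudo dmesg | nr_cpus_needed
#
# Setting the value with the kernel's scripts/config tool:
#   cd /usr/src/linux
#   ./scripts/config --set-val NR_CPUS 32
```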
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
pietinger
Moderator


Joined: 17 Oct 2006
Posts: 5110
Location: Bavaria

Posted: Tue May 14, 2024 10:34 pm

P.S.: Don't worry about crypto ... everything is okay ... the modules automatically enable what you need:

You start with:
Code:
CONFIG_CFG80211
->
Selects: FW_LOADER [=y] && CRC32 [=y] && CRYPTO_SHA256 [=y]

and

CONFIG_IWLWIFI
->
Depends on: NETDEVICES [=y] && WLAN [=y] && WLAN_VENDOR_INTEL [=y] && PCI [=y] && HAS_IOMEM [=y] && CFG80211 [=y] && (IWLMEI [=n] || !IWLMEI [=n])

IWLWIFI needs:
Code:
CONFIG_IWLDVM
->
Depends on: NETDEVICES [=y] && WLAN [=y] && WLAN_VENDOR_INTEL [=y] && IWLWIFI [=y] && MAC80211 [=y]

- AND/OR -

CONFIG_IWLMVM
->
Depends on: NETDEVICES [=y] && WLAN [=y] && WLAN_VENDOR_INTEL [=y] && IWLWIFI [=y] && MAC80211 [=y] && PTP_1588_CLOCK_OPTIONAL [=y]

and you will get this only after enabling:
Code:
CONFIG_MAC80211
->
Selects: CRYPTO [=y] && CRYPTO_LIB_ARC4 [=y] && CRYPTO_AES [=y] && CRYPTO_CCM [=y] && CRYPTO_GCM [=y] && CRYPTO_CMAC [=y] && CRC32 [=y]

You see: every crypto module was selected ... and I have seen that you enabled the accelerated modules for them (CONFIG_CRYPTO_AES_NI_INTEL, CONFIG_CRYPTO_SHA256_SSSE3, CONFIG_CRYPTO_CRC32_PCLMUL, CONFIG_CRYPTO_CRC32C_INTEL) ... all fine

.... BUT ...

You must allow userspace to access these modules ... => # CONFIG_CRYPTO_USER is not set :lol:
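
A quick way to see how individual options ended up in a .config is sketched below. It assumes the usual format, where a disabled option appears only as a "# CONFIG_... is not set" comment; config_state is a hypothetical helper name.

```shell
# Hypothetical helper: print y, m, or the literal value of an option,
# or n if the option is not set in the config file.
# $1 = path to .config, $2 = option name without the CONFIG_ prefix
config_state() {
    state=$(sed -n "s/^CONFIG_$2=//p" "$1")
    echo "${state:-n}"
}

# Usage:
#   for opt in CRYPTO_USER CRYPTO_USER_API_HASH CRYPTO_AES CRYPTO_CMAC; do
#       echo "$opt=$(config_state /usr/src/linux/.config "$opt")"
#   done
```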
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
MajorJalepino
n00b


Joined: 26 Jul 2022
Posts: 34

Posted: Wed May 15, 2024 11:41 am

Thanks for the pointers here, Piet.

I've made the change to the number of CPUs in the kernel; here are the errors I'm still receiving in dmesg:

Code:
 $ sudo dmesg | grep Cannot
[    2.296268] ccp_crypto: Cannot load: there are no available CCPs
[    5.636578] systemd-gpt-auto-generator[248]: File system behind root file system is reported by btrfs to be backed by pseudo-device /dev/root, which is not a valid userspace accessible device node. Cannot determine correct backing block device.
[    5.869703] snd_hda_intel 0000:0a:00.1: Cannot probe codecs, giving up


Code:
$ sudo dmesg | grep failed
[    2.182116] ACPI: _OSC evaluation for CPUs failed, trying _PDC
[    3.345626] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[    3.345627] cfg80211: failed to load regulatory.db
[    4.509231] ata5: failed to resume link (SControl 0)
[    5.549504] ata6: failed to resume link (SControl 0)
[    5.638972] (sd-exec-[243]: /usr/lib/systemd/system-generators/systemd-gpt-auto-generator failed with exit status 1.
[  586.036701] thermal thermal_zone0: failed to read out thermal zone (-61)


Code:
$ sudo dmesg | grep Invalid
[  586.044065] iwlwifi 0000:07:00.0: WRT: Invalid buffer destination
[  586.349289] iwlwifi 0000:07:00.0: WRT: Invalid buffer destination


I believe the last piece of the puzzle is regulatory.db - what is that?

Also, not sure why I'm getting errors about the CCP at the start, I have an encryption controller.

Code:
$ lspci -nnk | grep Encryption
0c:00.1 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP [1022:1486]
Hu
Administrator


Joined: 06 Mar 2007
Posts: 22657

Posted: Wed May 15, 2024 12:00 pm

MajorJalepino wrote:
Code:
$ sudo dmesg | grep failed
[    3.345626] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[    3.345627] cfg80211: failed to load regulatory.db
I believe the last piece of the puzzle is regulatory.db - what is that?
That is a file which tells the wireless driver what forms of operation are permitted in your jurisdiction. Different countries have different rules about maximum legal transmission power, and the frequencies on which you may transmit without a special license. See Documentation/networking/regulatory.rst for some additional context, although that appears to be geared primarily toward advanced users. This file allows the kernel, in conjunction with some configuration from userspace, to perform at the fullest extent allowed by local regulation.
MajorJalepino wrote:
Code:
 $ sudo dmesg | grep Cannot
[    2.296268] ccp_crypto: Cannot load: there are no available CCPs
Also, not sure why I'm getting errors about the CCP at the start, I have an encryption controller.
Code:
$ lspci -nnk | grep Encryption
0c:00.1 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP [1022:1486]
You have a CCP, but is it a CCP that your current kernel configuration knows how to operate?
MajorJalepino
n00b


Joined: 26 Jul 2022
Posts: 34

Posted: Wed May 15, 2024 12:30 pm

Great, thank you.

I have moved cfg80211 back to a module and now it all loads fine. However, iwlmvm is not started by udev; I must load it manually with modprobe. Is there a way to fix this?

How can I test whether the CCP driver matches the one I have in my system?

Does anyone have any ideas about the snd_hda_intel failing?
pietinger
Moderator


Joined: 17 Oct 2006
Posts: 5110
Location: Bavaria

Posted: Wed May 15, 2024 2:41 pm

To get regulatory.db you need to emerge net-wireless/wireless-regdb
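
To list every firmware file the kernel asked for and could not load, the "Direct firmware load ... failed" lines can be filtered out of dmesg. This is a minimal sketch, assuming the message wording shown in the excerpts above; failed_firmware is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log text on stdin and print the name
# of each firmware file whose direct load failed, one per line, deduplicated.
failed_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed with error.*/\1/p' | sort -u
}

# Usage:
#   sudo dmesg | failed_firmware
#
# After installing the package, the file should be visible here:
#   ls -l /lib/firmware/regulatory.db*
```

Error -2 in that message is ENOENT, meaning the file was simply not found.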

MajorJalepino wrote:
[...] However, iwlmvm is not started by udev; I must load it manually with modprobe. Is there a way to fix this?

Sorry, I am not a systemd man ... but some systemd experts here will surely help ;-)
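
One common workaround on systemd systems is a modules-load.d drop-in, which systemd-modules-load.service reads at boot (one module name per line). This is only a sketch of that workaround and does not explain why the module is not loaded on its own; write_modules_load is a hypothetical helper name.

```shell
# Hypothetical helper: write a modules-load.d drop-in for one module.
# $1 = module name, $2 = target directory (normally /etc/modules-load.d)
write_modules_load() {
    printf '%s\n' "$1" > "$2/$1.conf"
}

# Usage (as root):
#   write_modules_load iwlmvm /etc/modules-load.d
# which is equivalent to:
#   echo iwlmvm > /etc/modules-load.d/iwlmvm.conf
```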

MajorJalepino wrote:
How can I test whether the CCP driver matches the one I have in my system?

Usually, if a module says "I will not work here", then you do not have the hardware for it (or it is not enabled in the BIOS) ... maybe you only have an AMD Platform Security Processor and not an AMD Cryptographic Coprocessor (but I am not an AMD expert)
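
To see whether any driver actually bound to the device, the "Kernel driver in use" line from lspci -k can be checked. This is a minimal sketch, assuming the device address 0c:00.1 from the lspci output quoted earlier; driver_in_use is a hypothetical helper name.

```shell
# Hypothetical helper: read "lspci -k" output on stdin and print the
# name of the bound kernel driver. Prints nothing if no driver is bound.
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}

# Usage:
#   lspci -nnk -s 0c:00.1 | driver_in_use
```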

MajorJalepino wrote:
Does anyone have any ideas about the snd_hda_intel failing?

It is best to look for every sound module with "lsmod":
https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_kernel_configuration#Before_you_start

If your system uses SOF, then you will also need: sys-firmware/sof-firmware
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
MajorJalepino
n00b


Joined: 26 Jul 2022
Posts: 34

Posted: Sat May 18, 2024 3:52 pm

Alrighty then,

I have fixed all the issues I've brought up here:
1. Built a monolithic kernel - this needs the CPU microcode, the GPU firmware blobs, and the following built in:
Code:
 /lib/firmware/iwlwifi-ty-a0-gf-a0-86.ucode
iwlwifi-ty-a0-gf-a0.pnvm
regulatory.db
regulatory.db.p7s
rtl_nic/rtl8168h-2.fw


2. Switched to an XFS root partition (I do not need any of the btrfs features; snapshots are not backups as I had thought, so it was more trouble than it's worth).

3. Added the CONFIG_SP5100_TCO driver (it was included in the binary Linux Mint kernel I booted from).
4. Disabled CCP support in the kernel - it seems this functionality is turned off with my mobo/BIOS version (ASRock B550M Pro4 - BIOS from Sep 23).

5. I really do not know how I fixed the SND_HDA_INTEL loading. I suspect it was loading out of order as a module; with everything compiled in, it probably loads in the correct order.
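
For a monolithic kernel, the firmware from item 1 ends up as one space-separated CONFIG_EXTRA_FIRMWARE value, with paths relative to CONFIG_EXTRA_FIRMWARE_DIR (/lib/firmware by default). The sketch below only builds that line for pasting into .config; extra_firmware_line is a hypothetical helper name, and the file list in the usage is taken from the wifi entries above.

```shell
# Hypothetical helper: join firmware paths into a CONFIG_EXTRA_FIRMWARE line.
# Arguments are paths relative to the firmware directory.
extra_firmware_line() {
    printf 'CONFIG_EXTRA_FIRMWARE="%s"\n' "$*"
}

# Usage:
#   extra_firmware_line iwlwifi-ty-a0-gf-a0-86.ucode iwlwifi-ty-a0-gf-a0.pnvm \
#       regulatory.db regulatory.db.p7s
```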

Thanks all!
pietinger
Moderator


Joined: 17 Oct 2006
Posts: 5110
Location: Bavaria

Posted: Sat May 18, 2024 4:14 pm

MajorJalepino wrote:
Thanks all!

You are very welcome! :D

Have fun with Gentoo! 8)
_________________
https://wiki.gentoo.org/wiki/User:Pietinger