brendlefly62 Apprentice
Joined: 19 Dec 2009 Posts: 150
|
Posted: Fri Aug 09, 2024 8:35 pm Post subject: [solved] mcpu - kernel dependency (was u-boot) |
|
|
I've got a Rock 5C system (octa-core Rockchip RK3588S with 4 x Cortex-A76 + 4 x Cortex-A55) booting with a custom initramfs that supports an NVMe stick with a LUKS-encrypted partition hosting a number of logical volumes for root, usr, var, home, etc. The system boots up cleanly, but...
I've noticed the "+crypto" capability is missing when I boot from initramfs and run -- Code: | # resolve-march-native
-mcpu=cortex-a76.cortex-a55+crc |
If I chroot into the same system, using the same kernel, I get this instead -- Code: | # resolve-march-native
-mcpu=cortex-a76.cortex-a55+crc+crypto |
I discovered this after noticing that packages I build in the chroot failed with "illegal instruction" when I tried using them on the system booted with my custom initramfs. So I've concluded that there must be something missing in the way my initramfs does its minimal startup before handing the system over with switch_root ... I've suspected kernel CONFIG_ settings for crypto acceleration, but I have not yet identified a culprit -- likely some CONFIG_ setting set to [m] and then not loaded by my initramfs? (but loaded automatically when the same kernel boots on a rootfs that doesn't need to be decrypted and mounted by an initramfs first...?)
I've also compared the output of lscpu -- the initramfs-booted system is missing "aes pmull sha1 sha2" compared to this output from the chroot - Code: | Model name: Cortex-A55
...
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
Model name: Cortex-A76
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp |
Can anyone help me narrow the search for what might be missing?
I've compared lsmod output from this system with the chroot, I've compared lscpu, tried activating modules with modprobe after boot, but no joy...
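A minimal sketch of how the two feature lists could be compared mechanically, assuming a snapshot of /proc/cpuinfo is saved in each boot scenario. The function names and snapshot file names are hypothetical; on arm64 the relevant line is "Features", on x86 it is "flags".

```shell
#!/bin/sh
# Print the sorted, de-duplicated CPU feature flags found in a cpuinfo-style
# file (defaults to the live /proc/cpuinfo).
cpu_features() {
    grep -iE '^(features|flags)[[:space:]]*:' "${1:-/proc/cpuinfo}" \
        | cut -d: -f2- \
        | tr -s ' \t' '\n\n' \
        | grep -v '^$' \
        | LC_ALL=C sort -u
}

# Print the flags that are present in snapshot $1 but missing from snapshot $2.
missing_features() {
    a=$(mktemp)
    b=$(mktemp)
    cpu_features "$1" > "$a"
    cpu_features "$2" > "$b"
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage (hypothetical file names):
#   cat /proc/cpuinfo > /root/cpuinfo.chroot       # in the chroot scenario
#   cat /proc/cpuinfo > /root/cpuinfo.initramfs    # after the initramfs boot
#   missing_features /root/cpuinfo.chroot /root/cpuinfo.initramfs
```

With the lscpu output quoted above, this would be expected to list the aes, pmull, sha1 and sha2 flags that the initramfs-booted system lacks.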
Last edited by brendlefly62 on Sat Aug 10, 2024 3:56 pm; edited 1 time in total |
|
|
|
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1359 Location: Richmond Hill, Canada
|
Posted: Fri Aug 09, 2024 10:27 pm Post subject: |
|
|
Could it be that your initramfs has a different GCC than your rootfs?
As far as I can tell, resolve-march-native just calls GCC and parses its output, so I am guessing the GCC in the initramfs produces different output?
The above is only about the reported information; it does not mean the hardware is actually different.
The "illegal instruction" error is really about the binary inside the initramfs. I suspect it is not crypto related, because CPU features cannot be configured as a Linux kernel module to turn on or off.
It would be nice if you could get a core dump of the failed program to examine which instruction causes the fault.
The other possibility is that your binary in the initramfs is corrupted. Maybe try to recreate the binary once you have found which one generates the error. |
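As a rough way to see what the compiler driver itself resolves, the sketch below filters the -mcpu/-march/-mtune options out of GCC's verbose output. It assumes the driver prints its cc1 command line when run with -v; extract_cpu_flags is a hypothetical helper name. As far as I know, on arm64 Linux GCC resolves -mcpu=native from the "Features" line of /proc/cpuinfo, so two identical compilers can still give different answers when the kernel reports different features.

```shell
#!/bin/sh
# Read text on stdin, split it into whitespace-separated tokens, and print
# only the unique CPU-selection options (-mcpu=, -march=, -mtune=).
extract_cpu_flags() {
    tr -s ' \t' '\n\n' | grep -E '^-m(cpu|arch|tune)=' | LC_ALL=C sort -u
}

# Usage on the affected machine (the verbose driver output goes to stderr):
#   gcc -mcpu=native -E -v - </dev/null 2>&1 | extract_cpu_flags
```

Running this in both scenarios and comparing the result with the resolve-march-native output would show whether the difference comes from the compiler or from what the kernel reports.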
|
|
|
brendlefly62 Apprentice
Joined: 19 Dec 2009 Posts: 150
|
Posted: Sat Aug 10, 2024 1:19 am Post subject: |
|
|
Hi, Pingtoo. The gcc in the chroot and the initramfs-booted system are the same --
initramfs-booted -- Code: | # gcc -v
Using built-in specs.
COLLECT_GCC=aarch64-unknown-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-unknown-linux-gnu/13/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: /var/tmp/portage/sys-devel/gcc-13.3.1_p20240614/work/gcc-13-20240614/configure --host=aarch64-unknown-linux-gnu --build=aarch64-unknown-linux-gnu --prefix=/usr --bindir=/usr/aarch64-unknown-linux-gnu/gcc-bin/13 --includedir=/usr/lib/gcc/aarch64-unknown-linux-gnu/13/include --datadir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/13 --mandir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/13/man --infodir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/13/info --with-gxx-include-dir=/usr/lib/gcc/aarch64-unknown-linux-gnu/13/include/g++-v13 --disable-silent-rules --disable-dependency-tracking --with-python-dir=/share/gcc-data/aarch64-unknown-linux-gnu/13/python --enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt --disable-werror --with-system-zlib --disable-nls --disable-libunwind-exceptions --enable-checking=release --with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo 13.3.1_p20240614 p17' --with-gcc-major-version-only --enable-libstdcxx-time --enable-lto --disable-libstdcxx-pch --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --disable-multilib --disable-fixed-point --enable-libgomp --disable-libssp --disable-libada --disable-standard-branch-protection --disable-systemtap --disable-valgrind-annotations --disable-vtable-verify --disable-libvtv --with-zstd --without-isl --enable-default-pie --enable-default-ssp --disable-fixincludes
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.3.1 20240614 (Gentoo 13.3.1_p20240614 p17) |
chrooted -- Code: | # gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-unknown-linux-gnu/13/lto-wrapper
Target: aarch64-unknown-linux-gnu
Configured with: /var/tmp/portage/sys-devel/gcc-13.3.1_p20240614/work/gcc-13-20240614/configure --host=aarch64-unknown-linux-gnu --build=aarch64-unknown-linux-gnu --prefix=/usr --bindir=/usr/aarch64-unknown-linux-gnu/gcc-bin/13 --includedir=/usr/lib/gcc/aarch64-unknown-linux-gnu/13/include --datadir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/13 --mandir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/13/man --infodir=/usr/share/gcc-data/aarch64-unknown-linux-gnu/13/info --with-gxx-include-dir=/usr/lib/gcc/aarch64-unknown-linux-gnu/13/include/g++-v13 --disable-silent-rules --disable-dependency-tracking --with-python-dir=/share/gcc-data/aarch64-unknown-linux-gnu/13/python --enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt --disable-werror --with-system-zlib --disable-nls --disable-libunwind-exceptions --enable-checking=release --with-bugurl=https://bugs.gentoo.org/ --with-pkgversion='Gentoo 13.3.1_p20240614 p17' --with-gcc-major-version-only --enable-libstdcxx-time --enable-lto --disable-libstdcxx-pch --enable-shared --enable-threads=posix --enable-__cxa_atexit --enable-clocale=gnu --disable-multilib --disable-fixed-point --enable-libgomp --disable-libssp --disable-libada --disable-standard-branch-protection --disable-systemtap --disable-valgrind-annotations --disable-vtable-verify --disable-libvtv --with-zstd --without-isl --enable-default-pie --enable-default-ssp --disable-fixincludes
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.3.1 20240614 (Gentoo 13.3.1_p20240614 p17) |
Note that lscpu also reports different information. (so does hwinfo)
The kernel used in both cases was compiled with a different gcc -- Code: | # uname -a
Linux rock5c6403 6.1.75-vendor-rk35xx #1 SMP Tue Jul 30 04:56:00 EDT 2024 aarch64 GNU/Linux |
This was produced by the Armbian/build process as described in the top half of this wiki article https://wiki.gentoo.org/wiki/User:Brendlefly62/Rockchip_RK3588S_OrangePi_5B/Build-Install-Kernel
The "illegal instruction" error occurs not in the initramfs, but in the rootfs after it is booted by the initramfs...
I don't understand why the reported hardware capability of the cpu would appear to be different like this - it is the same SoC running the same kernel binary. The only difference I'm aware of is that the chrooted system was initialized by an Armbian initrd on an Armbian rootfs with systemd (and then chrooted into a gentoo rootfs), whereas the initramfs-booted system does very little more than unlock and mount the luks lvm file structure and then switch_root to the same gentoo rootfs.
When you say Quote: | The other possibility is that your binary in the initramfs is corrupted. Maybe try to recreate the binary once you have found which one generates the error. | I assume you mean the kernel binary. As far as I can tell, the kernel in the system from which I chroot is identical to the one on the boot device with the initramfs -- both report the same size and sha512sum hash.
There are a lot of other binaries inside the initramfs image; it is generated using a custom process based on this Gentoo wiki - https://wiki.gentoo.org/wiki/Custom_Initramfs
So that's why I thought it must either be something that the Armbian initrd or Armbian rootfs has that my initramfs and/or gentoo rootfs do not have (or have differently)...
Also, note the "Hardware crypto devices" and "Accelerated Crypto..." sections of the kernel config -- don't these select and enable hardware features?
--- Cryptographic API
│ │ Crypto core or helper --->
│ │ Public-key cryptography --->
│ │ Block ciphers --->
│ │ Length-preserving ciphers and modes --->
│ │ AEAD (authenticated encryption with associated data) ciphers --->
│ │ Hashes, digests, and MACs --->
│ │ CRCs (cyclic redundancy checks) --->
│ │ Compression --->
│ │ Random number generation --->
│ │ Userspace interface --->
│ │ Accelerated Cryptographic Algorithms for CPU (arm64) --->
│ │ [*] Hardware crypto devices --->
│ │ -*- Asymmetric (public-key cryptographic) key type --->
│ │ Certificates for signature checking --->
I've never tried to do a "core dump" before - do you know if there is a tutorial on that?
thanks again |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5277 Location: Bavaria
|
Posted: Sat Aug 10, 2024 2:15 am Post subject: |
|
|
pingtoo wrote: | [...] because CPU Features cannot be configured as linux kernel module to turn on or off. |
All modern processors today have one (or more) registers in which each bit describes a specific capability of the processor. These registers (Control register; Extended Control register *1) are queried with lscpu and with cpuid, and the program then outputs either the corresponding mnemonic (lscpu) or true/false (cpuid). However, these registers are writable. It would make no sense (I don't even know if it is possible) to activate a capability that the CPU does not have, i.e. to set a bit to 1. What is done, however, is to disable a capability if necessary ... this can be done by any program that runs in ring 0 - including the kernel. See this example:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=ce0b15d11ad837fbacc5356941712218e38a0a83
Explained also here: https://www.phoronix.com/news/Intel-Disable-PCID-ADL-RPL
So, yes it is possible to get different outputs from lscpu and cpuid ... but not because someone has something activated ... the problem is: "Someone" has it deactivated ...
Now, I don't know the OP's system ... but we have not only a kernel ... there is a boot system "before" it ... AND ... many modern CPUs load microcode for the CPU itself .... Yes, Intel has in the past deactivated a capability of its CPUs via new microcode (which was not funny after the microcode update) ...
Sorry I can not help here further, but the first thing I would check is the used/loaded microcode.
(AFAIK a microcode can also enable a CPU function if the microcode provides it ... but I can be wrong here)
*1) Example for x86: https://wiki.osdev.org/CPU_Registers_x86
P.S.: e.g. pmull is an Arm-specific instruction: https://developer.arm.com/documentation/dui0801/g/A64-SIMD-Vector-Instructions/PMULL--PMULL2--vector-
(link needs javascript) _________________ https://wiki.gentoo.org/wiki/User:Pietinger |
|
|
|
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1359 Location: Richmond Hill, Canada
|
Posted: Sat Aug 10, 2024 1:32 pm Post subject: |
|
|
brendlefly62,
My apologies, I misunderstood your situation.
A core dump is a kernel feature. I think modern kernels allow it to be disabled, so I am not sure whether your kernel has it enabled. If it is enabled, the dump is usually written automatically to the current working directory of the program that generated the error. You can try to find it with Code: | find / -name core -ls | There will be several output lines; you need to find the one that relates to the binary that generated the error. Hint: it is not /proc/sys/net/core
Those kernel crypto configuration options allow user-space and kernel-space programs to use those facilities through a standard API; they do not enable or disable them in the CPU.
I cannot say why you got different output under the initrd vs. the normal rootfs; my GCC question was just a guess.
Let me think about how to approach this issue. |
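Since a core dump tutorial was asked for, here is a minimal sketch of the usual steps. It assumes a Linux system with /proc mounted and gdb installed; core_dump_status is a hypothetical helper name and the program and core paths are placeholders. Note that if the core pattern starts with "|", the kernel pipes the dump to a handler program instead of writing a file named core to the working directory.

```shell
#!/bin/sh
# Report whether the current shell would allow a core file to be written
# and where the kernel would put it.
core_dump_status() {
    limit=$(ulimit -c)
    pattern_file=/proc/sys/kernel/core_pattern
    if [ -r "$pattern_file" ]; then
        pattern=$(cat "$pattern_file")
    else
        pattern="(unavailable)"
    fi
    echo "core size limit: $limit"
    echo "core pattern: $pattern"
    if [ "$limit" = "0" ]; then
        echo "core dumps are disabled in this shell; run: ulimit -c unlimited"
    fi
}

# Typical session (paths are placeholders):
#   ulimit -c unlimited
#   /path/to/failing-program        # dies with "Illegal instruction"
#   gdb -batch -ex 'x/i $pc' -ex 'bt' /path/to/failing-program ./core
```

The `x/i $pc` command asks gdb to disassemble the instruction at the faulting program counter, which is the quickest way to see whether it is one of the crypto-extension instructions.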
|
|
|
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1359 Location: Richmond Hill, Canada
|
Posted: Sat Aug 10, 2024 2:02 pm Post subject: |
|
|
brendlefly62,
I just realised my mind was still stuck on this being a boot-sequence problem rather than a single-program problem, so I missed asking the right questions.
Questions: - Did you get the "illegal instruction" by manually executing some program?
- How did you arrive at a state where you can run chroot?
|
|
|
|
brendlefly62 Apprentice
Joined: 19 Dec 2009 Posts: 150
|
Posted: Sat Aug 10, 2024 3:01 pm Post subject: |
|
|
Thanks, pingtoo and pietinger; your replies helped me identify a fix. Evidently, it was in fact a booting sequence problem as pingtoo was thinking - in which something had deactivated some cpu capabilities, as in the "microcode" idea pietinger articulated. I'll edit the OP to include "[solved]"
This morning I realized one more difference between the "chroot" and "initramfs" scenarios described above -- although the files in /boot/ were identical (other than initrd vs initramfs) in both bootfs scenarios, I had written a different (custom-compiled) set of u-boot binaries to the micro-SD card supporting the initramfs.
(I had documented that in this (now updated) Gentoo wiki article -- https://wiki.gentoo.org/wiki/User:Brendlefly62/Rockchip_RK3588S_Rock_5c/Build-Install-U-Boot)
So - today, I built a new uSD card with an ext4 bootfs starting at sector 32768 and flashed it with the idbloader.img and u-boot.itb produced by Armbian. Then I copied all content from the old "initramfs" uSD card's ext4 bootfs into the new one, and voila - it works!
Now after booting with my initramfs -- Code: | # resolve-march-native
-mcpu=cortex-a76.cortex-a55+crc+crypto
# lscpu
...
Model name: Cortex-A55
...
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
Model name: Cortex-A76
...
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
|
Here's the u-boot re-flash procedure -- Code: | # To re-flash u-boot on your boot device with the u-boot image compiled for the
# current build, follow these instructions, based on armbian's platform_install.sh,
# which can be found in u-boot_reflash_resources/usr/lib/u-boot/
# (as root)
cd /path/to/linux-u-boot-edge-rock-5c/
target=<devicename>   # the whole-disk device node, not a partition
# dd's default block size is 512 bytes, so seek= counts 512-byte sectors
dd if=idbloader.img of="${target}" seek=64 conv=notrunc status=none
dd if=u-boot.itb of="${target}" seek=16384 conv=notrunc status=none
|
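To check which boot loader is actually on a given card, the flashed regions can be read back and compared with the image files. This is only a sketch: verify_flashed is a hypothetical helper, and it assumes GNU dd (for status=none) and the same sector offsets as the procedure above. With dd's default 512-byte blocks, seek=64 is byte offset 32768 (32 KiB) and seek=16384 is byte offset 8388608 (8 MiB), both below the first partition at sector 32768 (16 MiB).

```shell
#!/bin/sh
# verify_flashed IMAGE TARGET SECTOR
# Succeeds (exit 0) if the bytes stored on TARGET starting at 512-byte
# sector SECTOR are identical to the contents of IMAGE.
verify_flashed() {
    image=$1
    target=$2
    sector=$3
    size=$(wc -c < "$image")
    dd if="$target" bs=512 skip="$sector" status=none \
        | head -c "$size" \
        | cmp -s - "$image"
}

# Usage (as root; the device name is a placeholder, as above):
#   verify_flashed idbloader.img "${target}" 64    && echo "idbloader matches"
#   verify_flashed u-boot.itb    "${target}" 16384 && echo "u-boot.itb matches"
```

Running this against both the old and the new uSD card would have shown directly that the two cards carried different u-boot binaries even though the files in /boot/ were identical.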
Here is the layout for this system -- Code: | rock5c6403 ~ # fdisk -l /dev/mmcblk1
Disk /dev/mmcblk1: 14.88 GiB, 15978201088 bytes, 31207424 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 8C7EAFAD-3436-4F35-8009-91B450FDC1B2
Device Start End Sectors Size Type
/dev/mmcblk1p1 32768 31205375 31172608 14.9G Linux filesystem
rock5c6403 ~ # lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 1 14.6G 0 disk
`-sda1 8:1 1 14.6G 0 part
mmcblk1 179:0 0 14.9G 0 disk
`-mmcblk1p1 179:1 0 14.9G 0 part
zram0 252:0 0 0B 0 disk
nvme0n1 259:0 0 931.5G 0 disk
|-nvme0n1p1 259:1 0 485M 0 part
`-nvme0n1p2 259:2 0 931G 0 part
`-ev012 253:0 0 931G 0 crypt
|-vg_nvmerock5c6403-swap 253:1 0 8G 0 lvm [SWAP]
|-vg_nvmerock5c6403-root 253:2 0 10G 0 lvm /
|-vg_nvmerock5c6403-usr 253:3 0 35G 0 lvm /usr
|-vg_nvmerock5c6403-var 253:4 0 100G 0 lvm /var
|-vg_nvmerock5c6403-tmp 253:5 0 50G 0 lvm /tmp
|-vg_nvmerock5c6403-opt 253:6 0 4G 0 lvm /opt
|-vg_nvmerock5c6403-home 253:7 0 200G 0 lvm /home
|-vg_nvmerock5c6403-srv 253:8 0 500G 0 lvm /srv
`-vg_nvmerock5c6403-extra 253:9 0 24G 0 lvm
|
/dev/sda1 provides a key, used by the initramfs to decrypt /dev/nvme0n1p2. /dev/nvme0n1p1 is not currently used - my intent was to use that as the bootfs, but the rock 5c (with no SPI module and no eMMC) does not seem to include the pcie nvme device in its list of possible places to find a boot loader - so I use the SD card for bootfs, since I seem to need it for u-boot anyway...
Oh, and to answer your last questions, pingtoo - the "illegal instruction" was occurring when running the initramfs-booted system and trying to execute something like wget <url> or getuto (which evidently were built to rely on the cpu's crypto support; for https and gpg, I presume). And I got to the chroot by booting the rock 5c board from a uSD card image built with an Armbian kernel and initrd, and then "manually" using cryptsetup luksOpen to unlock the nvme partition and lvm (vgscan/vgchange) to activate and mount the lv's that constitute my gentoo rootfs. In other words, by "manually" doing what the initramfs does...
And, yes, I had verified that both scenarios were using the same device tree file - rk3588s-rock-5c.dtb
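The "manual" route into the chroot described above might look roughly like the sketch below. The device, mapping and volume group names are taken from the lsblk listing; the mount point /mnt/gentoo and the helper names are assumptions, and how the LUKS key is supplied is left out. By default the script only prints the commands; set DRY_RUN=0 to actually run them as root.

```shell
#!/bin/sh
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

unlock_and_chroot() {
    luks_dev=${1:-/dev/nvme0n1p2}     # LUKS partition (from lsblk)
    map_name=${2:-ev012}              # device-mapper name (from lsblk)
    vg=${3:-vg_nvmerock5c6403}        # volume group (from lsblk)
    mnt=${4:-/mnt/gentoo}             # hypothetical mount point

    run cryptsetup luksOpen "$luks_dev" "$map_name"
    run vgscan
    run vgchange -ay "$vg"
    run mount "/dev/mapper/$vg-root" "$mnt"
    for lv in usr var tmp opt home srv; do
        run mount "/dev/mapper/$vg-$lv" "$mnt/$lv"
    done
    run mount -t proc /proc "$mnt/proc"
    run mount --rbind /sys "$mnt/sys"
    run mount --rbind /dev "$mnt/dev"
    run chroot "$mnt" /bin/bash
}

# Usage: unlock_and_chroot            # prints the command sequence
```

A chroot keeps running on the host's already-booted kernel, so /proc/cpuinfo inside it reflects whatever the Armbian boot chain set up, not what the Gentoo initramfs boot would see.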
Last edited by brendlefly62 on Sat Aug 10, 2024 3:55 pm; edited 2 times in total |
|
|
|
pingtoo Veteran
Joined: 10 Sep 2021 Posts: 1359 Location: Richmond Hill, Canada
|
Posted: Sat Aug 10, 2024 3:34 pm Post subject: |
|
|
brendlefly62,
Thank you for your update.
I had forgotten about u-boot. It does play a role in CPU initialization, so it is possible to end up with fewer CPU features depending on what u-boot does before the Linux kernel starts. It just never occurred to me that this could be why the initrd and the chroot had different CPU feature lists, because by the time the initrd starts, the kernel has already done its work of setting up the CPU (based on the device tree). I would imagine that the initrd should have the same device tree as when the rootfs comes up. |
|