Gentoo Forums
[SOLVED] Trouble to get nvidia-driver working after updating
kgdrenefort
Apprentice


Joined: 19 Sep 2023
Posts: 295
Location: Somewhere in the 77

Posted: Mon Sep 16, 2024 11:14 am    Post subject: [SOLVED] Trouble to get nvidia-driver working after updating

Hello,

TL;DR: Zen on the unofficial Discord server assured me that this was a bug; see my replies below.

For months I have been facing a problem on two desktops. Each time x11-drivers/nvidia-drivers is updated, the driver fails to work after the next reboot. If I remember correctly, I have to run this command:

Code:
emerge --ask @module-rebuild


Then, reboot and it works.

I can still use the GUI and my desktop, but at some point you realize things are wrong:

- Games fail to launch
- Desktop effects can be slow
- Some software built with graphics acceleration seems to suffer. For example Firefox: I had to install firefox-bin until I realized the problem was probably linked to the driver. Firefox just displays what is behind the window and isn't usable, even after resizing. As you can see here.
- Plasma tries to tell me something: each time the problem appears on these desktops, the loading icon spins really fast, which is not the case when everything is fine.
- Plasma says I use llvmpipe as the graphics processor.
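
The last symptom can also be confirmed from a terminal. A minimal sketch, assuming glxinfo is installed (I believe it comes from x11-apps/mesa-progs on Gentoo); the helper name renderer_kind is made up for illustration. It reads `glxinfo -B` output on stdin and reports whether the OpenGL renderer is one of Mesa's software rasterizers:

```shell
#!/bin/sh
# Hypothetical helper: classify the OpenGL renderer from `glxinfo -B` output read on stdin.
renderer_kind() {
    # Keep only the "OpenGL renderer string: ..." line.
    line=$(grep -i 'OpenGL renderer string' | head -n 1)
    case "$line" in
        '')                     echo "unknown" ;;
        *llvmpipe*|*softpipe*)  echo "software" ;;
        *)                      echo "hardware" ;;
    esac
}

# Usage inside a running graphical session:
#   glxinfo -B | renderer_kind
```

If this prints "software", applications are rendering on the CPU, which would match the slow effects and the games failing to launch.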

I tried to follow the instructions to make the rebuild automatic, but I must have missed something because, as said above, it doesn't work.

Example on the main desktop (let's keep it simple), which uses a 3060 Ti with these settings:

Code:
Portage 3.0.65 (python 3.12.6-final-0, default/linux/amd64/23.0/hardened/systemd, gcc-13, glibc-2.39-r6, 6.6.47-gentoo-dist-hardened x86_64)
=================================================================
System uname: Linux-6.6.47-gentoo-dist-hardened-x86_64-AMD_Ryzen_5_2600_Six-Core_Processor-with-glibc2.39
KiB Mem:    32785820 total,   2185836 free
KiB Swap:   16777212 total,  16148732 free
Timestamp of repository gentoo: Mon, 16 Sep 2024 05:30:00 +0000
Head commit of repository gentoo: b8ad1e048bc0e7d8aeca6135f604df39d5d45c12
Timestamp of repository steam-overlay: Sun, 08 Sep 2024 18:36:43 +0000
Head commit of repository steam-overlay: 9e11573f22a5ab039769afea81b31ccd89af454e

sh bash 5.2_p26-r6
ld GNU ld (Gentoo 2.42 p3) 2.42.0
app-misc/pax-utils:        1.3.7::gentoo
app-shells/bash:           5.2_p26-r6::gentoo
dev-build/autoconf:        2.13-r8::gentoo, 2.71-r7::gentoo
dev-build/automake:        1.16.5-r2::gentoo
dev-build/cmake:           3.30.2::gentoo
dev-build/libtool:         2.4.7-r4::gentoo
dev-build/make:            4.4.1-r1::gentoo
dev-build/meson:           1.5.1::gentoo
dev-java/java-config:      2.3.4::gentoo
dev-lang/perl:             5.40.0::gentoo
dev-lang/python:           3.10.15::gentoo, 3.12.6::gentoo
dev-lang/rust-bin:         1.80.1::gentoo
sys-apps/baselayout:       2.15::gentoo
sys-apps/sandbox:          2.39::gentoo
sys-apps/systemd:          255.7-r1::gentoo
sys-devel/binutils:        2.42-r1::gentoo
sys-devel/binutils-config: 5.5::gentoo
sys-devel/clang:           18.1.8::gentoo
sys-devel/gcc:             13.3.1_p20240614::gentoo
sys-devel/gcc-config:      2.11::gentoo
sys-devel/lld:             18.1.8::gentoo
sys-devel/llvm:            18.1.8-r1::gentoo
sys-kernel/linux-headers:  6.6-r1::gentoo (virtual/os-headers)
sys-libs/glibc:            2.39-r6::gentoo
Repositories:

gentoo
    location: /var/db/repos/gentoo
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000
    volatile: False
    sync-rsync-extra-opts:
    sync-rsync-verify-metamanifest: yes
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-max-age: 3

steam-overlay
    location: /var/db/repos/steam-overlay
    sync-type: git
    sync-uri: https://github.com/gentoo-mirror/steam-overlay.git
    masters: gentoo
    volatile: False

Binary Repositories:

gentoobinhost
    priority: 1
    sync-uri: https://distfiles.gentoo.org/releases/amd64/binpackages/23.0/x86-64_hardened

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="@FREE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php8.2/ext-active/ /etc/php/cgi-php8.2/ext-active/ /etc/php/cli-php8.2/ext-active/ /etc/php/fpm-php8.2/ext-active/ /etc/php/phpdbg-php8.2/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/var/cache/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GDK_PIXBUF_MODULE_FILE GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR XDG_STATE_HOME"
FCFLAGS="-march=native -O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs binpkg-multi-instance buildpkg-live config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync merge-wait multilib-strict network-sandbox news parallel-fetch pid-sandbox pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-march=native -O2 -pipe"
GENTOO_MIRRORS="https://mirrors.ircam.fr/pub/gentoo-distfiles/     https://gentoo.mirrors.ovh.net/gentoo-distfiles/     https://mirrors.soeasyto.com/distfiles.gentoo.org/"
LANG="fr_FR.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs"
LEX="flex"
MAKEOPTS="-j4"
PKGDIR="/var/cache/binpkgs"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="X a52 aac acl acpi alsa amd64 bluetooth branding bzip2 cairo cdda cdr cet clamav colord crypt css cups curl cxx dbus dist-kernel dri dts dvd dvdr encode exif fbcon ffmpeg flac fltk gdbm gif gles2 gpm gstreamer gtk gui hardened hddtemp iconv icu ipv6 jack jpeg kf6compat lcms libnotify libtirpc lm-sensors lto lua mad man matroska mng modules modules-compress modules-sign mp3 mp4 mpeg mplayer multilib ncurses networkmanager nls ogg opengl openmp pam pango pcre pdf pic pie png policykit posix ppds profile pulseaudio qt5 qt6 readline samba sasl scanner sdl seccomp sound spell ssl ssp startup-notification svg symlink systemd test-rust tiff truetype udev udisks uefi unicode upower usb vcd vim-syntax vorbis vulkan wayland wxwidgets x264 xattr xcb xft xml xtpax xv xvid zlib" ABI_X86="64" ADA_TARGET="gcc_12" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_anon authn_dbm authn_file authz_dbm authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir env expires ext_filter file_cache filter headers include info log_config logio mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2 aes avx avx2 f16c fma3 pclmul popcnt rdrand sha sse3 sse4_1 sse4_2 sse4a ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 ntrip navcom oceanserver oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 tsip tripmate tnt ublox" GRUB_PLATFORMS="efi-64" GUILE_SINGLE_TARGET="3-0" GUILE_TARGETS="3-0" INPUT_DEVICES="libinput" KERNEL="linux" L10N="fr en" LCD_DEVICES="bayrad cfontz glk hd44780 lb216 lcdm001 mtxorb text" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php8-2" POSTGRES_TARGETS="postgres15" 
PYTHON_SINGLE_TARGET="python3_12" PYTHON_TARGETS="python3_12" RUBY_TARGETS="ruby31 ruby32" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipp2p iface geoip fuzzy condition tarpit sysrq proto logmark ipmark dhcpmac delude chaos account"
Unset:  ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CC, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXX, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LFLAGS, LIBTOOL, LINGUAS, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PYTHONPATH, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS


I use the latest stable dist-kernel:

Code:
Linux Mephistopheles 6.6.47-gentoo-dist-hardened #1 SMP PREEMPT_DYNAMIC Tue Sep 10 12:55:05 CEST 2024 x86_64 AMD Ryzen 5 2600 Six-Core Processor AuthenticAMD GNU/Linux


Code:
# eselect kernel list
Available kernel symlink targets:
  [1]   linux-6.6.47-gentoo-dist-hardened *


For VIDEO_CARDS I have:

Code:
VIDEO_CARDS="nvidia"


For x11-drivers/nvidia-drivers I have this set of USE flags:

equery u "nvidia-drivers":
Code:

[ Legend : U - final flag setting for installation]
[        : I - package is installed with flag     ]
[ Colors : set, unset                             ]
 * Found these USE flags for x11-drivers/nvidia-drivers-550.107.02-r1:
 U I
 + + X                : Add support for X11
 + + abi_x86_32       : 32-bit (x86) libraries
 + + dist-kernel      : Enable subslot rebuilds on Distribution Kernel upgrades
 - - kernel-open      : Use the open source variant of the drivers (Turing/Ampere+ GPUs only, aka GTX 1650+ --
                        recommended with >=560.xx drivers if usable)
 + + modules          : Build the kernel modules
 + + modules-compress : Install compressed kernel modules (if kernel config enables module compression)
 + + modules-sign     : Cryptographically sign installed kernel modules (requires CONFIG_MODULE_SIG=y in the kernel)
 - - persistenced     : Install the persistence daemon for keeping devices state when unused (e.g. for headless)
 - - powerd           : Install the NVIDIA dynamic boost support daemon (only useful with specific laptops, ignore if
                        unsure)
 + + static-libs      : Install the XNVCtrl static library for accessing sensors and other features
 + + strip            : Allow symbol stripping to be performed by the ebuild for special files
 + + tools            : Install additional tools such as nvidia-settings
 + + wayland          : Enable dev-libs/wayland backend


With these settings in package.use:

Code:
# cat /etc/portage/package.use/x11-drivers/nvidia-drivers
x11-drivers/nvidia-drivers X dist-kernel tools


Current version of the x11-drivers/nvidia-drivers package:
Code:
[IP-] [  ] x11-drivers/nvidia-drivers-550.107.02-r1:0/550


I also added the dist-kernel and dbus USE flags to the global USE variable. As far as I understood (which seems to be wrong), that was enough to get the module rebuilt against the kernel each time it is necessary.

If I grep for nvidia in dmesg I get:
Code:

[    3.347094] Loading firmware: nvidia/ga104/acr/ucode_ahesasc.bin
[    3.347211] Loading firmware: nvidia/ga104/acr/ucode_asb.bin
[    3.347276] Loading firmware: nvidia/ga104/acr/ucode_unload.bin
[    3.347379] Loading firmware: nvidia/ga104/gr/NET_img.bin
[    3.347543] Loading firmware: nvidia/ga104/gr/fecs_bl.bin
[    3.347572] Loading firmware: nvidia/ga104/gr/fecs_sig.bin
[    3.347605] Loading firmware: nvidia/ga104/gr/gpccs_bl.bin
[    3.347632] Loading firmware: nvidia/ga104/gr/gpccs_sig.bin
[    3.347678] Loading firmware: nvidia/ga104/sec2/sig.bin
[    3.347711] Loading firmware: nvidia/ga104/sec2/image.bin
[    3.347817] Loading firmware: nvidia/ga104/sec2/desc.bin
[    3.347850] Loading firmware: nvidia/ga104/sec2/hs_bl_sig.bin
[    3.348077] Loading firmware: nvidia/ga104/nvdec/scrubber.bin
[    9.432418] nvidia: loading out-of-tree module taints kernel.
[    9.433328] nvidia: module license 'NVIDIA' taints kernel.
[    9.434469] nvidia: module license taints kernel.
(… then the lines below keep repeating:)
[19392.451740] nvidia-nvlink: Unregistered Nvlink Core, major device number 234
[19394.160559] nvidia-nvlink: Nvlink Core is being initialized, major device number 234
[19394.231736] nvidia-nvlink: Unregistered Nvlink Core, major device number 234
[19395.769877] nvidia-nvlink: Nvlink Core is being initialized, major device number 234
[19395.828465] nvidia-nvlink: Unregistered Nvlink Core, major device number 234
[19397.368033] nvidia-nvlink: Nvlink Core is being initialized, major device number 234
[19397.431732] nvidia-nvlink: Unregistered Nvlink Core, major device number 234


But, somehow, I do not get the API Mismatch error:

Code:
[    0.000000] APIC: Static calls initialized
[    0.000000] ACPI: APIC 0x00000000DCBBBD00 00015E (v03 ALASKA A M I    01072009 AMI  00010013)
[    0.000000] ACPI: Reserving APIC table memory at [mem 0xdcbbbd00-0xdcbbbe5d]
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
[    0.000000] IOAPIC[0]: apic_id 13, version 33, address 0xfec00000, GSI 0-23
[    0.000000] IOAPIC[1]: apic_id 14, version 33, address 0xfec01000, GSI 24-55
[    0.000000] APIC: Switch to symmetric I/O mode setup
[    0.000000] APIC: Switched APIC routing to: physical flat
[    0.207718] ACPI: Using IOAPIC for interrupt routing
[    0.238255] pps_core: LinuxPPS API ver. 1 registered
[    8.071171] fuse: init (API version 7.39)
[    8.660727] RAPL PMU: API unit is 2^-32 Joules, 1 fixed counters, 163840 ms ovfl timer


Checking whether any nvidia module is loaded, and trying to load it if not:
Code:
Mephistopheles ~ # lsmod | grep nvidia
Mephistopheles ~ # modprobe nvidia
modprobe: ERROR: could not insert 'nvidia': No such device


So I guess it starts to smell fishy?
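
One way to see why modprobe answers "No such device" is to check which kernel driver currently owns the GPU. A minimal sketch, assuming lspci from sys-apps/pciutils is available; gpu_driver is a hypothetical helper name. It reads `lspci -k` output on stdin and prints the "Kernel driver in use" value of the first VGA or 3D controller:

```shell
#!/bin/sh
# Hypothetical helper: print the kernel driver bound to the first GPU
# found in `lspci -k` output read on stdin.
gpu_driver() {
    awk '
        # Device header lines start in column 1; detail lines are indented.
        /^[^ \t]/ { in_gpu = ($0 ~ /VGA compatible controller|3D controller/) }
        in_gpu && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[ \t]*/, "")
            print
            exit
        }
    '
}

# Usage:
#   lspci -k | gpu_driver
```

If the output is a driver other than nvidia, that driver has claimed the card and the nvidia module cannot bind to it.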

Current profile:
Code:
  default/linux/amd64/23.0/hardened/systemd

(Yes, I do not use a desktop/plasma profile on a… desktop running Plasma. I added the missing USE flags on top of this current profile. It works, but I was advised to go another way, such as creating my own profile. I don't think it's related, but I prefer to mention it.)

This problem, IIRC, appears with or without a kernel update. The scenario I am 100% sure of is this morning's: only nvidia-drivers needed updating.

After reading the nvidia-drivers wiki page from top to bottom and bottom to top, I'm not quite sure which direction I should go.

My guesses are these:
- I messed up the kernel module signing; I can't remember setting it up while installing.
- Maybe my GRUB configuration isn't regenerated properly and I boot a kernel without the new module? Until, somehow, rebuilding the module makes it work (but I guess this is wrong).

Heck, even the page about the distribution kernel says this:
Code:
If using out-of-source kernel modules like x11-drivers/nvidia-drivers or sys-fs/zfs, add USE="dist-kernel" to /etc/portage/make.conf for automatic rebuilds


Which is set here…

What am I missing to make it 100% automatic, so that I don't have to rebuild things myself and reboot?

Regards,
GASPARD DE RENEFORT Kévin
_________________
Wiki translation, to participate.
Custom logos/biz card/website.


Last edited by kgdrenefort on Mon Sep 16, 2024 3:04 pm; edited 3 times in total
kgdrenefort
Apprentice


Joined: 19 Sep 2023
Posts: 295
Location: Somewhere in the 77

Posted: Mon Sep 16, 2024 12:50 pm

Sadly, rebuilding the modules does not seem to have been enough.

I think I simply forgot how I fixed this on my system :(.

I will try to figure out how it was solved before and report here as well (that would surely help me if I still have no fix in the coming weeks).
kgdrenefort
Apprentice


Joined: 19 Sep 2023
Posts: 295
Location: Somewhere in the 77

Posted: Mon Sep 16, 2024 2:18 pm

For now I lack the details and can't be more precise; I will add them later today because I have to leave the computer, but quickly:

I needed to create the following directory:

Code:
mkdir /etc/dracut.conf.d


Inside, I created a nouveau.conf file.

Inside this file, I added: 

Code:
omit_drivers+=" nouveau "


(Something I might have forgotten to mention: yes, the nvidia module was not listed by lsmod, while nouveau was!)
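
The two steps above can be sketched as a small script. This is only a sketch under assumptions: the helper name write_nouveau_omit and its optional directory argument are made up for illustration, and the quoting with surrounding spaces follows the form that dracut.conf(5) recommends for += assignments.

```shell
#!/bin/sh
# Hypothetical helper: create a dracut drop-in that keeps nouveau out of the initramfs.
# $1 is the config directory (defaults to /etc/dracut.conf.d, as in the post).
write_nouveau_omit() {
    conf_dir=${1:-/etc/dracut.conf.d}
    mkdir -p "$conf_dir" || return 1
    # The spaces inside the quotes keep the value separated from other += entries.
    printf '%s\n' 'omit_drivers+=" nouveau "' > "$conf_dir/nouveau.conf"
}

# Usage (as root):
#   write_nouveau_omit
```

Using mkdir -p means the script does not fail if the directory already exists.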

Then I reconfigured the kernel package (no rebuild necessary), so that the initramfs gets regenerated:

Code:
emerge --config sys-kernel/gentoo-kernel


And after a reboot, it works like a charm.
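
To confirm the result after the reboot, the lsmod check from the first post can be repeated for both drivers. A minimal sketch; module_loaded is a hypothetical helper that reads lsmod output on stdin:

```shell
#!/bin/sh
# Hypothetical helper: succeed if module $1 appears in `lsmod` output read on stdin.
module_loaded() {
    # lsmod prints the module name in the first column; match it exactly.
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Usage after the reboot:
#   lsmod | module_loaded nvidia  && echo "nvidia loaded"
#   lsmod | module_loaded nouveau || echo "nouveau not loaded"
```

Matching the first column exactly avoids false positives from names such as nvidia_drm, which a plain grep for nvidia would also catch.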

I will try to provide more details on it today.

Regards,
GASPARD DE RENEFORT Kévin
AndrewAmmerlaan
Developer


Joined: 25 Jun 2014
Posts: 376
Location: Nijmegen

Posted: Mon Sep 16, 2024 2:39 pm

Reported upstream as: https://github.com/dracut-ng/dracut-ng/issues/674
_________________
OS: Gentoo 6.8.10-gentoo-dist, ~amd64, 23.0/desktop/plasma/systemd
MB: MSI Z370-A PRO
CPU: Intel Core i9-9900KS
GPU: Intel Arc A770 16GB & Intel UHD Graphics 630
SSD: Samsung 970 EVO Plus 2 TB
RAM: Crucial Ballistix 32GB DDR4-2400
Ionen
Developer


Joined: 06 Dec 2018
Posts: 2801

Posted: Mon Sep 16, 2024 6:22 pm

A workaround was also added to nvidia-drivers. It is a slightly different solution, but it should also prevent nouveau from loading the next time nvidia-drivers is emerged and the initramfs is rebuilt afterwards.
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=dacc7d5a54fa

Glad to hear it's resolved for you anyhow. I was worried something worse was going on when it said "No such device", but it was just nouveau, which I tend not to suspect much lately due to the default blacklisting. :)
All times are GMT