Gentoo Forums
Nvidia Datacenter Driver
SkunkMyrddyn
n00b


Joined: 25 Dec 2024
Posts: 5

Posted: Wed Dec 25, 2024 4:04 am    Post subject: Nvidia Datacenter Driver

I'm adding an Nvidia Tesla A2 card to my server to support CUDA / TensorFlow / other AI and compute node acceleration. (The card does not have video out connections.)

I am having a difficult time installing the correct driver for the system. The general "nvidia-drivers" package 1) requires X (this is a headless server), and 2) does not list this card as supported (if I'm reading the documentation correctly).

Does anyone know how to get the correct driver(s) installed so that PyTorch can recognize the Nvidia compute nodes for acceleration?

tiffany
n00b


Joined: 04 May 2008
Posts: 11

Posted: Wed Dec 25, 2024 9:57 am

Nvidia's site has a separate section for datacenter drivers. Have you seen them?

I see that they support RHEL, Debian and others.

SkunkMyrddyn
n00b


Joined: 25 Dec 2024
Posts: 5

Posted: Wed Dec 25, 2024 4:38 pm

I checked those out and wasn't sure how to convince Gentoo to handle one of the other packaging formats. So I grabbed the tarballs they have, which contain an nvidia-installer binary, but I can't get that to run either.

I found that it has a --no-x-check option that bypasses the check for whether X (of some kind) is installed.
However, the installer errors out saying it cannot figure out my initramfs. That makes sense, as I am not using an initramfs at all on this system, but I don't see an option to tell the installer to skip that step.

I feel like I'm missing something basic.

Banana
Moderator


Joined: 21 May 2004
Posts: 1802
Location: Germany

Posted: Thu Dec 26, 2024 9:12 am

I'm not an expert in this, but there is an nvidia-cuda-toolkit package available: https://packages.gentoo.org/packages/dev-util/nvidia-cuda-toolkit
Maybe this can help.

Also, what happens if you install https://packages.gentoo.org/packages/x11-drivers/nvidia-drivers with -X set as a USE flag?

Last edited by Banana on Thu Dec 26, 2024 9:39 pm; edited 1 time in total

SkunkMyrddyn
n00b


Joined: 25 Dec 2024
Posts: 5

Posted: Thu Dec 26, 2024 10:11 am

The nvidia-cuda-toolkit package doesn't install a driver, so PyTorch does not find any CUDA devices.

Setting -X as a USE flag blocks x11-drivers/nvidia-drivers from installing.

Hu
Administrator


Joined: 06 Mar 2007
Posts: 22876

Posted: Thu Dec 26, 2024 12:02 pm

SkunkMyrddyn wrote:
Setting -X as a USE flag blocks x11-drivers/nvidia-drivers from installing.
Please show the output that led to this statement. I do not see that result here:
Code:
# USE=-X emerge -pv nvidia-drivers

These are the packages that would be merged, in order:

Calculating dependencies... done!
Dependency resolution took 2.59 s (backtrack: 0/20).

...
[ebuild  N     ] x11-drivers/nvidia-drivers-550.135:0/550::gentoo  USE="modules strip tools -X -dist-kernel -kernel-open -modules-compress -modules-sign -persistenced -powerd -static-libs -wayland" ABI_X86="(64) -32" 314787 KiB

Ionen
Developer


Joined: 06 Dec 2018
Posts: 2885

Posted: Thu Dec 26, 2024 5:00 pm

For nvidia-drivers on a headless setup, usually you'll want USE="persistenced -X -static-libs -wayland -tools" on it (and enable persistenced w/ systemd or openrc; this prevents the card from getting uninitialized when there isn't a display constantly using it).

wrt USE=-tools, that's for nvidia-settings which is a GUI application, so likely don't want that either. It does have some command line usage but is very limited without X given it uses it to talk to the card (imagine nvidia plans to migrate its feature to rely on NVML in the future).

As for USE=-static-libs, that's for libXNVCtrl.a which requires xorg headers at build time. Library is not useful if not using X. If another package depends on nvidia-drivers having static-libs enabled, may want to try USE=-video_cards_nvidia on that package, the feature won't be useful headless.

Should let you avoid about all X/wayland stuff, albeit I wouldn't overly stress about these even if unused, it's pretty small dependencies as long as don't start pulling the bigger GUI toolkits.
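
A minimal sketch of making those flags persistent rather than passing USE= on the command line. The file name /etc/portage/package.use/nvidia-drivers is only an assumption (any file under that directory should work), and the service name nvidia-persistenced in the comments is likewise an assumption to verify against what the package actually installs. The helper function is hypothetical and only prints the line, so nothing is changed until you redirect it yourself.

```shell
# Hypothetical target file; adjust to your own package.use layout.
USE_FILE="${USE_FILE:-/etc/portage/package.use/nvidia-drivers}"

# Print the package.use line for a headless nvidia-drivers install,
# using the flag set suggested above.
nvidia_use_line() {
    printf '%s %s\n' "x11-drivers/nvidia-drivers" \
        "persistenced -X -static-libs -wayland -tools"
}

# As root, something along these lines (not run by this sketch):
#   nvidia_use_line >> "$USE_FILE"
#   emerge --ask x11-drivers/nvidia-drivers
# Then enable the persistence daemon (service name is an assumption):
#   rc-update add nvidia-persistenced default     # OpenRC
#   systemctl enable --now nvidia-persistenced    # systemd
```

Keeping the flags in package.use means later world updates do not silently re-enable tools or static-libs.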

SkunkMyrddyn
n00b


Joined: 25 Dec 2024
Posts: 5

Posted: Thu Dec 26, 2024 6:19 pm

Hu wrote:
SkunkMyrddyn wrote:
Setting -X as a USE flag blocks x11-drivers/nvidia-drivers from installing.
Please show the output that led to this statement. I do not see that result here:
Code:
# USE=-X emerge -pv nvidia-drivers

These are the packages that would be merged, in order:

Calculating dependencies... done!
Dependency resolution took 2.59 s (backtrack: 0/20).

...
[ebuild  N     ] x11-drivers/nvidia-drivers-550.135:0/550::gentoo  USE="modules strip tools -X -dist-kernel -kernel-open -modules-compress -modules-sign -persistenced -powerd -static-libs -wayland" ABI_X86="(64) -32" 314787 KiB


USE=-X emerge -pv nvidia-drivers

These are the packages that would be merged, in order:

Calculating dependencies... done!
Dependency resolution took 8.09 s (backtrack: 0/20).

[ebuild N ] x11-themes/hicolor-icon-theme-0.17::gentoo 0 KiB
[ebuild N ] x11-libs/libXv-1.0.13::gentoo USE="-doc" ABI_X86="(64) -32 (-x32)" 275 KiB
[ebuild N ] x11-libs/libXcomposite-0.4.6::gentoo USE="-doc" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] x11-libs/libXcursor-1.2.3::gentoo USE="-doc" ABI_X86="(64) -32 (-x32)" 286 KiB
[ebuild N ] x11-libs/libXdamage-1.1.6::gentoo ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] dev-libs/jansson-2.14-r2:0/4::gentoo USE="-doc -static-libs" 0 KiB
[ebuild N ] dev-util/gdbus-codegen-2.82.4::gentoo PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" 0 KiB
[ebuild N ] dev-lang/vala-0.56.17:0.56::gentoo USE="-test -valadoc" 0 KiB
[ebuild N ] virtual/linux-sources-3-r8::gentoo USE="-firmware" 0 KiB
[ebuild N ] x11-libs/gdk-pixbuf-2.42.12:2::gentoo USE="gif introspection jpeg -gtk-doc -test -tiff" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] sys-apps/dbus-1.15.8::gentoo USE="-X -debug -doc -elogind (-selinux) -static-libs -systemd -test -valgrind" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] dev-libs/fribidi-1.0.13::gentoo USE="-doc -test" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] x11-libs/libvdpau-1.5::gentoo USE="-doc -dri -test" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] media-libs/libepoxy-1.5.10-r3::gentoo USE="X -test" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild R ] x11-libs/cairo-1.18.2-r1::gentoo USE="X* glib (-aqua) (-debug) -gtk-doc -test" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] x11-libs/pango-1.52.2::gentoo USE="introspection -X -debug -sysprof -test" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] app-accessibility/at-spi2-core-2.52.0:2::gentoo USE="introspection -X -dbus-broker -gtk-doc -systemd -test" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] dev-util/gtk-update-icon-cache-3.24.42::gentoo 0 KiB
[ebuild N ] gnome-base/librsvg-2.58.5:2::gentoo USE="introspection vala -debug -gtk-doc" ABI_X86="(64) -32 (-x32)" 6246 KiB
[ebuild N ] x11-libs/gtk+-3.24.42-r1:3::gentoo USE="X introspection (-aqua) -broadway -cloudproviders -colord -cups -examples -gtk-doc -sysprof -test -vim-syntax -wayland -xinerama" ABI_X86="(64) -32 (-x32)" 0 KiB
[ebuild N ] x11-themes/adwaita-icon-theme-legacy-46.2::gentoo 0 KiB
[ebuild N ] x11-themes/adwaita-icon-theme-46.2::gentoo USE="-branding" 0 KiB
[ebuild N ] dev-util/vulkan-headers-1.3.296.0::gentoo 0 KiB
[ebuild N ] dev-util/pahole-1.27-r1::gentoo USE="-debug -verify-sig" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" 0 KiB
[ebuild N ] x11-drivers/nvidia-drivers-565.77:0/565::gentoo USE="modules static-libs strip tools -X -dist-kernel -kernel-open -modules-compress -modules-sign -persistenced -powerd -wayland" ABI_X86="(64) -32" 347766 KiB

Total: 25 packages (24 new, 1 reinstall), Size of downloads: 354572 KiB

The following USE changes are necessary to proceed:
(see "package.use" in the portage(5) man page for more details)
# required by x11-drivers/nvidia-drivers-565.77::gentoo[tools]
# required by nvidia-drivers (argument)
>=x11-libs/gtk+-3.24.42-r1 X
# required by x11-libs/gtk+-3.24.42-r1::gentoo
# required by x11-themes/adwaita-icon-theme-legacy-46.2::gentoo
# required by x11-themes/adwaita-icon-theme-46.2::gentoo
>=media-libs/libepoxy-1.5.10-r3 X
# required by x11-libs/gtk+-3.24.42-r1::gentoo
# required by x11-themes/adwaita-icon-theme-legacy-46.2::gentoo
# required by x11-themes/adwaita-icon-theme-46.2::gentoo
>=x11-libs/cairo-1.18.2-r1 X

emerge: there are no ebuilds built with USE flags to satisfy "x11-libs/gtk+:3[X]".
!!! One of the following packages is required to complete your request:
- x11-libs/gtk+-3.24.41-r1::gentoo (Change USE: +X)
(dependency required by "x11-drivers/nvidia-drivers-565.77::gentoo[tools]" [ebuild])
(dependency required by "nvidia-drivers" [argument])
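
In output like this, the "required by" lines carry the useful hint: a [flag] suffix on the requiring package names the USE flag that pulled the dependency in (here, [tools] on nvidia-drivers is what asks for gtk+ with X). A small sketch for picking those lines out of a saved log; the sample text is copied from the output above, and the function name is hypothetical.

```shell
# Sample lines copied from the emerge output in this post.
emerge_output='# required by x11-drivers/nvidia-drivers-565.77::gentoo[tools]
# required by nvidia-drivers (argument)
>=x11-libs/gtk+-3.24.42-r1 X'

# Print each "required by" atom that ends in a [flag] suffix,
# i.e. the package/flag pair responsible for the dependency.
flagged_requirers() {
    printf '%s\n' "$1" \
        | grep -E '^# required by .*\[[^]]+\]$' \
        | sed 's/^# required by //'
}

flagged_requirers "$emerge_output"
```

Disabling the flag named in the brackets on that package is usually the first thing to try before enabling X elsewhere.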

[Administrator edit: unchecked Disable BBCode in this post so that OP's quote tags work. -Hu]

SkunkMyrddyn
n00b


Joined: 25 Dec 2024
Posts: 5

Posted: Thu Dec 26, 2024 6:21 pm

Ionen wrote:
For nvidia-drivers on a headless setup, usually you'll want USE="persistenced -X -static-libs -wayland -tools" on it (and enable persistenced w/ systemd or openrc; this prevents the card from getting uninitialized when there isn't a display constantly using it).

wrt USE=-tools, that's for nvidia-settings which is a GUI application, so likely don't want that either. It does have some command line usage but is very limited without X given it uses it to talk to the card (imagine nvidia plans to migrate its feature to rely on NVML in the future).

As for USE=-static-libs, that's for libXNVCtrl.a which requires xorg headers at build time. Library is not useful if not using X. If another package depends on nvidia-drivers having static-libs enabled, may want to try USE=-video_cards_nvidia on that package, the feature won't be useful headless.

Should let you avoid about all X/wayland stuff, albeit I wouldn't overly stress about these even if unused, it's pretty small dependencies as long as don't start pulling the bigger GUI toolkits.


USE="persistenced -X -static-libs -wayland -tools" emerge -pv nvidia-drivers

These are the packages that would be merged, in order:

Calculating dependencies... done!
Dependency resolution took 2.86 s (backtrack: 0/20).

[ebuild N ] acct-user/nvpd-0-r2::gentoo 0 KiB
[ebuild N ] dev-util/pahole-1.27-r1::gentoo USE="-debug -verify-sig" PYTHON_SINGLE_TARGET="python3_12 -python3_10 -python3_11 -python3_13" 0 KiB
[ebuild N ] virtual/linux-sources-3-r8::gentoo USE="-firmware" 0 KiB
[ebuild N ] x11-drivers/nvidia-drivers-565.77:0/565::gentoo USE="modules persistenced strip -X -dist-kernel -kernel-open -modules-compress -modules-sign -powerd -static-libs -tools -wayland" ABI_X86="(64) -32" 347766 KiB

Total: 4 packages (4 new), Size of downloads: 347766 KiB

Looks like that flag set allows it to build. I'm running the install now and will see whether PyTorch detects the card.
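
A rough sketch of that check once the build finishes, assuming nvidia-smi ends up on the PATH from the driver package and that PyTorch is installed for the default python3 (neither is confirmed in this thread). The function name is hypothetical; it only reports what it finds and changes nothing.

```shell
# Report whether the driver and (if present) PyTorch can see a GPU.
check_gpu() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # -L lists the GPUs the loaded driver has enumerated.
        nvidia-smi -L || echo "nvidia-smi failed; is the kernel module loaded?"
    else
        echo "nvidia-smi not found; is nvidia-drivers installed?"
    fi

    if command -v python3 >/dev/null 2>&1; then
        # torch.cuda.is_available() is True only if PyTorch can use the driver.
        python3 -c 'import torch; print("torch.cuda.is_available():", torch.cuda.is_available())' 2>/dev/null \
            || echo "PyTorch is not importable from python3"
    else
        echo "python3 not found"
    fi
}

check_gpu
```

If nvidia-smi lists the card but PyTorch still reports False, the mismatch is more likely in the PyTorch build or CUDA runtime than in the driver.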

[Administrator edit: unchecked Disable BBCode in this post so that OP's quote tags work. -Hu]

All times are GMT