Gentoo Forums
Cannot get ROCM to detect GPU
happysmash27
Apprentice


Joined: 28 Mar 2016
Posts: 220

Posted: Sat Dec 18, 2021 12:56 am    Post subject: Cannot get ROCM to detect GPU

As AMDGPU-Pro-OpenCL seems to be buggy and inconsistent, and even caused a kernel panic recently, I have decided to install ROCm instead.

However, after installing it, clinfo does not seem to detect the graphics card:

Code:
 # clinfo
Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.0 AMD-APP.dbg (3305.0)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback
  Platform Extensions function suffix             AMD

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 0

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  AMD Accelerated Parallel Processing
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   No devices found in platform [AMD Accelerated Parallel Processing?]
  clCreateContext(NULL, ...) [default]            No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  No devices found in platform

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.3.0
  ICD loader Profile                              OpenCL 3.0


And rocminfo says that /dev/kfd is not found:

Code:
 # rocminfo
ROCk module is loaded
Unable to open /dev/kfd read-write: No such file or directory
happysmash27 is member of video group


So, what is /dev/kfd and how do I get it? I grepped the kernel ".config" and nothing is mentioned under "KFD" or "kfd". All the options recommended for OpenCL on the wiki (https://wiki.gentoo.org/wiki/OpenCL) appear to already be enabled. dmesg|grep "kfd" and dmesg|grep "KFD" show nothing. So what could it be?

Edit: I am on an AMD Radeon RX 480.
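
A note for readers with the same rocminfo message: as far as I know, /dev/kfd is created by the amdkfd part of the amdgpu kernel driver, and the kernel option for it is named CONFIG_HSA_AMD rather than anything containing "KFD", which would explain an empty grep. A minimal sketch of a check follows. The function name kfd_config_status is made up for this example, and the .config path in the comments is only the usual Gentoo default.

```shell
#!/bin/sh
# Sketch: report whether a kernel .config enables CONFIG_HSA_AMD, the option
# that (to my knowledge) builds amdkfd, which provides /dev/kfd.
# kfd_config_status is a hypothetical helper, not part of ROCm or Gentoo.

kfd_config_status() {
    # $1: path to a plain-text kernel .config (not a compressed config.gz)
    config_file=$1
    if [ ! -r "$config_file" ]; then
        echo "unreadable"
        return 2
    fi
    if grep -q '^CONFIG_HSA_AMD=y$' "$config_file"; then
        echo "enabled"
    else
        echo "not enabled"
        return 1
    fi
}

# Example use on a live system (path assumed, adjust as needed):
#   kfd_config_status /usr/src/linux/.config
#   ls -l /dev/kfd
#   dmesg | grep -i kfd    # -i covers both "kfd" and "KFD" in one go
```

Even with the option enabled, the device node may still be missing if the driver decides not to expose the GPU for compute, so this only rules out one possible cause.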
Mistwolf
Apprentice


Joined: 07 Mar 2007
Posts: 189
Location: Edmonton, AB

Posted: Sat Dec 18, 2021 2:13 am

As per the ROCm documentation, your video card is not supported.

https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation_new.html#confirm-you-have-a-rocm-capable-gpu
happysmash27
Apprentice


Joined: 28 Mar 2016
Posts: 220

Posted: Sat Dec 18, 2021 4:37 am

The wiki claims:

Quote:
The newest OpenCL implementation from AMD is ROCm, Radeon Open Compute, which supports GFX8 and newer GPU chips (Fiji, Polaris, Vega).


While the ROCm Github page says:

Quote:
The following list of GPUs are enabled in the ROCm software, though full support is not guaranteed:

    • GFX8 GPUs
    • "Polaris 11" chips, such as on the AMD Radeon RX 570 and Radeon Pro WX 4100
    • "Polaris 12" chips, such as on the AMD Radeon RX 550 and Radeon RX 540

    • GFX7 GPUs
    • "Hawaii" chips, such as the AMD Radeon R9 390X and FirePro W9100




Since the RX 480 is newer than the R9 390X, this would suggest that the RX 480 would be supported. Was support removed in the newer versions?

However, this:

Quote:
As described in the next section, GFX8 GPUs require PCI Express 3.0 (PCIe 3.0) with support for PCIe atomics. This requires both CPU and motherboard support. GFX9 GPUs require PCIe 3.0 with support for PCIe atomics by default, but they can operate in most cases without this capability.


Along with the supported CPUs list leads me to believe that ROCm would not work with my motherboard anyway, which is a shame. So I have decided to try to get Mesa's OpenCL working instead, which is also broken, but for other reasons.

Perhaps the Gentoo wiki should be updated to more accurately reflect the current ROCm support state.
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 54821
Location: 56N 3W

Posted: Sat Dec 18, 2021 10:16 am

happysmash27,

Most of the Wiki is user contributed documentation. Feel free to contribute.

If you don't want to edit the page you linked, contribute on the discussion page.
The wiki will email other contributors to join in.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
Goverp
Advocate


Joined: 07 Mar 2007
Posts: 2202

Posted: Sat Dec 18, 2021 11:02 am

happysmash27,

Is your graphics card in the right slot? On my motherboard, only the top slot supports kfd. See here.
_________________
Greybeard
happysmash27
Apprentice


Joined: 28 Mar 2016
Posts: 220

Posted: Sun Dec 19, 2021 7:47 am

NeddySeagoon wrote:
happysmash27,

Most of the Wiki is user contributed documentation. Feel free to contribute.

If you don't want to edit the page you linked, contribute on the discussion page.
The wiki will email other contributors to join in.


Indeed, I very well may do that. Thank you for the discussion page suggestion – that is easier than editing it.

Goverp wrote:
happysmash27,

Is your graphics card in the right slot? On my motherboard, only the top slot supports kfd. See here.


My motherboard only has one PCI-E x16 slot so to use another one would not be possible.

Edit: Looking at the issue, it appears to be one I already looked at before posting this one. In addition to this, looking for any dmesg logs with "kfd" in them yields nothing:

Code:
 % dmesg|grep "kfd"
 % dmesg|grep "KFD"
paraw
Apprentice


Joined: 07 Jan 2005
Posts: 169
Location: Stara Zagora (BG)

Posted: Sun Dec 19, 2021 8:26 am

I had a similar problem recently (see https://forums.gentoo.org/viewtopic-t-1146191.html).
Do your CPU and motherboard support PCIe atomics? If not, you can't get ROCm to work.
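
One way to look for PCIe atomics support is the AtomicOpsCap field that lspci -vvv (run as root) prints in the DevCap2 section of a PCIe device. A minimal sketch that reads such output and reports whether 64-bit atomics are advertised follows. The helper name has_pcie_atomics and the bus address 01:00.0 are made up for illustration; the real address comes from plain lspci.

```shell
#!/bin/sh
# Sketch: decide from `lspci -vvv` text whether a PCIe device advertises
# 64-bit AtomicOps. has_pcie_atomics is a hypothetical helper name.

has_pcie_atomics() {
    # Reads lspci -vvv output for one device on stdin.
    # Succeeds only if an AtomicOpsCap line lists "64bit+".
    grep 'AtomicOpsCap:' | grep -q '64bit+'
}

# Example use as root (01:00.0 is an assumed address, check `lspci` first):
#   if lspci -vvv -s 01:00.0 | has_pcie_atomics; then
#       echo "64-bit atomics advertised"
#   else
#       echo "no 64-bit atomics advertised"
#   fi
```

This only inspects one device. The slot's root port and the CPU also have to cooperate, so a positive result for the card alone does not prove the whole path supports atomics.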
happysmash27
Apprentice


Joined: 28 Mar 2016
Posts: 220

Posted: Sun Dec 19, 2021 8:50 am

Given that my motherboard is from around 2012 and my CPUs are from around 2009, probably not.

Hence, making a new thread on my other issue when I try to use Mesa instead: https://forums.gentoo.org/viewtopic-t-1146321.html?sid=
paraw
Apprentice


Joined: 07 Jan 2005
Posts: 169
Location: Stara Zagora (BG)

Posted: Sun Dec 19, 2021 6:16 pm

I'm doing quite fine with the proprietary driver, actually. However, I wonder, how were you installing it? The point is that if you were using the ebuild in portage, then that's quite an outdated driver. I changed the ebuild to install the latest one. If you wish, you may try and see if it gives you any problems. Also, the version installed by my ebuild is the very last one that will support the Polaris10 chip, i.e. the RX480 card, so you may want to save the tarball somewhere in your home directory, in case you need to re-emerge it. Anyway, the ebuild follows. You will need to put it in a local overlay, of course.

Code:
# Copyright 1999-2020 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2

EAPI=7

MULTILIB_COMPAT=( abi_x86_{32,64} )

inherit unpacker multilib-minimal

SUPER_PN='amdgpu-pro'
MY_PV=$(ver_rs 2 '-')

DESCRIPTION="Proprietary OpenCL implementation for AMD GPUs"
HOMEPAGE="https://www.amd.com/en/support/kb/release-notes/rn-amdgpu-unified-linux-21-30"
SRC_URI="${SUPER_PN}-${MY_PV}-ubuntu-20.04.tar.xz"

LICENSE="AMD-GPU-PRO-EULA"
SLOT="0"
KEYWORDS="~amd64 ~x86"

RESTRICT="bindist mirror fetch strip"

BDEPEND="dev-util/patchelf"
COMMON=">=virtual/opencl-3"
DEPEND="${COMMON}"
RDEPEND="${COMMON}
        !media-libs/mesa[opencl]" # Bug #686790

QA_PREBUILT="/opt/amdgpu/lib*/*"

S="${WORKDIR}/${SUPER_PN}-${MY_PV}-ubuntu-20.04"

pkg_nofetch() {
        local pkgver=$(ver_cut 1-2)
        einfo "Please download Radeon Software for Linux version ${pkgver} for Ubuntu 20.04.3 from"
        einfo "    ${HOMEPAGE}"
        einfo "The archive should then be placed into your distfiles directory."
}

src_unpack() {
        default

        local ids_ver="1.0.0"
        local patchlevel=$(ver_cut 3)
        cd "${S}" || die
        unpack_deb "${S}/libdrm-amdgpu-common_${ids_ver}-${patchlevel}_all.deb"
        multilib_parallel_foreach_abi multilib_src_unpack
}

multilib_src_unpack() {
        local libdrm_ver="2.4.106"
        local patchlevel=$(ver_cut 3)
        local deb_abi
        [[ ${ABI} == x86 ]] && deb_abi=i386

        mkdir -p "${BUILD_DIR}" || die
        pushd "${BUILD_DIR}" >/dev/null || die
        unpack_deb "${S}/opencl-orca-amdgpu-pro-icd_${MY_PV}_${deb_abi:-${ABI}}.deb"
        unpack_deb "${S}/libdrm-amdgpu-amdgpu1_${libdrm_ver}-${patchlevel}_${deb_abi:-${ABI}}.deb"
        popd >/dev/null || die
}

multilib_src_install() {
        local dir_abi short_abi
        [[ ${ABI} == x86 ]] && dir_abi=i386-linux-gnu && short_abi=32
        [[ ${ABI} == amd64 ]] && dir_abi=x86_64-linux-gnu && short_abi=64

        into "/opt/amdgpu"
        patchelf --set-rpath '$ORIGIN' "opt/${SUPER_PN}/lib/${dir_abi}"/libamdocl-orca${short_abi}.so || die "Failed to fix library rpath"
        dolib.so "opt/${SUPER_PN}/lib/${dir_abi}"/*
        dolib.so "opt/amdgpu/lib/${dir_abi}"/*

        insinto /etc/OpenCL/vendors
        echo "/opt/amdgpu/$(get_libdir)/libamdocl-orca${short_abi}.so" \
                > "${T}/${SUPER_PN}-${ABI}.icd" || die "Failed to generate ICD file for ABI ${ABI}"
        doins "${T}/${SUPER_PN}-${ABI}.icd"
}

multilib_src_install_all() {
        insinto "/opt/amdgpu"
        doins -r opt/amdgpu/share
}

pkg_postinst() {
        if [[ -z "${REPLACING_VERSIONS}" ]]; then
                ewarn "Please note that using proprietary OpenCL libraries together with the"
                ewarn "Open Source amdgpu stack is not officially supported by AMD. Do not ask them"
                ewarn "for support in case of problems with this package."
                ewarn ""
                ewarn "Furthermore, if you have the whole AMDGPU-Pro stack installed this package"
                ewarn "will almost certainly conflict with it. This might change once AMDGPU-Pro"
                ewarn "has become officially supported by Gentoo."
        fi
}
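
For anyone who has not set up a local overlay before, here is a minimal sketch of the skeleton such a repository needs. The helper make_local_overlay, the path /var/db/repos/localrepo and the repository name localrepo are made up for this example. The dev-libs/amdgpu-pro-opencl directory in the comments is an assumption that the ebuild above is a modified copy of that package, and <version> is a placeholder for the version in the ebuild's file name.

```shell
#!/bin/sh
# Sketch: create the skeleton of a local ebuild repository.
# make_local_overlay and its arguments are hypothetical names.

make_local_overlay() {
    # $1: directory for the repository, $2: repository name
    repo_dir=$1
    repo_name=$2
    mkdir -p "$repo_dir/metadata" "$repo_dir/profiles" || return 1
    # masters = gentoo lets the overlay reuse eclasses from the main tree
    printf 'masters = gentoo\n' > "$repo_dir/metadata/layout.conf" || return 1
    printf '%s\n' "$repo_name" > "$repo_dir/profiles/repo_name" || return 1
}

# Example use as root (all paths and names below are assumptions):
#   make_local_overlay /var/db/repos/localrepo localrepo
#   mkdir -p /var/db/repos/localrepo/dev-libs/amdgpu-pro-opencl
#   cp amdgpu-pro-opencl-<version>.ebuild \
#      /var/db/repos/localrepo/dev-libs/amdgpu-pro-opencl/
#   ebuild /var/db/repos/localrepo/dev-libs/amdgpu-pro-opencl/amdgpu-pro-opencl-<version>.ebuild manifest
```

Portage also needs to be told about the repository, typically with a section in a file under /etc/portage/repos.conf/ that names it and sets its location. Because the ebuild sets RESTRICT="fetch", the tarball has to be in the distfiles directory before the manifest step will succeed.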
All times are GMT