Gentoo Forums
Plasma 6.1 wayland nVidia, let's code something
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Fri Sep 13, 2024 7:57 am    Post subject: Plasma 6.1 wayland nVidia, let's code something

Hello,

I migrated to Plasma 6.1 yesterday and gave Wayland a try, but had to switch back to X11 because nvidia-settings cannot control the fans under Wayland.
On X11, each desktop effect freezes the system for 1 second, launching Spectacle takes 2 seconds, and clicking on the region selection also freezes the system for 2 seconds. Putting a window in full screen is laggy.
My system is quite high end:
- Ryzen 9 5900X
- 64 GB of DDR4 memory
- nVidia RTX 3090-TI OC
- NVMe SSD

I am using the unstable driver 560.

After some digging, I reduced the VM dirty-page cache limits, which improved things marginally.
/etc/sysctl.conf
Code:
# reduce the dirty-page cache limit from 6.4 GB to 3.2 GB
vm.dirty_ratio = 5
# start writing to disk at 1%, i.e. 640 MB
vm.dirty_background_ratio = 1
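
As a quick sanity check of the figures in the comments above, here is a minimal sketch of the arithmetic. It assumes the ratios are applied to the full 64 GB counted as 64000 MB; the kernel actually applies them to available memory, so the real thresholds are somewhat lower. The helper name `dirtyThresholdMB` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: approximate dirty-page threshold in MB for a given
// RAM size (in MB) and a vm.dirty_* ratio (in percent).
// Assumption: the ratio is applied to total RAM. The kernel really uses
// "available" memory, so actual thresholds are lower than these numbers.
constexpr std::uint64_t dirtyThresholdMB(std::uint64_t totalRamMB,
                                         unsigned ratioPercent)
{
    return totalRamMB * ratioPercent / 100;
}

// Compile-time check of the values quoted in the sysctl comments.
static_assert(dirtyThresholdMB(64000, 5) == 3200, "5% of 64 GB is 3.2 GB");
static_assert(dirtyThresholdMB(64000, 1) == 640, "1% of 64 GB is 640 MB");
```

The same helper gives 6400 MB for a ratio of 10, which matches the "from 6.4 GB" starting point mentioned in the comment.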


So I think I have to switch to Wayland; X11 is no longer an option with the proprietary nVidia driver. Plasma 5 was fine on X11, 6.1 is not.

nVidia has been implementing more and more of the NVML API in the driver, so there is a way to control the graphics card from Wayland, because NVML does not rely on the X11 libraries.
https://docs.nvidia.com/deploy/nvml-api/change-log.html#change-log

If someone can help me get started coding this with Qt6, I would appreciate it. There is already a Python interface, but I would prefer to do this in C++. I already took a look at nvidia-settings, but the project is large and I struggle to collect all the necessary .h and .c content to get the set of NVML functions to start with.
https://github.com/NVIDIA/nvidia-settings/blob/main/src/nvml.h

A few years back I wrote an application for Qt5 (now migrated to Qt6) that controls the fans with a hard-coded curve, but it uses the X11 nVidia library, so it does not work on Wayland.
https://pixel.rodrik.ch/NvControl.png

Ridrok


Last edited by Ridrok on Fri Sep 13, 2024 9:34 am; edited 1 time in total
logrusx
Advocate


Joined: 22 Feb 2018
Posts: 2204

Posted: Fri Sep 13, 2024 9:22 am    Post subject: Re: Plasma 6.1 wayland nVidia, let's code something

Ridrok wrote:

On X11, each desktop effect freezes the system for 1 second, launching Spectacle takes 2 seconds, and clicking on the region selection also freezes the system for 2 seconds. Putting a window in full screen is laggy.


Did you check your Xorg logs to see whether it in fact used nVidia and not something like VESA?

Ridrok wrote:
So I think I have to switch to Wayland; X11 is no longer an option with the proprietary nVidia driver.


That's hard to believe. If that was the case, the forums would be flooded already.

Ridrok wrote:
If someone can help me get started coding this with Qt6, I would appreciate it.


I don't want to discourage you, but this is a hard route and I don't think coding with Qt6 will solve your fan problem. I think you were not in fact using the nVidia driver. I have a laptop and can't verify whether nvidia-settings can control fans, but I'm pretty sure it's functional under Wayland, and its interface provides fewer configuration options there because many of them were Xorg-specific.

Please verify you were indeed using the nVidia driver.

Best Regards,
Georgi
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Fri Sep 13, 2024 9:30 am    Post subject: Re: Plasma 6.1 wayland nVidia, let's code something

logrusx wrote:
Please verify you were indeed using the nVidia driver.


It does
Code:
[    28.442] (II) Loading /usr/lib64/xorg/modules/extensions/libglxserver_nvidia.so
[    28.596] (II) Module glxserver_nvidia: vendor="NVIDIA Corporation"
[    28.596]    compiled for 1.6.99.901, module version = 1.0.0
[    28.596]    Module class: X.Org Server Extension
[    28.596] (II) NVIDIA GLX Module  560.35.03  Fri Aug 16 21:27:48 UTC 2024
[    28.597] (II) NVIDIA: The X server supports PRIME Render Offload.
[    28.603] (--) NVIDIA(0): Valid display device(s) on GPU-0 at PCI:10:0:0
[    28.603] (--) NVIDIA(0):     DFP-0
[    28.603] (--) NVIDIA(0):     DFP-1
[    28.603] (--) NVIDIA(0):     DFP-2
[    28.603] (--) NVIDIA(0):     DFP-3
[    28.603] (--) NVIDIA(0):     DFP-4
[    28.603] (--) NVIDIA(0):     DFP-5 (boot)
[    28.603] (--) NVIDIA(0):     DFP-6
[    28.624] (II) NVIDIA(0): NVIDIA GPU NVIDIA GeForce RTX 3090 (GA102-A) at PCI:10:0:0
[    28.624] (II) NVIDIA(0):     (GPU-0)
[    28.624] (--) NVIDIA(0): Memory: 25165824 kBytes
[    28.624] (--) NVIDIA(0): VideoBIOS: 94.02.42.80.9f


I should add that my GPU is a poor one from Zotac. If I don't control the fan better than the card does, the GPU reaches 90°C when I run AI workloads and I lose the display. I then have to stop the PC and let the card cool down.
With a custom curve, I do not go beyond 67/68°C.
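
For readers unfamiliar with the idea, a custom fan curve of the kind mentioned above can be sketched as a linear interpolation between a few (temperature, speed) points. This is only an illustration: the curve values and the names `CurvePoint`, `kCurve` and `fanSpeedFor` are made up, and this is not the curve used in the post.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One point of a hypothetical fan curve: at tempC degrees Celsius,
// run the fan at speedPercent percent.
struct CurvePoint {
    int tempC;
    int speedPercent;
};

// Example curve sorted by temperature (illustrative numbers only).
constexpr std::array<CurvePoint, 4> kCurve{{
    {30, 30}, {50, 45}, {65, 75}, {75, 100},
}};

// Linearly interpolate the fan speed for a given temperature.
// Below the first point or above the last one, clamp to the end values.
inline int fanSpeedFor(int tempC)
{
    if (tempC <= kCurve.front().tempC) return kCurve.front().speedPercent;
    if (tempC >= kCurve.back().tempC) return kCurve.back().speedPercent;

    for (std::size_t i = 1; i < kCurve.size(); ++i) {
        const CurvePoint &lo = kCurve[i - 1];
        const CurvePoint &hi = kCurve[i];
        if (tempC <= hi.tempC) {
            // Integer interpolation between the two surrounding points.
            return lo.speedPercent +
                   (hi.speedPercent - lo.speedPercent) * (tempC - lo.tempC) /
                       (hi.tempC - lo.tempC);
        }
    }
    return kCurve.back().speedPercent; // not reached for a sorted curve
}
```

A control loop would read the GPU temperature periodically, call `fanSpeedFor`, and pass the result to whatever interface sets the fan speed.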
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Fri Sep 13, 2024 12:50 pm

Let's begin, using the nvidia-settings application from GitHub.

In NvCtrlAttributesPrivate.h I found this:
Code:
typedef struct __NvCtrlNvmlAttributes NvCtrlNvmlAttributes;

struct __NvCtrlNvmlAttributes {
    struct {
        void *handle;

        typeof(nvmlInit)                                (*Init);
        typeof(nvmlShutdown)                            (*Shutdown);
        typeof(nvmlDeviceGetHandleByIndex)              (*DeviceGetHandleByIndex);
        typeof(nvmlDeviceGetUUID)                       (*DeviceGetUUID);
        typeof(nvmlDeviceGetCount)                      (*DeviceGetCount);
        typeof(nvmlDeviceGetTemperature)                (*DeviceGetTemperature);
        typeof(nvmlDeviceGetName)                       (*DeviceGetName);
        typeof(nvmlDeviceGetVbiosVersion)               (*DeviceGetVbiosVersion);
        typeof(nvmlDeviceGetMemoryInfo)                 (*DeviceGetMemoryInfo);
        typeof(nvmlDeviceGetMemoryInfo_v2)              (*DeviceGetMemoryInfo_v2);
        typeof(nvmlDeviceGetPciInfo)                    (*DeviceGetPciInfo);
        typeof(nvmlDeviceGetCurrPcieLinkWidth)          (*DeviceGetCurrPcieLinkWidth);
        typeof(nvmlDeviceGetMaxPcieLinkGeneration)      (*DeviceGetMaxPcieLinkGeneration);
        typeof(nvmlDeviceGetMaxPcieLinkWidth)           (*DeviceGetMaxPcieLinkWidth);
        typeof(nvmlDeviceGetVirtualizationMode)         (*DeviceGetVirtualizationMode);
        typeof(nvmlDeviceGetGridLicensableFeatures)     (*DeviceGetGridLicensableFeatures);
        typeof(nvmlDeviceGetGspFirmwareMode)            (*DeviceGetGspFirmwareMode);
        typeof(nvmlDeviceGetUtilizationRates)           (*DeviceGetUtilizationRates);
        typeof(nvmlDeviceGetTemperatureThreshold)       (*DeviceGetTemperatureThreshold);
        typeof(nvmlDeviceGetFanSpeed_v2)                (*DeviceGetFanSpeed_v2);
        typeof(nvmlSystemGetDriverVersion)              (*SystemGetDriverVersion);
        typeof(nvmlDeviceGetEccMode)                    (*DeviceGetEccMode);
        typeof(nvmlDeviceGetDefaultEccMode)             (*DeviceGetDefaultEccMode);
        typeof(nvmlDeviceSetEccMode)                    (*DeviceSetEccMode);
        typeof(nvmlDeviceGetTotalEccErrors)             (*DeviceGetTotalEccErrors);
        typeof(nvmlDeviceClearEccErrorCounts)           (*DeviceClearEccErrorCounts);
        typeof(nvmlDeviceGetMemoryErrorCounter)         (*DeviceGetMemoryErrorCounter);
        typeof(nvmlSystemGetNVMLVersion)                (*SystemGetNVMLVersion);
        typeof(nvmlDeviceGetNumGpuCores)                (*DeviceGetNumGpuCores);
        typeof(nvmlDeviceGetMemoryBusWidth)             (*DeviceGetMemoryBusWidth);
        typeof(nvmlDeviceGetIrqNum)                     (*DeviceGetIrqNum);
        typeof(nvmlDeviceGetPowerSource)                (*DeviceGetPowerSource);
        typeof(nvmlDeviceGetNumFans)                    (*DeviceGetNumFans);
        typeof(nvmlDeviceSetFanSpeed_v2)                (*DeviceSetFanSpeed_v2);
        typeof(nvmlDeviceGetTargetFanSpeed)             (*DeviceGetTargetFanSpeed);
        typeof(nvmlDeviceGetMinMaxFanSpeed)             (*DeviceGetMinMaxFanSpeed);
        typeof(nvmlDeviceSetFanControlPolicy)           (*DeviceSetFanControlPolicy);
        typeof(nvmlDeviceGetFanControlPolicy_v2)        (*DeviceGetFanControlPolicy_v2);
        typeof(nvmlDeviceSetDefaultFanSpeed_v2)         (*DeviceSetDefaultFanSpeed_v2);
        typeof(nvmlDeviceGetPowerUsage)                 (*DeviceGetPowerUsage);
        typeof(nvmlDeviceGetPowerManagementDefaultLimit)     (*DeviceGetPowerManagementDefaultLimit);
        typeof(nvmlDeviceGetPowerManagementLimitConstraints) (*DeviceGetPowerManagementLimitConstraints);

    } lib;

    unsigned int deviceIdx; /* XXX Needed while using NV-CONTROL as fallback */
    unsigned int deviceCount;
    unsigned int sensorCount;
    unsigned int *sensorCountPerGPU;
    unsigned int coolerCount;
    unsigned int *coolerCountPerGPU;
};

In NvCtrlAttributesNvml.c I found this:
Code:
static Bool LoadNvml(NvCtrlNvmlAttributes *nvml)
{
    enum {
        _OPTIONAL,
        _REQUIRED
    };

    nvmlReturn_t ret;

    nvml->lib.handle = dlopen("libnvidia-ml.so.1", RTLD_LAZY);

    if (nvml->lib.handle == NULL) {
        goto fail;
    }

#define STRINGIFY_SYMBOL(_symbol) #_symbol

#define EXPAND_STRING(_symbol) STRINGIFY_SYMBOL(_symbol)

#define GET_SYMBOL(_required, _proc)                                           \
    nvml->lib._proc = dlsym(nvml->lib.handle, "nvml" STRINGIFY_SYMBOL(_proc)); \
    nvml->lib._proc = dlsym(nvml->lib.handle, EXPAND_STRING(nvml ## _proc));   \
    if (nvml->lib._proc == NULL) {                                             \
        if (_required) {                                                       \
            goto fail;                                                         \
        } else {                                                               \
            nvml->lib._proc = (void*) NvmlStubFunction;                        \
        }                                                                      \
    }

    GET_SYMBOL(_REQUIRED, Init);
    GET_SYMBOL(_REQUIRED, Shutdown);
    GET_SYMBOL(_REQUIRED, DeviceGetHandleByIndex);
    GET_SYMBOL(_REQUIRED, DeviceGetUUID);
    GET_SYMBOL(_REQUIRED, DeviceGetCount);
    GET_SYMBOL(_REQUIRED, DeviceGetTemperature);
    GET_SYMBOL(_REQUIRED, DeviceGetName);
    GET_SYMBOL(_REQUIRED, DeviceGetVbiosVersion);
    GET_SYMBOL(_REQUIRED, DeviceGetMemoryInfo);
etc....


I think this is a good starting point for doing a bit of NVML, since it maps the functions to the library API.
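
The same dlopen/dlsym approach can be sketched in C++ without pulling in the nvidia-settings sources. This is a minimal sketch, not the nvidia-settings implementation: the `NvmlLib` struct and the `resolve`, `loadNvml` and `listGpuNames` helpers are hypothetical names, and the NVML types are hand-written stand-ins (assuming `nvmlReturn_t` is int-sized with success equal to 0 and `nvmlDevice_t` is an opaque pointer) so that it compiles without nvml.h. A real project should include nvml.h instead.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>
#include <vector>

// Stand-ins for the NVML types (assumptions, see the text above).
using nvmlReturn_t = int;    // 0 means success
using nvmlDevice_t = void *; // opaque device handle

struct NvmlLib {
    void *handle = nullptr;
    nvmlReturn_t (*Init)() = nullptr;
    nvmlReturn_t (*Shutdown)() = nullptr;
    nvmlReturn_t (*DeviceGetCount)(unsigned int *) = nullptr;
    nvmlReturn_t (*DeviceGetHandleByIndex)(unsigned int, nvmlDevice_t *) = nullptr;
    nvmlReturn_t (*DeviceGetName)(nvmlDevice_t, char *, unsigned int) = nullptr;
};

// Look up one symbol and store it in a typed function pointer.
template <typename Fn>
bool resolve(void *handle, const char *name, Fn &out)
{
    out = reinterpret_cast<Fn>(dlsym(handle, name));
    return out != nullptr;
}

// Open libnvidia-ml and resolve the few entry points used below.
// Returns false (leaving lib empty) if the library or a symbol is missing.
bool loadNvml(NvmlLib &lib)
{
    lib = NvmlLib{};
    void *h = dlopen("libnvidia-ml.so.1", RTLD_LAZY);
    if (h == nullptr) return false;

    NvmlLib tmp;
    tmp.handle = h;
    const bool ok =
        resolve(h, "nvmlInit_v2", tmp.Init) &&
        resolve(h, "nvmlShutdown", tmp.Shutdown) &&
        resolve(h, "nvmlDeviceGetCount_v2", tmp.DeviceGetCount) &&
        resolve(h, "nvmlDeviceGetHandleByIndex_v2", tmp.DeviceGetHandleByIndex) &&
        resolve(h, "nvmlDeviceGetName", tmp.DeviceGetName);
    if (!ok) {
        dlclose(h);
        return false;
    }
    lib = tmp;
    return true;
}

// Return the names of all GPUs NVML reports, or an empty list on any failure.
std::vector<std::string> listGpuNames()
{
    std::vector<std::string> names;
    NvmlLib lib;
    if (!loadNvml(lib)) return names;

    if (lib.Init() == 0) {
        unsigned int count = 0;
        if (lib.DeviceGetCount(&count) == 0) {
            for (unsigned int i = 0; i < count; ++i) {
                nvmlDevice_t dev = nullptr;
                char name[96] = {0};
                if (lib.DeviceGetHandleByIndex(i, &dev) == 0 &&
                    lib.DeviceGetName(dev, name, sizeof(name)) == 0) {
                    names.emplace_back(name);
                }
            }
        }
        lib.Shutdown();
    }
    dlclose(lib.handle);
    return names;
}
```

On a machine without the NVIDIA driver, `loadNvml` simply returns false and `listGpuNames` returns an empty list, so the rest of the application can still start. Older glibc versions need `-ldl` at link time.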
Ionen
Developer


Joined: 06 Dec 2018
Posts: 2800

Posted: Fri Sep 13, 2024 1:53 pm

Honestly surprised that there's nothing for this yet afaik. I did write a tiny fan control daemon myself given I didn't like the default curve, but it (also) uses libXNVCtrl.

fwiw closed-source nvidia-smi (which heavily relies on nvml and dlopens libnvidia-ml) can be used to set a few things and see fan speed without X/wayland, but I don't think it lets you control the fan itself (it'd be horrible to fork nvidia-smi every time you want to probe/set the fan speed to have a proper curve anyway).

I do hope nvidia will eventually improve nvidia-settings and the supporting libraries anyhow (and documentation to use them).


Last edited by Ionen on Fri Sep 13, 2024 2:05 pm; edited 1 time in total
logrusx
Advocate


Joined: 22 Feb 2018
Posts: 2204

Posted: Fri Sep 13, 2024 2:04 pm

I'm still skeptical about poor performance on Xorg/NVIDIA.

What do tools like glxinfo say when run from a terminal under KDE?

I would go through the relevant wiki pages, and also through the Arch wiki, to see if I could fix it. Obviously you need fan control asap.

Another idea is to run a separate Xorg session from where you can set the custom curve with nvidia-settings for the time being.

Best Regards,
Georgi
Ionen
Developer


Joined: 06 Dec 2018
Posts: 2800

Posted: Fri Sep 13, 2024 2:08 pm

logrusx wrote:
I'm still skeptical about poor performance on Xorg/NVIDIA.
But yeah there's that, it works just fine for about everyone including myself. I feel like there may be something else going on if things are noticeably slow (like using llvmpipe rather than nvidia).

glxinfo would be interesting indeed.
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Fri Sep 13, 2024 2:17 pm

Ionen wrote:
glxinfo would be interesting indeed.

Here you go
https://pastebin.com/6Q5BeG9a

To say a bit more about how it behaves: for example, if I watch a YT video and Discord shows a popup, the video freezes for less than a second, but it's noticeable. I never had this under X11 with KDE 4 and then Plasma 5. (Yes, I have been on Gentoo for many years.)

I have already read tons of pages, and most talk about the QML cache. I did not find a way to fix this, and I even checked kwin's main.cpp to see whether the patch that disables the cache was in the source.
logrusx
Advocate


Joined: 22 Feb 2018
Posts: 2204

Posted: Fri Sep 13, 2024 2:39 pm

Ridrok wrote:
Ionen wrote:
glxinfo would be interesting indeed.

Here you go
https://pastebin.com/6Q5BeG9a


Code:
direct rendering: Yes


That much I can tell. Unfortunately, if there's something else, I'm not the guy to spot it.

Ridrok wrote:
To say a bit more about how it behaves: for example, if I watch a YT video and Discord shows a popup, the video freezes for less than a second, but it's noticeable. I never had this under X11 with KDE 4 and then Plasma 5. (Yes, I have been on Gentoo for many years.)

I have already read tons of pages, and most talk about the QML cache. I did not find a way to fix this, and I even checked kwin's main.cpp to see whether the patch that disables the cache was in the source.


Maybe upload the Xorg log for those who can read it. Again, I'm not the guy. I just hope someone will open it, even out of curiosity, and spot something.

Best Regards,
Georgi

p.s. try the idea to set the curve from an Xorg session. It would be nice if you can set it once and log off, if not, you can just abandon the session and start a Wayland one.
p.s.2 I assume you don't have multiple monitors?
p.s.3 this is a long shot but: https://forum.endeavouros.com/t/kde-stutters-whenever-i-open-close-move-resize-a-window/56237
Quote:
found out the problem’s cause. the window decorations i was using (psion). who would’ve guessed.
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Fri Sep 13, 2024 3:17 pm

Here is the Xorg log
https://pastebin.com/57Ras7tG
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Fri Sep 13, 2024 3:22 pm

logrusx wrote:

p.s. try the idea to set the curve from an Xorg session. It would be nice if you can set it once and log off, if not, you can just abandon the session and start a Wayland one.
p.s.2 I assume you don't have multiple monitors?
p.s.3 this is a long shot but: https://forum.endeavouros.com/t/kde-stutters-whenever-i-open-close-move-resize-a-window/56237
Quote:
found out the problem’s cause. the window decorations i was using (psion). who would’ve guessed.

For p.s. 1, I will try this, but I am making progress on the NVML part.
For p.s. 2, I only have one screen, a 30" Dell UP3017.

I do use a window decoration and have to check this. It's aero-cielo, a Vista theme; that may be the cause.
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Fri Sep 13, 2024 8:14 pm

Removing the previous decoration fixed the problem on X11.

I also progressed a bit on NVML: the NVML code is not called yet, but it compiles with Qt6.
I cannot progress much until next week, will keep you posted.
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Sat Sep 14, 2024 9:04 am

I am proud: I managed to make some pure NVML calls directly from a Qt6 program.

This is what the code outputs as debug information:

Code:
Init ret: 0
Count function: 1816591216
Card Count: 1
Card 0 name: NVIDIA GeForce RTX 3090
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Sat Sep 14, 2024 7:54 pm

I did it!

Fan control in Qt without any dependency on X: direct NVML with 2 processes. The main application runs as the user and does all the queries and the display. A small process with chmod u+s (setuid) is launched by the Qt application. The main Qt application then sends commands to that process, which in turn performs the writes to the driver.

I will clean up the code a lot and put a template application on GitHub.
There is nothing preventing overclocking the card this way too. No Coolbits needed.
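
Since the small helper runs with elevated rights, it should accept only a narrow, validated set of commands from the main application. The post does not describe the actual protocol, so the following is a minimal sketch under an assumed one-line text format, `fan <index> <percent>`; the names `FanCommand` and `parseFanCommand` are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical command understood by the privileged helper.
struct FanCommand {
    unsigned int fanIndex;
    unsigned int speedPercent;
};

// Parse a line such as "fan 0 65". Returns std::nullopt for anything
// malformed or out of range, so the setuid helper never acts on
// unexpected input. fanCount is the number of fans the card reports.
std::optional<FanCommand> parseFanCommand(const std::string &line,
                                          unsigned int fanCount)
{
    std::istringstream in(line);
    std::string verb;
    long index = -1;
    long percent = -1;
    if (!(in >> verb >> index >> percent)) return std::nullopt;

    std::string extra;
    if (in >> extra) return std::nullopt; // reject trailing tokens

    if (verb != "fan") return std::nullopt;
    if (index < 0 || index >= static_cast<long>(fanCount)) return std::nullopt;
    if (percent < 0 || percent > 100) return std::nullopt;

    return FanCommand{static_cast<unsigned int>(index),
                      static_cast<unsigned int>(percent)};
}
```

In such a design, the helper would hand the validated values to an NVML setter such as `nvmlDeviceSetFanSpeed_v2`, which appears in the function table quoted earlier in the thread, and would ignore every line that fails to parse.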

Ridrok
Ionen
Developer


Joined: 06 Dec 2018
Posts: 2800

Posted: Sat Sep 14, 2024 8:11 pm

Nice, good to hear it's possible.
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Sat Sep 14, 2024 9:25 pm

Picture
https://pixel.rodrik.ch/Wayland.png
Ridrok
Tux's lil' helper


Joined: 26 Jan 2014
Posts: 103
Location: France

Posted: Mon Sep 16, 2024 9:33 am

Hello,

This is what I came up with:
https://github.com/Neo2003/nVidia-NVML-QT5-6-tools

I hope you find it useful.
All times are GMT