Author |
Message |
mcnutty Tux's lil' helper
Joined: 29 Dec 2009 Posts: 130
|
Posted: Thu Apr 20, 2023 8:33 pm Post subject: Resume from sleep problem (with Nvidia, Xorg, KDE Plasma) |
|
|
Resuming from sleep often results in a broken/unusable system. I've been using an nvidia card with nvidia-drivers, xorg and KDE/Plasma with multiple monitors for a long time. The exact card, kernel, and software versions have varied, but I have had this problem for many years. My previous cards/systems didn't use all that much power when idling so I basically just gave up trying to solve the issue. However, I recently got a new graphics card that uses significantly more power at idle so I thought I'd try again, but I still run into the problem and it prevents me from being able to put the system to sleep, which is now a larger concern.
I would say about 50-75% of the time the system wakes up and everything is fine. However, the other 25-50% of the time there is a significant problem that requires a reboot. Sometimes the system never really wakes up: the computer appears to wake, but the monitors never turn on and the system just hangs. Other times the whole system wakes up and all the monitors turn on. I'm able to unlock the screens and get back in, but the main panel is gone and other parts of the GUI don't work properly; for example, using the scroll wheel on a blank area does not switch virtual desktops like usual.
I've seen threads like these, but none seem to point to a solution that works for me.
https://forums.gentoo.org/viewtopic-t-1155841-highlight-nvidia+suspend.html
https://forums.gentoo.org/viewtopic-t-1133268-highlight-.html
https://bugs.gentoo.org/693384
https://bugs.kde.org/show_bug.cgi?id=356727
Any ideas on where the root problem is, where to look for solutions, or where to file a bug report? |
|
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5084 Location: Bavaria
|
Posted: Thu Apr 20, 2023 8:55 pm Post subject: |
|
|
mcnutty,
with everything I know about the kernel and nvidia, I can only give these suggestions:
a) Don't use nvidia cards, or
b) Don't use suspend, or
c) Wait for new kernel and nvidia driver versions.
Sorry, no joking here |
|
|
|
mcnutty Tux's lil' helper
Joined: 29 Dec 2009 Posts: 130
|
Posted: Thu Apr 20, 2023 9:06 pm Post subject: |
|
|
Yeah, I was afraid of that, but it's not exactly unexpected. Unfortunately, I need the card for machine learning, so nvidia is basically a requirement. I'll continue to stick with (b) for now and hope that pigs will fly someday. |
|
|
|
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2397
|
Posted: Fri Apr 21, 2023 5:42 am Post subject: |
|
|
This is a very generic question.
Please provide details for your system.
For the short while I used the dGPU on my laptop, it worked fine with the 5.15 kernel and the 510 driver. One thing: it needed the nvidia-suspend/resume/hibernate services enabled to do some work before sleep and after wake-up.
I believe systemd is useful here to trigger all that; I don't know how OpenRC handles it.
Best Regards,
Georgi
Last edited by logrusx on Mon Sep 16, 2024 3:02 pm; edited 1 time in total |
|
|
|
rab0171610 Guru
Joined: 24 Dec 2022 Posts: 419
|
Posted: Fri Apr 21, 2023 6:45 am Post subject: |
|
|
I have used KDE practically since its inception. I also have always used Nvidia cards and proprietary drivers. I have used every distro imaginable. I have never found suspend/resume/hibernate to be reliable. There has always been some issue or other. It seemed like I was always having to reboot just to get things back after sleep. These issues were not always limited to Linux but could also appear in Windows. I gave up on it years ago. I resorted to power settings that will turn the monitor off after a period of inactivity and if I leave the system for more than a few hours I just shut the system down. I have grown accustomed to it. In my own experience, sleep has always been more like a bug than a feature. I don't think it is an essential feature to say the least. |
|
|
|
Hu Administrator
Joined: 06 Mar 2007 Posts: 22612
|
Posted: Fri Apr 21, 2023 3:11 pm Post subject: |
|
|
Although this will not help OP since the target system requires the proprietary nVidia drivers, I will note that when last I used a system with an nVidia card, I used the Nouveau driver for it and had no problems with suspend or with hibernate (although, the standard disclaimer applies that the open drivers do not drive the card to its full potential, due to missing documentation from nVidia). nVidia hardware can be made to do this right. The problem is that apparently their proprietary drivers get it wrong. |
|
|
|
mcnutty Tux's lil' helper
Joined: 29 Dec 2009 Posts: 130
|
Posted: Fri Apr 21, 2023 6:55 pm Post subject: |
|
|
Thanks to everyone for the responses and moral support if nothing else.
logrusx wrote: | This is a very generic question.
Please provide details for your system. |
Unfortunately, as other people have pointed out, I don't think the details would matter much. Similar to rab0171610, I've been having this problem since at least kernel version 4 (and probably earlier). In that time I've had a GTX 980, 1060, 1080, and now an RTX 4090. I've had both Intel and AMD CPUs, and I have been using some version of KDE since v3 (which I seem to remember working better, but that could just be nostalgia). I'm pretty sure I've tried at least 4-5 other desktop environments just to see if they work (GNOME, Xfce, Cinnamon, LXDE, LXQt), and despite my preference for KDE I would have switched if power management was reliable on those.
I have always stuck with OpenRC and never tried systemd. I suppose maybe things might work using systemd, but I'd (strongly) prefer to stick with OpenRC.
rab0171610 wrote: | I have grown accustomed to it. In my own experience, sleep has always been more like a bug than a feature. I don't think it is an essential feature to say the least. |
I used to more or less agree. Putting my monitors to sleep was good enough and getting the rest of the system to sleep wasn't worth the pain. With my new setup (I suspect the triple monitors are to blame) the GPU now idles around 40W. Electricity is very expensive where I live, so while it doesn't exactly break the bank, it does start to add up and feels very wasteful to leave the computer on while not using it.
I might consider shutting down completely, but rebooting is a bit painful. I have lots of windows open across many virtual desktops. While the session manager does a decent job of reopening the windows in the correct place it is not perfect and I always have to rearrange some things on a fresh boot. In combination with an extremely slow boot process due to a network drive issue I describe in another thread I try to avoid full reboots as much as possible.
Hu wrote: | nVidia hardware can be made to do this right. The problem is that apparently their proprietary drivers get it wrong. |
Frustrating, but glad to hear it can be done. Gives me at least some hope that this year will finally be the year the proprietary drivers get it right. |
|
|
|
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2397
|
Posted: Sat Apr 22, 2023 11:45 am Post subject: |
|
|
mcnutty wrote: | Thanks to everyone for the responses and moral support if nothing else.
logrusx wrote: | This is a very generic question.
Please provide details for your system. |
Unfortunately, as other people have pointed out I don't think the details would matter much. |
Then how do you expect to find a solution to your problem at all?
Best Regards,
Georgi |
|
|
|
mcnutty Tux's lil' helper
Joined: 29 Dec 2009 Posts: 130
|
Posted: Thu Sep 12, 2024 4:38 pm Post subject: |
|
|
I'm not sure I'm ready to declare complete victory and mark this as solved yet, but upgrading to KDE/Plasma 6 has at least significantly reduced the problem. After about 10-20 sleep/wake cycles, I have yet to hang. |
|
|
|
shazeal Apprentice
Joined: 03 May 2006 Posts: 207 Location: New Zealand
|
Posted: Fri Sep 13, 2024 9:17 pm Post subject: |
|
|
mcnutty wrote: | I'm not sure I'm ready to declare complete victory and mark this as solved yet, but upgrading to KDE/Plasma 6 has at least significantly reduced the problem. After about 10-20 sleep/wake cycles, I have yet to hang. |
I always disable suspend with nvidia-drivers because I have always had issues: either the monitors do not resume, the system panics, or there is corruption or other bugs after resume. That said, I changed my kernel build script recently and forgot to disable suspend, and a fresh install of KDE 6.1 suspended and worked fine afterwards. However, that was with the 560.35.3 driver (555 was the same), which constantly locks up windows/Wayland. I have no idea what voodoo is required to get that driver working correctly, so I have reverted to the 550 driver and X11 and can't test resume further, as it is broken with 550. Honestly, I feel like the nvidia driver has been getting much buggier recently. 525-550 was great, but I think all the new Wayland support they have added recently is very unstable, so even if they fix resume they are creating new issues.
Quote: | Any ideas on where the root problem is, where to look for solutions, or where to file a bug report? |
https://forums.developer.nvidia.com/t/560-release-feedback-discussion/300830
You post on their forums and hope that the one guy they have there reads your problem among all the others. _________________ CFLAGS="-OmgWTFR1CE --fun-lol-loops --march=asmx86go" |
|
|
|
Ionen Developer
Joined: 06 Dec 2018 Posts: 2849
|
Posted: Fri Sep 13, 2024 10:46 pm Post subject: |
|
|
For the record, https://github.com/gentoo/gentoo/pull/38482 is coming soon. That is not to say it will necessarily solve all issues and work for everyone (I know it doesn't), but a few users with several combinations of nvidia+systemd/elogind+wayland/xorg+plasma6 have successfully suspended and resumed back into a system that seems to work as intended, so it should at least not be entirely broken by default.
The PR could potentially make things worse for setups that use neither elogind nor systemd to sleep (though it is easy to set back to =0 if needed); it primarily caters to what Plasma and GNOME want. |
|
|
|
Child_of_Sun_24 l33t
Joined: 28 Jul 2004 Posts: 601
|
Posted: Sat Sep 14, 2024 4:36 pm Post subject: |
|
|
I have Plasma 6 running on Wayland with nvidia-drivers-550.107.02-r1, and I am using systemd as the init system.
This is my nvidia.conf:
http://0x0.st/Xxnm.conf
The important part is "NVreg_PreserveVideoMemoryAllocations=1", and I also have to enable "nvidia-hibernate.service nvidia-resume.service nvidia-suspend.service". With this configuration, suspend and resume are no problem, even with older drivers.
For OpenRC I have seen a shell script that imitates the behavior of "nvidia-hibernate.service nvidia-resume.service nvidia-suspend.service"; it can be found here https://forums.gentoo.org/viewtopic-p-8813895.html?sid=05052e14289f0e63a8413e0cc21f51cb but I haven't tested it because I am using systemd.
I hope this helps a bit. |
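For anyone who wants to try the same setup, here is a minimal sketch of the two pieces described above. It assumes systemd and the unit names from the post; the helper name nvidia_preserve_conf is made up for illustration, and the script only prints what would be configured rather than touching /etc.

```shell
#!/bin/sh
# Sketch only: prints the settings, changes nothing on the system.

# Emit the modprobe.d option line.
# $1: 1 to enable video memory preservation, 0 to disable (defaults to 1).
nvidia_preserve_conf() {
    printf 'options nvidia NVreg_PreserveVideoMemoryAllocations=%s\n' "${1:-1}"
}

# The three units named in the post, as one space-separated list.
NVIDIA_SLEEP_UNITS="nvidia-suspend.service nvidia-hibernate.service nvidia-resume.service"

# Show the line that would go into a file under /etc/modprobe.d/.
nvidia_preserve_conf 1

# As root, the units would then be enabled with:
#   systemctl enable $NVIDIA_SLEEP_UNITS
```

On Gentoo the option already lives in /etc/modprobe.d/nvidia.conf, so editing the existing line there is probably cleaner than adding a second file with the same option. A reboot (or a module reload) is needed before the new value takes effect.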
|
|
|
Ionen Developer
Joined: 06 Dec 2018 Posts: 2849
|
Posted: Sat Sep 14, 2024 8:14 pm Post subject: |
|
|
The PR in the post above yours essentially does that: it enables these services by default, installs an elogind hook, and sets =1.
It's also merged now, i.e. nvidia-drivers-550.107.02-r1 has these changes. However, if you already had =1, use systemd, and enabled the services, it does nothing new for you |
|
|
|
eeckwrk99 Apprentice
Joined: 14 Mar 2021 Posts: 232 Location: Gentoo forums
|
Posted: Mon Sep 16, 2024 11:01 am Post subject: |
|
|
My setup:
- GTX 970
- systemd
- X11
- AwesomeWM
Now that NVreg_PreserveVideoMemoryAllocations=1 is the default, I had to revert the change as it completely breaks suspend/hibernate on my end:
Code: | # sed -i 's/NVreg_PreserveVideoMemoryAllocations=1/NVreg_PreserveVideoMemoryAllocations=0/g' /etc/modprobe.d/nvidia.conf
# systemctl disable nvidia-{hibernate,resume,suspend}.service |
Plus, I have to stick with 6.1 kernel + 535 branch because I've been unable to resume from suspend/hibernate with any newer combo for the past year or so.
What is everyone's experience with PreserveVideoMemoryAllocations=1? Just curious. |
|
|
|
logrusx Advocate
Joined: 22 Feb 2018 Posts: 2397
|
Posted: Mon Sep 16, 2024 3:08 pm Post subject: |
|
|
eeckwrk99 wrote: |
Plus, I have to stick with 6.1 kernel + 535 branch because I've been unable to resume from suspend/hibernate with any newer combo for the past year or so. |
Maybe related:
I've had a similar issue for a similar period of time (maybe, since I keep my laptop mostly plugged in and don't know when it actually started). When the system is put to sleep on AC power and woken up on battery, it fails to resume. I read somewhere that it was caused by a kernel fix (yes, a fix, not a bug) that exposed a firmware bug, so they reverted it. However, I was unable to resume cleanly even with 6.9. I don't remember having tried 6.10, but I should.
Best Regards,
Georgi |
|
|
|
gekk2 n00b
Joined: 18 Sep 2024 Posts: 2
|
Posted: Wed Sep 18, 2024 6:53 am Post subject: |
|
|
Hi, I encountered the same issue with kernel versions 6.6 and later.
After some investigation, I found that the problem is linked to the following kernel option: Mitigations for CPU vulnerabilities > Mitigate RSB underflow with call depth tracking
This option was introduced in version 6.2 and is specifically intended for Intel CPUs based on the Skylake architecture (like mine).
Once I disabled it, the resume from suspend started working correctly with NVIDIA again. |
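To check whether a given kernel has this option enabled, something like the following should work. It is only a sketch: kconfig_state is a hypothetical helper name, CONFIG_CALL_DEPTH_TRACKING is the symbol name as seen in a 6.6 config (newer kernels may use a different name, so a case-insensitive grep for call_depth is a safer first step), and the config paths in the comments depend on how the kernel was built and installed.

```shell
#!/bin/sh
# Report the state of a kernel config symbol from a plain-text config file.
# $1: path to the config file
# $2: symbol name without the CONFIG_ prefix
# Prints the configured value (e.g. y or m), or n if the symbol is unset or absent.
kconfig_state() {
    line=$(grep -E "^CONFIG_$2=" "$1" || true)
    if [ -n "$line" ]; then
        # Strip everything up to and including the first '='.
        echo "${line#*=}"
    else
        echo "n"
    fi
}

# Usage (paths are assumptions about the local setup):
#   kconfig_state /usr/src/linux/.config CALL_DEPTH_TRACKING
#   zcat /proc/config.gz > /tmp/running.config   # needs CONFIG_IKCONFIG_PROC
#   kconfig_state /tmp/running.config CALL_DEPTH_TRACKING
```

If this prints y, the option can be turned off under the menu entry named above and the kernel rebuilt to see whether resume behaves differently.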
|
|
|
eeckwrk99 Apprentice
Joined: 14 Mar 2021 Posts: 232 Location: Gentoo forums
|
Posted: Wed Sep 18, 2024 8:27 am Post subject: |
|
|
gekk2 wrote: | Hi, I encountered the same issue with kernel versions 6.6 and later.
After some investigation, I found that the problem is linked to the following kernel option: Mitigations for CPU vulnerabilities > Mitigate RSB underflow with call depth tracking
This option was introduced in version 6.2 and is specifically intended for Intel CPUs based on the Skylake architecture (like mine).
Once I disabled it, the resume from suspend started working correctly with NVIDIA again. |
Interesting, I have a Haswell Intel CPU and CONFIG_CALL_DEPTH_TRACKING=y in my 6.6 kernel config.
May I ask what your GPU model is and which NVIDIA driver series you're using to be able to resume from suspend with a >= 6.6 kernel? |
|
|
|
42n4 n00b
Joined: 10 Feb 2015 Posts: 22
|
|
|
|
eeckwrk99 Apprentice
Joined: 14 Mar 2021 Posts: 232 Location: Gentoo forums
|
Posted: Wed Sep 18, 2024 12:11 pm Post subject: |
|
|
What's your GPU model? |
|
|
|
42n4 n00b
Joined: 10 Feb 2015 Posts: 22
|
Posted: Wed Sep 18, 2024 1:16 pm Post subject: |
|
|
eeckwrk99 wrote: |
What's your GPU model? |
I have an NVIDIA 4060 and an AMD FURY R9 X. Both have excellent support in Gentoo, but the NVIDIA one only with the binary driver and only with gcc 13 (Hyprland crashes for gcc14/15, CUDA). By the way, in Wayland I use the ICC profile sRGB2014.icc from https://www.color.org/srgbprofiles.xalter _________________ OS: Gentoo 2.15 gcc13/14
Kernel: Linux 6.11.3-zen1
KDE Plasma 6.2.0
WM: NVIDIA 4060/AMD Wayland
http://bit.ly/gen2ls |
|
|
|
eeckwrk99 Apprentice
Joined: 14 Mar 2021 Posts: 232 Location: Gentoo forums
|
Posted: Wed Sep 18, 2024 2:14 pm Post subject: |
|
|
42n4 wrote: | eeckwrk99 wrote: |
What's your GPU model? |
I have NVIDIA 4060 and AMD FURY R9 X. Both have an excellent support in Gentoo, but NVIDIA binary one and for gcc 13 only (Hyprland crashes for gcc14/15, CUDA). By the way in wayland I use icc profile: sRGB2014.icc from https://www.color.org/srgbprofiles.xalter |
I see. The issue I'm having seems to be tied to older cards, see this Arch Linux forums topic (multiple reports from GTX 970 / 980 Ti / 1660 users). |
|
|
|
42n4 n00b
Joined: 10 Feb 2015 Posts: 22
|
Posted: Thu Sep 19, 2024 8:56 am Post subject: |
|
|
eeckwrk99 wrote: | 42n4 wrote: | eeckwrk99 wrote: |
What's your GPU model? |
I have NVIDIA 4060 and AMD FURY R9 X. Both have an excellent support in Gentoo, but NVIDIA binary one and for gcc 13 only (Hyprland crashes for gcc14/15, CUDA). By the way in wayland I use icc profile: sRGB2014.icc from https://www.color.org/srgbprofiles.xalter |
I see. The issue I'm having seems to be tied to older cards, see this Arch Linux forums topic (multiple reports from GTX 970 / 980 Ti / 1660 users). |
You have to experiment with kernel parameters: 'nvidia_drm.modeset=1 nvidia_drm.fbdev=1 nvidia.NVreg_PreserveVideoMemoryAllocations=1'
in:
CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE="nvidia_drm.modeset=1 nvidia_drm.fbdev=1 nvidia.NVreg_PreserveVideoMemoryAllocations=1"
Just compile with all, reboot and test, remove one and compile and so on. I know those parameters are in /etc/modprobe.d/nvidia.conf, but if you use initramfs they can be omitted.
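Before rebuilding the kernel for each combination, it may help to confirm which values the loaded modules actually ended up with. A rough sketch, assuming the driver exposes the nvidia_drm parameters under /sys/module and its NVreg settings in /proc/driver/nvidia/params; the helper names are invented here, and the optional path arguments exist only so the functions can be pointed at test data.

```shell
#!/bin/sh
# Print the live value of an nvidia_drm module parameter.
# $1: parameter name (e.g. modeset or fbdev)
# $2: sysfs module root, defaults to /sys/module
# Prints "unknown" if the module is not loaded or lacks that parameter.
nvidia_drm_param() {
    f="${2:-/sys/module}/nvidia_drm/parameters/$1"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unknown"
    fi
}

# Print the line for PreserveVideoMemoryAllocations from the driver's
# parameter dump, or "unknown" if the file is missing or has no such line.
# $1: path to the dump, defaults to /proc/driver/nvidia/params
nvidia_preserve_setting() {
    grep 'PreserveVideoMemoryAllocations' "${1:-/proc/driver/nvidia/params}" 2>/dev/null \
        || echo "unknown"
}

# Usage on a running system:
#   nvidia_drm_param modeset
#   nvidia_drm_param fbdev
#   nvidia_preserve_setting
```

If a value set in /etc/modprobe.d/nvidia.conf does not show up here, the modules may have been loaded from an initramfs that does not contain that file, which is one reason to put the parameters on the kernel command line instead.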
If you want I can share my kernel config here for zen-sources. Maybe it is not perfect but it is working with many services and programs. It is based on default zen config and some additions e.g. for framebuffer. I added a lot of modules to have one kernel for all computers. I only change CONFIG_MNATIVE_INTEL or CONFIG_MNATIVE_AMD. _________________ OS: Gentoo 2.15 gcc13/14
Kernel: Linux 6.11.3-zen1
KDE Plasma 6.2.0
WM: NVIDIA 4060/AMD Wayland
http://bit.ly/gen2ls |
|
|
|
eeckwrk99 Apprentice
Joined: 14 Mar 2021 Posts: 232 Location: Gentoo forums
|
Posted: Thu Sep 19, 2024 1:44 pm Post subject: |
|
|
42n4 wrote: | You have to experiment with kernel parameters: 'nvidia_drm.modeset=1 nvidia_drm.fbdev=1 nvidia.NVreg_PreserveVideoMemoryAllocations=1'
in:
CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE="nvidia_drm.modeset=1 nvidia_drm.fbdev=1 nvidia.NVreg_PreserveVideoMemoryAllocations=1"
Just compile with all, reboot and test, remove one and compile and so on. I know those parameters are in /etc/modprobe.d/nvidia.conf, but if you use initramfs they can be omitted.
If you want I can share my kernel config here for zen-sources. Maybe it is not perfect but it is working with many services and programs. It is based on default zen config and some additions e.g. for framebuffer. I added a lot of modules to have one kernel for all computers. I only change CONFIG_MNATIVE_INTEL or CONFIG_MNATIVE_AMD. |
Thanks, I'll try different combinations and see how it goes. |
|
|
|
eeckwrk99 Apprentice
Joined: 14 Mar 2021 Posts: 232 Location: Gentoo forums
|
Posted: Sat Sep 21, 2024 1:22 pm Post subject: |
|
|
gekk2 wrote: | Hi, I encountered the same issue with kernel versions 6.6 and later.
After some investigation, I found that the problem is linked to the following kernel option: Mitigations for CPU vulnerabilities > Mitigate RSB underflow with call depth tracking
This option was introduced in version 6.2 and is specifically intended for Intel CPUs based on the Skylake architecture (like mine).
Once I disabled it, the resume from suspend started working correctly with NVIDIA again. |
Disabling CONFIG_CALL_DEPTH_TRACKING (Mitigate RSB underflow with call depth tracking) in my 6.6 kernel config didn't help unfortunately.
Edit: I've just tried again with x11-drivers/nvidia-drivers-550.107.02-r1 instead of 535.183.01-r1 and it seems to actually solve the problem. I'll test for a longer period just to make sure. I also added/enabled some NVIDIA-related things: dracut modules, the nvidia-{hibernate,resume,suspend}.service units, options nvidia-drm modeset=1 and options nvidia-drm fbdev=1, NVreg_PreserveVideoMemoryAllocations=1, etc.
I was able to suspend/resume 3 times in a row, but the 4th attempt resulted in black screen when resuming. With 6.1 kernel, I can suspend/resume consistently, never had any failure (I suspend/resume multiple times a day and have been doing so for months).
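When a resume ends in a black screen like this, the kernel log from the failed session is one place to look for hints. A small sketch of a filter, assuming the driver's kernel messages carry the usual NVRM prefix and that the log survives the reboot (persistent journal or a syslog file); filter_resume_log is a hypothetical name.

```shell
#!/bin/sh
# Keep only log lines that mention the NVIDIA driver or the kernel's
# suspend/resume path. Reads log text on stdin, so the source can be
# dmesg, journalctl, or a saved syslog file.
filter_resume_log() {
    grep -i -E 'NVRM|nvidia|PM: (suspend|resume|hibernat)'
}

# Usage (after rebooting from the failed resume):
#   journalctl -k -b -1 | filter_resume_log   # systemd, previous boot
#   filter_resume_log < /var/log/messages     # syslog path is an assumption
```

The output is also the kind of detail that makes a report on the NVIDIA forums or a Gentoo bug easier to act on.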
42n4 wrote: | You have to experiment with kernel parameters: 'nvidia_drm.modeset=1 nvidia_drm.fbdev=1 nvidia.NVreg_PreserveVideoMemoryAllocations=1'
in:
CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE="nvidia_drm.modeset=1 nvidia_drm.fbdev=1 nvidia.NVreg_PreserveVideoMemoryAllocations=1"
Just compile with all, reboot and test, remove one and compile and so on. I know those parameters are in /etc/modprobe.d/nvidia.conf, but if you use initramfs they can be omitted.
If you want I can share my kernel config here for zen-sources. Maybe it is not perfect but it is working with many services and programs. It is based on default zen config and some additions e.g. for framebuffer. I added a lot of modules to have one kernel for all computers. I only change CONFIG_MNATIVE_INTEL or CONFIG_MNATIVE_AMD. |
I've tried pretty much all combinations today, none of them solved the issue. PreserveVideoMemoryAllocations=1 seems to be the worst option since my system just won't suspend anymore with it. I wish I could use it since it fixes the graphical glitches when resuming from suspend but that's how it is. Maybe it just doesn't play well with older cards.
I'd be fine if I could resume from suspend reliably with >=6.6 kernel and >535 NVIDIA drivers but even with PreserveVideoMemoryAllocations=0, none of the other combinations worked. I keep getting black screen when resuming. I've been dealing with this issue for almost a year now... I guess I'm stuck with 6.1 kernel and 535 drivers until they reach EOL (December 2026 and June 2026 respectively). Then, I'll just buy an AMD GPU.
Last edited by eeckwrk99 on Wed Sep 25, 2024 12:47 pm; edited 1 time in total |
|
|
|
42n4 n00b
Joined: 10 Feb 2015 Posts: 22
|
Posted: Mon Sep 23, 2024 9:55 am Post subject: |
|
|
Maybe with my kernel config there would be a small chance... or just the next disaster _________________ OS: Gentoo 2.15 gcc13/14
Kernel: Linux 6.11.3-zen1
KDE Plasma 6.2.0
WM: NVIDIA 4060/AMD Wayland
http://bit.ly/gen2ls |
|