Gentoo Forums
Random System-Wide Freezes - AMDGPU (Ryzen 3 3200G / Vega)
zedxot
n00b

Joined: 26 Jul 2022
Posts: 64
Location: Dhaka, Bangladesh

Posted: Tue Mar 11, 2025 10:57 pm    Post subject: Random System-Wide Freezes - AMDGPU (Ryzen 3 3200G / Vega)

Hello,

My system is freezing completely, requiring a hard power off and reboot each time. I am seeking advice on how to diagnose and resolve this problem.

System Configuration:
Kernel Version: 6.12.16-gentoo-x86_64 (issue has been present on other recent kernels as well)
Processor (CPU): AMD Ryzen 3 3200G with Radeon Vega Graphics (Picasso APU)
Integrated Graphics (GPU): Radeon Vega Graphics (integrated in Ryzen 3200G)
Motherboard: MSI B450M PRO-M2 V2 (MS-7B84)
BIOS Version: 4.F3 (latest version from MSI)
Memory (RAM): 12GB DDR4 3200MHz (8GB + 4GB)
Window Manager: DWM

Description of the Problem: System-Wide Freezes

My Gentoo system is experiencing frequent and complete system freezes. When a freeze occurs, the entire computer becomes unresponsive – the screen freezes, and I cannot use the keyboard or mouse. The only way to recover is to power off the computer and restart it.

When do the freezes happen? The freezes seem to happen somewhat randomly during normal desktop use. I have not been able to reproduce the freezes with a dedicated GPU stress test.

What happens during a freeze?
  • The screen image freezes completely.
  • The computer becomes completely unresponsive to any input. Keyboard presses and mouse movements have no effect.
  • The only way to regain control is to perform a hard power cycle.


How often do the freezes occur? The freezes are happening multiple times per day, which makes my system unreliable for daily use.

Troubleshooting Steps I Have Already Tried:

So far, I have tried the following kernel parameters:

amdgpu.aspm=0 (Disable PCIe Active State Power Management):
Result: Using amdgpu.aspm=0 might have slightly reduced how often the freezes happen, but the problem is still present.

amdgpu.dpm=0 (Disable AMDGPU Dynamic Power Management):
Result: System will not boot. When I add amdgpu.dpm=0 to my kernel parameters, my system fails to boot properly. It stops booting early in the process, and I have to remove the parameter to boot again.

radeon.dpm=1 and amdgpu.aspm=0 (Use Radeon DPM, disable AMDGPU ASPM):
Result: Freezes still happen. Trying to use the radeon driver's dynamic power management with radeon.dpm=1 did not fix the freezing problem.
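
One thing that may be worth checking here: radeon.dpm is a parameter of the radeon module, while Vega APUs such as the 3200G are driven by amdgpu, so radeon.dpm=1 is unlikely to have any effect on this GPU. A minimal sketch (Python; the function name module_params is made up for illustration) that groups the module.param=value tokens of a kernel command line by module, so it is easy to see which driver each option is addressed to:

```python
from collections import defaultdict


def module_params(cmdline):
    """Group 'module.param=value' tokens of a kernel command line by module.

    Tokens without a 'module.' prefix (root=..., quiet, ...) are ignored.
    Quoted values containing spaces are not handled in this sketch.
    """
    grouped = defaultdict(dict)
    for token in cmdline.split():
        name, eq, value = token.partition("=")
        module, dot, param = name.partition(".")
        if eq and dot and module and param:
            grouped[module][param] = value
    return dict(grouped)
```

On a running system the input would be the contents of /proc/cmdline, for example module_params(open("/proc/cmdline").read()). The values the driver actually applied should also be readable under /sys/module/amdgpu/parameters/, assuming amdgpu is loaded.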

Tests I Have Performed:

  • Memory Test (Memtest86+): I have run Memtest86+ to check my system's RAM. The test completed with no errors.
  • GPU Stress Test: I performed a GPU stress test with stress-ng to see if I could trigger a freeze under heavy GPU load. The freeze is not reproducible with the stress test.
  • Temperature Monitoring: I have checked my CPU and GPU temperatures using system sensors. My system is not overheating before or during the freezes.


Relevant Log Information:

Here are some messages I have noticed in my system logs that might be related:

Boot Log Errors (these messages appear on every boot, regardless of the kernel parameters above):

Code:
Mar 11 00:55:25 Eve kernel: amdgpu 0000:30:00.0: amdgpu: psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
Mar 11 00:55:25 Eve kernel: amdgpu 0000:30:00.0: amdgpu: psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
Mar 11 00:55:25 Eve kernel: amdgpu 0000:30:00.0: amdgpu: Secure display: Generic Failure.
Mar 11 00:55:25 Eve kernel: amdgpu 0000:30:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0


Log Message Seen During a Freeze Event:

Code:
Mar 11 04:05:52 Eve kernel: amdgpu 0000:30:00.0: amdgpu: Dumping IP State
Mar 11 04:04:25 Eve kernel: perf: interrupt took too long (2513 > 2500), lowering kernel.perf_event_max_sample_rate to 79500
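
The two lines above are not in chronological order, so when collecting more freeze logs it may help to filter the amdgpu messages and sort them by time. A rough sketch in Python, assuming the classic syslog prefix shown above ("Mar 11 04:05:52 host kernel: ..."); the function name amdgpu_events is hypothetical, and the year has to be supplied because this log format does not record it:

```python
from datetime import datetime


def amdgpu_events(lines, year):
    """Return (timestamp, message) pairs for amdgpu kernel lines, oldest first."""
    events = []
    for line in lines:
        if "amdgpu" not in line:
            continue
        # The syslog timestamp occupies the first 15 characters of the line.
        stamp, rest = line[:15], line[16:]
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        # Drop the "host kernel: " prefix and keep only the message text.
        message = rest.split("kernel: ", 1)[-1].strip()
        events.append((when, message))
    return sorted(events)
```

The input can be any iterable of lines, for instance an open log file. Which file the kernel messages end up in depends on the syslog setup, so the path is left to the caller.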


Thank you very much for any help and suggestions you can provide. I am ready to try different solutions and provide more information as needed.
_________________
Don't talk to me about old, I walk in eternity!
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6247
Location: Dallas area

Posted: Tue Mar 11, 2025 11:52 pm

It looks like you have secure display enabled. I would turn it off, as I don't believe it is necessary.

DRM_AMD_SECURE_DISPLAY is probably set to Y.

Try it with that disabled (rebuild kernel and modules) and let us know what happens.
If it still doesn't work, wgetpaste your .config file.
_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland
zedxot
n00b

Joined: 26 Jul 2022
Posts: 64
Location: Dhaka, Bangladesh

Posted: Wed Mar 12, 2025 7:50 am

Anon-E-moose wrote:
It looks like you have secure display enabled. I would turn it off, as I don't believe it is necessary.

DRM_AMD_SECURE_DISPLAY is probably set to Y.

Try it with that disabled (rebuild kernel and modules) and let us know what happens.
If it still doesn't work, wgetpaste your .config file.


It seems DRM_AMD_SECURE_DISPLAY is already set to N.

Code:
Symbol: DRM_AMD_SECURE_DISPLAY [=n]
Type  : bool
Defined at drivers/gpu/drm/amd/display/Kconfig:46
  Prompt: Enable secure display support
  Depends on: HAS_IOMEM [=y] && DRM [=m] && DRM_AMDGPU [=m] && DEBUG_FS [=y] && DRM_AMD_DC_FP [=y]
  Location:
    -> Device Drivers
      -> Graphics support
        -> Direct Rendering Manager (XFree86 4.1.0 and higher DRI support) (DRM [=m])
          -> AMD GPU (DRM_AMDGPU [=m])
            -> Display Engine Configuration
(1)           -> Enable secure display support (DRM_AMD_SECURE_DISPLAY [=n])


Here is my .config

http://dpaste.com/A4K9QHJ37
_________________
Don't talk to me about old, I walk in eternity!
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6247
Location: Dallas area

Posted: Wed Mar 12, 2025 9:17 am

You have a lot of options enabled that I'm not sure you need: SELinux options, IMA options, etc.
Multiple display drivers are selected: amdgpu, radeon, and it looks like maybe i915, etc.

What hardware do you have? Please give the laptop model or motherboard model, and any video cards that have been added (not on-board).
_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland
zedxot
n00b

Joined: 26 Jul 2022
Posts: 64
Location: Dhaka, Bangladesh

Posted: Wed Mar 12, 2025 9:48 am

Anon-E-moose wrote:
You have a lot of options enabled that I'm not sure you need: SELinux options, IMA options, etc.
Multiple display drivers are selected: amdgpu, radeon, and it looks like maybe i915, etc.

What hardware do you have? Please give the laptop model or motherboard model, and any video cards that have been added (not on-board).


System Configuration:
Kernel Version: 6.12.16-gentoo-x86_64 (issue has been present on other recent kernels as well)
Processor (CPU): AMD Ryzen 3 3200G with Radeon Vega Graphics (Picasso APU)
Integrated Graphics (GPU): Radeon Vega Graphics (integrated in Ryzen 3200G)
Motherboard: MSI B450M PRO-M2 V2 (MS-7B84)
BIOS Version: 4.F3 (latest version from MSI)
Memory (RAM): 12GB DDR4 3200MHz (8GB + 4GB)
Window Manager: DWM

No, I don't have any external video card.
_________________
Don't talk to me about old, I walk in eternity!
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6247
Location: Dallas area

Posted: Wed Mar 12, 2025 10:30 am

Let's try this: there should be a file /etc/modprobe.d/blacklist.conf.
Edit it, add blacklist entries for radeon, radeonsi and i915, and reboot. Does that change things?
_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland
zedxot
n00b

Joined: 26 Jul 2022
Posts: 64
Location: Dhaka, Bangladesh

Posted: Wed Mar 12, 2025 1:03 pm

Anon-E-moose wrote:
Let's try this: there should be a file /etc/modprobe.d/blacklist.conf.
Edit it, add blacklist entries for radeon, radeonsi and i915, and reboot. Does that change things?


Hmm, I have neither the file /etc/modprobe.d/blacklist.conf nor the directory /etc/modprobe.d/. The only similar directory is /etc/modules-load.d, with lm_sensors.conf in it. Should I create the blacklist file there, or create the modprobe.d directory?
_________________
Don't talk to me about old, I walk in eternity!
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6247
Location: Dallas area

Posted: Wed Mar 12, 2025 2:20 pm

Do you have any of these four directories?
Code:
       /lib/modprobe.d/
       /usr/local/lib/modprobe.d/
       /run/modprobe.d/
       /etc/modprobe.d/


If you have one of them, are any of these files in it: aliases.conf, alsa.conf, blacklist.conf, options.conf?
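
To answer this in one go, here is a small Python sketch that reports which of the four directories exist and which *.conf files each one holds. The function and constant names are invented for illustration; only the directory list comes from the post above.

```python
from pathlib import Path

# The four search locations listed above.
MODPROBE_DIRS = (
    "/lib/modprobe.d",
    "/usr/local/lib/modprobe.d",
    "/run/modprobe.d",
    "/etc/modprobe.d",
)


def list_conf_files(directories=MODPROBE_DIRS):
    """Map each existing directory to the sorted names of its *.conf files."""
    found = {}
    for directory in directories:
        path = Path(directory)
        if path.is_dir():
            found[directory] = sorted(p.name for p in path.glob("*.conf"))
    return found
```

Directories that do not exist are simply left out of the result, so an empty dictionary means none of the four is present.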
_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland
zedxot
n00b

Joined: 26 Jul 2022
Posts: 64
Location: Dhaka, Bangladesh

Posted: Wed Mar 12, 2025 7:26 pm

Anon-E-moose wrote:
Do you have any of these four directories?
Code:
       /lib/modprobe.d/
       /usr/local/lib/modprobe.d/
       /run/modprobe.d/
       /etc/modprobe.d/


If you have one of them, are any of these files in it: aliases.conf, alsa.conf, blacklist.conf, options.conf?


Found /lib/modprobe.d/dist-blacklist.conf

Code:
blacklist radeon
blacklist radeonsi
blacklist i915


As the freezes are random and there is no specific trigger to reproduce them, I'll report back on whether the freeze recurs.
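
Before waiting for the next freeze, it may be worth confirming that the blacklist actually took effect. As far as I know, a blacklist line only stops alias-based autoloading, so it does nothing for a driver built into the kernel (=y) or loaded explicitly. Also, radeonsi is Mesa's userspace driver rather than a kernel module, so that entry is probably a no-op. A minimal Python sketch (function names are hypothetical) that checks /proc/modules-style text for modules that are loaded despite being blacklisted:

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def still_loaded(blacklisted, proc_modules_text):
    """Return the blacklisted module names that are loaded anyway, sorted."""
    return sorted(set(blacklisted) & loaded_modules(proc_modules_text))
```

On the affected machine the text would come from open("/proc/modules").read(); an empty list from still_loaded(["radeon", "radeonsi", "i915"], ...) would suggest the blacklist works. Files under /lib/modprobe.d normally belong to packages, so putting the same lines in /etc/modprobe.d/ may be the more durable place for them.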
_________________
Don't talk to me about old, I walk in eternity!
tomtom69
Apprentice

Joined: 09 Nov 2010
Posts: 256
Location: Bavaria

Posted: Wed Mar 12, 2025 7:48 pm

I might have the same issue with an AMD 3350G.
Can you check whether your system is accessible via ssh?
When the error occurs here, I am able to connect via ssh (even if it is very slow), and I can see a kworker/u32:0+amdgpu-reset-dev process consuming 100% CPU.
I downgraded the kernel to 6.6.74 and upgraded mesa to 25.0.1 because of this bug report:
https://bugs.gentoo.org/947181
At the moment I do not have any freezes, but I have not tested for long.
Maybe downgrading the kernel and upgrading mesa is worth a try.
zedxot
n00b

Joined: 26 Jul 2022
Posts: 64
Location: Dhaka, Bangladesh

Posted: Wed Mar 12, 2025 10:12 pm

tomtom69 wrote:
I might have the same issue with an AMD 3350G.
Can you check whether your system is accessible via ssh?
When the error occurs here, I am able to connect via ssh (even if it is very slow), and I can see a kworker/u32:0+amdgpu-reset-dev process consuming 100% CPU.
I downgraded the kernel to 6.6.74 and upgraded mesa to 25.0.1 because of this bug report:
https://bugs.gentoo.org/947181
At the moment I do not have any freezes, but I have not tested for long.
Maybe downgrading the kernel and upgrading mesa is worth a try.


Our issues might not be the same, because I've been facing this since 6.6.67, and the syslog was showing an amdgpu timeout error on the gfx low ring. After adding the amdgpu.aspm=0 kernel parameter, the freezes were less frequent. I tried several kernels, including 6.6.74 (more frequent freezes), 6.6.30 (less frequent freezes) and 6.1.127 (same as 6.6.30). Finally, 6.12.16 shows a different syslog message: amdgpu: Dumping IP State.

UPDATE: After blacklisting radeon, radeonsi and i915, I haven't experienced any freezes yet. Still testing.
_________________
Don't talk to me about old, I walk in eternity!
Anon-E-moose
Watchman

Joined: 23 May 2008
Posts: 6247
Location: Dallas area

Posted: Wed Mar 12, 2025 10:27 pm

You may need to blacklist other modules. Keep an eye on dmesg output for things not working right.
_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland
pietinger
Moderator

Joined: 17 Oct 2006
Posts: 5489
Location: Bavaria

Posted: Wed Mar 12, 2025 11:45 pm

zedxot wrote:
[...] After blacklisting radeon, radeonsi and i915, I haven't experienced any freezes yet. Still testing.

zedxot,

an alternative to blacklisting modules is to build a kernel in which these modules are not built at all ... ;-)

As @Anon-E-moose already said, you have activated a lot of unnecessary options in your kernel. In addition, there are hundreds of unnecessary modules that you will never need. This makes for an extremely long kernel compilation time. Are you a kernel programmer? If not, you should really deactivate some debug options (e.g. CONFIG_DYNAMIC_DEBUG=y, CONFIG_KGDB=y, CONFIG_DEBUG_VM=y) ... or, best of all, configure a completely new, lean kernel. :D If you want to go this way, these might help you:
https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_kernel_configuration
https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_Configuring_Kernel_Version_6.12
To get you interested, you could start with this chapter:
https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Manual_kernel_configuration#CONFIG_DEBUG_.3F
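
As a quick way to see where a given .config stands on the options named above, a small Python sketch along these lines could be used. The function name is hypothetical. CONFIG_DRM_RADEON and CONFIG_DRM_I915 are not from this thread; they are the Kconfig symbols for the radeon and i915 drivers, mentioned here so the same check can cover the modules that were blacklisted.

```python
def enabled_options(config_text, names):
    """Return the entries of `names` that are set to y or m in .config text."""
    enabled = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key in names and value in ("y", "m"):
            enabled[key] = value
    return enabled
```

The text would typically be read from the kernel source tree's .config (on Gentoo often /usr/src/linux/.config, though that path is an assumption about the local setup). Anything the function returns for the debug options or the unused GPU drivers is a candidate for disabling in the next kernel build.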
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
All times are GMT