zedxot n00b


Joined: 26 Jul 2022 Posts: 64 Location: Dhaka, Bangladesh
Posted: Tue Mar 11, 2025 10:57 pm Post subject: Random System-Wide Freezes - AMDGPU (Ryzen 3 3200G / Vega)
Hello,
My system is freezing completely, requiring a hard power off and reboot each time. I am seeking advice on how to diagnose and resolve this problem.
System Configuration:
Kernel Version: 6.12.16-gentoo-x86_64 (issue has been present on other recent kernels as well)
Processor (CPU): AMD Ryzen 3 3200G with Radeon Vega Graphics (Picasso APU)
Integrated Graphics (GPU): Radeon Vega Graphics (integrated in Ryzen 3200G)
Motherboard: MSI B450M PRO-M2 V2 (MS-7B84)
BIOS Version: 4.F3 (latest version from MSI)
Memory (RAM): 12GB DDR4 3200MHz (8GB + 4GB)
Window Manager: DWM
Description of the Problem: System-Wide Freezes
My Gentoo system is experiencing frequent and complete system freezes. When a freeze occurs, the entire computer becomes unresponsive – the screen freezes, and I cannot use the keyboard or mouse. The only way to recover is to power off the computer and restart it.
When do the freezes happen? The freezes seem to happen somewhat randomly during normal desktop use. I have not been able to reproduce the freezes with a dedicated GPU stress test.
What happens during a freeze?
- The screen image freezes completely.
- The computer becomes completely unresponsive to any input. Keyboard presses and mouse movements have no effect.
- The only way to regain control is to perform a hard power cycle.
How often do the freezes occur? The freezes are happening multiple times per day, which makes my system unreliable for daily use.
Troubleshooting Steps I Have Already Tried:
So far, I have tried the following kernel parameters:
amdgpu.aspm=0 (Disable PCIe Active State Power Management):
Result: Using amdgpu.aspm=0 might have slightly reduced how often the freezes happen, but the problem is still present.
amdgpu.dpm=0 (Disable AMDGPU Dynamic Power Management):
Result: System will not boot. When I add amdgpu.dpm=0 to my kernel parameters, my system fails to boot properly. It stops booting early in the process, and I have to remove the parameter to boot again.
radeon.dpm=1 and amdgpu.aspm=0 (Use Radeon DPM, disable AMDGPU ASPM):
Result: Freezes still happen. Trying to use the radeon driver's dynamic power management with radeon.dpm=1 did not fix the freezing problem.
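A parameter only matters if it actually reached the kernel command line and its module is the one driving the hardware. As far as I know, Vega-class graphics are handled by amdgpu rather than the older radeon module, so a radeon.dpm setting is probably ignored on this APU. The sketch below is one way to list the `module.param=value` tokens the running kernel was booted with. The helper names `module_params` and `read_cmdline` are made up for illustration, and it assumes a Linux system that exposes /proc/cmdline.

```python
from pathlib import Path


def module_params(cmdline: str, module: str) -> dict[str, str]:
    """Return {param: value} for tokens shaped like '<module>.<param>=<value>'."""
    params = {}
    prefix = module + "."
    for token in cmdline.split():
        if token.startswith(prefix) and "=" in token:
            key, value = token[len(prefix):].split("=", 1)
            params[key] = value
    return params


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    return Path(path).read_text().strip()


if __name__ == "__main__" and Path("/proc/cmdline").exists():
    booted_with = read_cmdline()
    for name in ("amdgpu", "radeon"):
        print(name, module_params(booted_with, name))
```

If the amdgpu line comes back empty after editing the bootloader configuration, the parameter never reached the kernel and the test result says nothing about ASPM or DPM.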
Tests I Have Performed:
- Memory Test (Memtest86+): I have run Memtest86+ to check my system's RAM. The test completed with no errors.
- GPU Stress Test: I performed a GPU stress test with stress-ng to see if I could trigger a freeze under heavy GPU load. The freeze is not reproducible with the stress test.
- Temperature Monitoring: I have checked my CPU and GPU temperatures using system sensors. My system is not overheating before or during the freezes.
Relevant Log Information:
Here are some messages I have noticed in my system logs that might be related:
Boot Log Errors (These messages appear every time I boot, even with kernel parameters):
Code: | Mar 11 00:55:25 Eve kernel: amdgpu 0000:30:00.0: amdgpu: psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
Mar 11 00:55:25 Eve kernel: amdgpu 0000:30:00.0: amdgpu: psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
Mar 11 00:55:25 Eve kernel: amdgpu 0000:30:00.0: amdgpu: Secure display: Generic Failure.
Mar 11 00:55:25 Eve kernel: amdgpu 0000:30:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0 |
Log Message Seen During a Freeze Event:
Code: | Mar 11 04:05:52 Eve kernel: amdgpu 0000:30:00.0: amdgpu: Dumping IP State
Mar 11 04:04:25 Eve kernel: perf: interrupt took too long (2513 > 2500), lowering kernel.perf_event_max_sample_rate to 79500 |
Thank you very much for any help and suggestions you can provide. I am ready to try different solutions and provide more information as needed.
_________________
Don't talk to me about old, I walk in eternity!
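The freeze-time excerpt above is pasted out of chronological order (04:05:52 before 04:04:25), which makes it harder to see what led up to the hang. A minimal sketch for pulling only the amdgpu kernel lines out of a syslog dump and sorting them by time follows. It assumes the classic "Mon DD HH:MM:SS host kernel: message" layout seen in this thread, an English month abbreviation, and a known year (syslog lines of this kind carry none). The function name `amdgpu_events` is hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Mar 11 04:05:52 Eve kernel: amdgpu 0000:30:00.0: amdgpu: Dumping IP State
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) kernel: (?P<msg>.*)$"
)


def amdgpu_events(log_text, year=2025):
    """Return (timestamp, message) pairs for amdgpu kernel lines, oldest first."""
    events = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or "amdgpu" not in match["msg"]:
            continue
        stamp = datetime.strptime(f"{year} {match['stamp']}", "%Y %b %d %H:%M:%S")
        events.append((stamp, match["msg"]))
    events.sort(key=lambda event: event[0])
    return events
```

Feeding it the log of a whole session rather than a two-line excerpt would show whether a ring timeout or a reset attempt precedes "Dumping IP State".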
Anon-E-moose Watchman


Joined: 23 May 2008 Posts: 6247 Location: Dallas area
Posted: Tue Mar 11, 2025 11:52 pm
It looks like you have secure display enabled. I would turn it off, as I don't believe it is necessary.
DRM_AMD_SECURE_DISPLAY is probably set to Y.
Try with it disabled (rebuild the kernel and modules) and let us know what happens.
If it still doesn't work, then wgetpaste your .config file.
_________________
UM780, 6.12 zen kernel, gcc 13, openrc, wayland
zedxot n00b


Joined: 26 Jul 2022 Posts: 64 Location: Dhaka, Bangladesh
Posted: Wed Mar 12, 2025 7:50 am
Anon-E-moose wrote: | It looks like you have secure display set, I would turn it off as I don't believe it necessary.
DRM_AMD_SECURE_DISPLAY is probably set to Y
Try it with that disabled (rebuild kernel and modules) and let us know what happened.
If it still doesn't work then wgetpaste your .config file |
Seems like DRM_AMD_SECURE_DISPLAY is set to N.
Code: | Symbol: DRM_AMD_SECURE_DISPLAY [=n] │
│ Type : bool │
│ Defined at drivers/gpu/drm/amd/display/Kconfig:46 │
│ Prompt: Enable secure display support │
│ Depends on: HAS_IOMEM [=y] && DRM [=m] && DRM_AMDGPU [=m] && DEBUG_FS [=y] && DRM_AMD_DC_FP [=y] │
│ Location: │
│ -> Device Drivers │
│ -> Graphics support │
│ -> Direct Rendering Manager (XFree86 4.1.0 and higher DRI support) (DRM [=m]) │
│ -> AMD GPU (DRM_AMDGPU [=m]) │
│ -> Display Engine Configuration │
│ (1) -> Enable secure display support (DRM_AMD_SECURE_DISPLAY [=n]) |
Here is my .config
http://dpaste.com/A4K9QHJ37
Anon-E-moose Watchman


Joined: 23 May 2008 Posts: 6247 Location: Dallas area
Posted: Wed Mar 12, 2025 9:17 am
You have a lot of options enabled that I'm not sure you need: SELinux options, IMA options, etc.
Multiple display drivers are selected: amdgpu, radeon, and it looks like maybe i915, etc.
What hardware do you have? Give the laptop model or motherboard model, and any video cards that have been added (not on-board).
zedxot n00b


Joined: 26 Jul 2022 Posts: 64 Location: Dhaka, Bangladesh
Posted: Wed Mar 12, 2025 9:48 am
Anon-E-moose wrote: | You have a lot of stuff checked that I'm not sure you need, Selinux options, IMA options, etc.
Multiple displays selected amdgpu, radeon, looks like maybe i915, etc.
What hardware do you have? Model of laptop or motherboard model and any video cards that have been added (not on-board) |
System Configuration:
Kernel Version: 6.12.16-gentoo-x86_64 (issue has been present on other recent kernels as well)
Processor (CPU): AMD Ryzen 3 3200G with Radeon Vega Graphics (Picasso APU)
Integrated Graphics (GPU): Radeon Vega Graphics (integrated in Ryzen 3200G)
Motherboard: MSI B450M PRO-M2 V2 (MS-7B84)
BIOS Version: 4.F3 (latest version from MSI)
Memory (RAM): 12GB DDR4 3200MHz (8GB + 4GB)
Window Manager: DWM
No, I don't have any external video card.
Anon-E-moose Watchman


Joined: 23 May 2008 Posts: 6247 Location: Dallas area
Posted: Wed Mar 12, 2025 10:30 am
Let's try this: there should be a file /etc/modprobe.d/blacklist.conf.
Edit it, add blacklist entries for radeon, radeonsi and i915, then reboot. Does that change things?
zedxot n00b


Joined: 26 Jul 2022 Posts: 64 Location: Dhaka, Bangladesh
Posted: Wed Mar 12, 2025 1:03 pm
Anon-E-moose wrote: | Lets try this, there should be a file /etc/modprobe.d/blacklist.conf
edit that and add blacklists for radeon, radeonsi, i915 and reboot, does that change things? |
Hmm, I do not have the file /etc/modprobe.d/blacklist.conf, or the directory /etc/modprobe.d/. The only similar directory is /etc/modules-load.d, with lm_sensors.conf in it. Should I create the blacklist file there, or create the modprobe.d directory?
Anon-E-moose Watchman


Joined: 23 May 2008 Posts: 6247 Location: Dallas area
Posted: Wed Mar 12, 2025 2:20 pm
Do you have any of these 4?
Code: | /lib/modprobe.d/
/usr/local/lib/modprobe.d/
/run/modprobe.d/
/etc/modprobe.d/ |
If you have one of them, are any of these files in it: aliases.conf, alsa.conf, blacklist.conf, options.conf?
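The check asked for above can be scripted. This is a rough sketch that walks the four directories named in this post and lists the .conf files found in each one that exists. The function name `existing_conf_files` and the constant `CANDIDATES` are illustrative only, and nothing here is specific to Gentoo.

```python
from pathlib import Path

# The four locations listed in the post above.
CANDIDATES = (
    "/lib/modprobe.d",
    "/usr/local/lib/modprobe.d",
    "/run/modprobe.d",
    "/etc/modprobe.d",
)


def existing_conf_files(dirs=CANDIDATES):
    """Map each directory that exists to the sorted names of its *.conf files."""
    found = {}
    for directory in dirs:
        path = Path(directory)
        if path.is_dir():
            found[directory] = sorted(entry.name for entry in path.glob("*.conf"))
    return found


if __name__ == "__main__":
    for directory, names in existing_conf_files().items():
        print(directory, names)
```

An empty result would mean none of the four directories is present, in which case creating /etc/modprobe.d by hand is the usual place for local settings.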
zedxot n00b


Joined: 26 Jul 2022 Posts: 64 Location: Dhaka, Bangladesh
Posted: Wed Mar 12, 2025 7:26 pm
Anon-E-moose wrote: | Do you have any of these 4?
Code: | /lib/modprobe.d/
/usr/local/lib/modprobe.d/
/run/modprobe.d/
/etc/modprobe.d/ |
If you have one of them, are any of these files in it: aliases.conf, alsa.conf, blacklist.conf, options.conf? |
found /lib/modprobe.d/dist-blacklist.conf
Code: | blacklist radeon
blacklist radeonsi
blacklist i915 |
Since the freezes are random and there is no specific trigger to reproduce them, I'll report back on whether they still occur.
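A blacklist entry only stops a module from being auto-loaded through its aliases, so it is worth confirming after the reboot that none of the listed modules is loaded anyway. The sketch below compares `blacklist` lines from a modprobe.d file with the contents of /proc/modules. The function names are hypothetical, and both inputs are passed in as text so the logic can be tried without touching the system. Note that radeonsi is, to my knowledge, the name of Mesa's user-space driver rather than a kernel module, so that entry is probably harmless but has no effect.

```python
def blacklisted(conf_text):
    """Collect module names from 'blacklist <name>' lines, skipping comments."""
    names = set()
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if parts[0] == "blacklist" and len(parts) >= 2:
            # /proc/modules reports names with underscores, so normalise.
            names.add(parts[1].replace("-", "_"))
    return names


def loaded(proc_modules_text):
    """Module names from /proc/modules text (the first field of each line)."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def still_loaded(conf_text, proc_modules_text):
    """Blacklisted modules that are loaded regardless, sorted by name."""
    return sorted(blacklisted(conf_text) & loaded(proc_modules_text))
```

On a live system the two arguments would come from reading the blacklist file and /proc/modules. An empty list means the blacklist is being honoured.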
tomtom69 Apprentice

Joined: 09 Nov 2010 Posts: 256 Location: Bavaria
Posted: Wed Mar 12, 2025 7:48 pm
I might have the same issue with an AMD 3350G.
Can you check whether your system is accessible via ssh?
When the error occurs here I am able to connect via ssh (even if it is very slow) and I can see a process with kworker/u32:0+amdgpu-reset-dev consuming 100% CPU.
I downgraded kernel to 6.6.74 and upgraded mesa to 25.0.1 due to this bug report:
https://bugs.gentoo.org/947181
At the moment I do not have any freezes, but I have not tested for long yet.
Maybe downgrading the kernel and upgrading mesa is worth a try.
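If the machine still answers over ssh during a freeze, the symptom described above (a kworker tagged amdgpu-reset-dev using a lot of CPU) can be looked for in a process listing. This sketch filters text in the three-column shape "PID %CPU COMMAND". Whether a given ps build prints the full kworker description is an assumption on my part, and the function name `busy_reset_workers` is invented for the example.

```python
def busy_reset_workers(ps_text, min_cpu=0.0, needle="amdgpu-reset"):
    """Return (pid, cpu, name) for listed processes whose name contains needle."""
    hits = []
    for line in ps_text.splitlines()[1:]:  # the first line is the column header
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid_text, cpu_text, name = fields
        try:
            pid, cpu = int(pid_text), float(cpu_text)
        except ValueError:
            continue  # not a data row
        if needle in name and cpu >= min_cpu:
            hits.append((pid, cpu, name.strip()))
    return hits
```

The input could come from something like `ps -eo pid,pcpu,args` run in the ssh session. Keep in mind that the pcpu column in ps is averaged over the lifetime of the process, so a long-lived worker that has only just started spinning may show a low figure; top gives a more current reading.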
zedxot n00b


Joined: 26 Jul 2022 Posts: 64 Location: Dhaka, Bangladesh
Posted: Wed Mar 12, 2025 10:12 pm
tomtom69 wrote: | I might have the same issue with an AMD 3350G.
Can you check whether your system is accessible via ssh?
When the error occurs here I am able to connect via ssh (even if it is very slow) and I can see a process with kworker/u32:0+amdgpu-reset-dev consuming 100% CPU.
I downgraded kernel to 6.6.74 and upgraded mesa to 25.0.1 due to this bug report:
https://bugs.gentoo.org/947181
At the moment I do not have any freezes, but I did not test for a longer time.
Maybe downgrading kernel and upgrading mesa is worth a try. |
Our issues might not be the same, because I have been facing this since 6.6.67. The syslog was showing an amdgpu timeout error on the gfx low ring. After adding the amdgpu.aspm=0 kernel parameter, the freezes were less frequent. I tried several kernels, including 6.6.74 (more frequent freezes), 6.6.30 (less frequent freezes) and 6.1.127 (same as 6.6.30). Finally, on 6.12.16 the syslog shows a different message: amdgpu: Dumping IP State.
UPDATE: After blacklisting radeon, radeonsi and i915, I have not experienced any freezes so far. Still testing.
Anon-E-moose Watchman


Joined: 23 May 2008 Posts: 6247 Location: Dallas area
Posted: Wed Mar 12, 2025 10:27 pm
You may need to blacklist other modules. Keep an eye on dmesg output for things not working right.
pietinger Moderator

Joined: 17 Oct 2006 Posts: 5489 Location: Bavaria