Gentoo Forums
amdgpu Ryzen 9950X iGPU crashes with REG_WAIT timeout
DeIM
Guru


Joined: 11 Apr 2006
Posts: 445

Posted: Tue Feb 25, 2025 8:31 am    Post subject: amdgpu Ryzen 9950X iGPU crashes with REG_WAIT timeout

I am running the current kernel, 6.13.4.

ASRock X870E Taichi Lite.
I am using the iGPU.
It has one HDMI and two USB-C outputs.
I use three different displays.
When they are all connected (HDMI, USB-C => DVI, USB-C => DP), the display manager often crashes (slim/sddm reloads).
When I disconnect the DP monitor, crashes are less frequent.

Code:
[    2.049207] amdgpu 0000:79:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:141
[    2.049218] ------------[ cut here ]------------
[    2.049219] WARNING: CPU: 22 PID: 227 at drivers/gpu/drm/amd/amdgpu/../display/dc/hubbub/dcn31/dcn31_hubbub.c:151 dcn31_program_compbuf_size+0x205/0x210
[    2.049223] Modules linked in:
[    2.049224] CPU: 22 UID: 0 PID: 227 Comm: kworker/22:0H Not tainted 6.13.4-gentoo #1
[    2.049226] Hardware name: ASRock X870E Taichi Lite/X870E Taichi Lite, BIOS 3.18.AS02 02/05/2025
[    2.049227] Workqueue: events_highpri dm_irq_work_func
[    2.049229] RIP: 0010:dcn31_program_compbuf_size+0x205/0x210
[    2.049231] Code: 00 85 c0 74 25 83 7c 24 04 00 75 1e 65 48 8b 04 25 28 00 00 00 48 3b 44 24 08 75 12 48 83 c4 10 5b 5d c3 0f 0b e9 77 ff ff ff <0f> 0b eb de e8 02 39 5f 00 cc cc 0f 1f 44 00 00 41 56 53 48 89 fb
[    2.049232] RSP: 0018:ffffa8f3c09d7698 EFLAGS: 00010202
[    2.049233] RAX: 0000000080040a0d RBX: ffff9d3c88663c00 RCX: 0000000000000001
[    2.049234] RDX: 0000000000000000 RSI: ffff9d3c81cdff20 RDI: ffff9d3c89700000
[    2.049234] RBP: 000000000000000d R08: ffffa8f3c09d769c R09: 000000000000000d
[    2.049235] R10: 0000003000000030 R11: ffffa8f3c09d7698 R12: ffff9d3ca0a002a8
[    2.049235] R13: ffff9d3ca0a050c8 R14: ffff9d3ca0400000 R15: ffff9d3c88663c00
[    2.049236] FS:  0000000000000000(0000) GS:ffff9d52ff980000(0000) knlGS:0000000000000000
[    2.049237] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.049237] CR2: 0000000000000000 CR3: 000000003b41a000 CR4: 0000000000750ef0
[    2.049238] PKRU: 55555554
[    2.049238] Call Trace:
[    2.049240]  <TASK>
[    2.049242]  ? __warn+0xda/0x1d0
[    2.049244]  ? dcn31_program_compbuf_size+0x205/0x210
[    2.049245]  ? report_bug+0x141/0x1e0
[    2.049246]  ? handle_bug+0x5e/0x90
[    2.049248]  ? exc_invalid_op+0x16/0x40
[    2.049249]  ? asm_exc_invalid_op+0x16/0x20
[    2.049250]  ? dcn31_program_compbuf_size+0x205/0x210
[    2.049251]  ? dcn31_program_compbuf_size+0x1dc/0x210
[    2.049252]  dcn20_optimize_bandwidth+0xff/0x1f0
[    2.049254]  dc_commit_state_no_check+0x1691/0x1a40
[    2.049256]  dc_commit_streams+0x465/0x610
[    2.049257]  amdgpu_dm_atomic_commit_tail+0x6c1/0x3d10
[    2.049259]  ? dm_read_reg_func+0x59/0xc0
[    2.049260]  ? optc1_get_crtc_scanoutpos+0xca/0x100
[    2.049262]  ? dc_stream_get_scanoutpos+0xf6/0x110
[    2.049263]  ? ktime_get+0x4d/0xd0
[    2.049264]  ? amdgpu_display_get_crtc_scanoutpos+0x88/0x160
[    2.049266]  ? amdgpu_display_crtc_idx_to_irq_type+0x20/0x20
[    2.049267]  ? amdgpu_crtc_get_scanout_position+0x29/0x40
[    2.049268]  ? drm_crtc_vblank_helper_get_vblank_timestamp_internal+0xe3/0x470
[    2.049270]  ? wait_for_common+0x198/0x1d0
[    2.049271]  ? drm_crtc_commit_wait+0x32/0x90
[    2.049272]  commit_tail+0xbe/0x2c0
[    2.049274]  drm_atomic_helper_commit+0x24f/0x260
[    2.049275]  drm_atomic_commit+0xb8/0xe0
[    2.049276]  ? __drm_printfn_seq_file+0x20/0x20
[    2.049277]  drm_client_modeset_commit_atomic+0x178/0x200
[    2.049279]  drm_client_modeset_commit_locked+0x45/0x160
[    2.049280]  drm_client_modeset_commit+0x23/0x50
[    2.049281]  drm_fb_helper_hotplug_event+0x13b/0x2b0
[    2.049283]  drm_client_dev_hotplug+0x8b/0x110
[    2.049284]  handle_hpd_irq_helper+0x157/0x190
[    2.049285]  process_scheduled_works+0x1f8/0x440
[    2.049287]  worker_thread+0x24a/0x2f0
[    2.049288]  ? pr_cont_work+0x1c0/0x1c0
[    2.049289]  kthread+0x147/0x160
[    2.049290]  ? kthread_blkcg+0x30/0x30
[    2.049291]  ret_from_fork+0x30/0x40
[    2.049292]  ? kthread_blkcg+0x30/0x30
[    2.049293]  ret_from_fork_asm+0x11/0x20
[    2.049294]  </TASK>
[    2.049295] ---[ end trace 0000000000000000 ]---
...
[ 5511.776162] amdgpu 0000:79:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:142
[ 6158.266516] amdgpu 0000:79:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:141
[ 6158.991735] usb 3-9: new high-speed USB device number 7 using xhci_hcd
[ 6159.199580] usb 3-9: New USB device found, idVendor=05e3, idProduct=0610, bcdDevice=32.98
[ 6159.199584] usb 3-9: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[ 6159.199585] usb 3-9: Product: USB2.0 Hub
[ 6159.210275] hub 3-9:1.0: USB hub found
[ 6159.213575] hub 3-9:1.0: 4 ports detected
[ 6541.325827] xhci_hcd 0000:13:00.0: WARN: buffer overrun event for slot 3 ep 6 on endpoint
[ 6541.515213] xhci_hcd 0000:13:00.0: WARN: buffer overrun event for slot 3 ep 6 on endpoint
[ 7277.741895] usb 3-9: USB disconnect, device number 7
[ 7872.061253] amdgpu 0000:79:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:142
[75846.506207] amdgpu 0000:79:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:141
[75847.252847] usb 3-9: new high-speed USB device number 9 using xhci_hcd
[75847.462715] usb 3-9: New USB device found, idVendor=05e3, idProduct=0610, bcdDevice=32.98
[75847.462724] usb 3-9: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[75847.462726] usb 3-9: Product: USB2.0 Hub
[75847.473873] hub 3-9:1.0: USB hub found
[75847.477610] hub 3-9:1.0: 4 ports detected
[78649.144667] amdgpu 0000:79:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:141
[80103.480423] amdgpu 0000:79:00.0: [drm] REG_WAIT timeout 1us * 100 tries - dcn31_program_compbuf_size line:141


Hmm. I see that the USB hub in one display re-enumerates right after some of the timeouts. I'll try disconnecting this display; maybe it is the problem...
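To check whether the hub re-enumeration really follows the timeouts, the log can be scanned mechanically. This is only a rough sketch: the function names `parse` and `correlate` and the 2-second window are assumptions made for illustration, and the matching relies solely on the message texts visible in the log above.

```python
import re

# Matches the "[seconds.micros] message" prefix of a dmesg line.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse(lines):
    """Yield (timestamp, message) for every line that looks like dmesg output."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            yield float(m.group(1)), m.group(2)


def correlate(lines, window=2.0):
    """Pair each REG_WAIT timeout with USB enumerations that follow it.

    Returns a list of (timeout_timestamp, [usb_timestamps]) tuples, where
    each USB timestamp lies at most `window` seconds after the timeout.
    The window length is an arbitrary assumption, not a known driver value.
    """
    events = list(parse(lines))
    timeouts = [t for t, msg in events if "REG_WAIT timeout" in msg]
    usb_new = [t for t, msg in events if "new high-speed USB device" in msg]
    return [(t, [u for u in usb_new if 0 <= u - t <= window]) for t in timeouts]
```

Feeding it a saved log, for example `correlate(open("dmesg.txt"))` with a hypothetical file name, gives one entry per timeout; an empty list means no USB device appeared shortly afterwards. In the excerpt above only some timeouts are followed by the hub, so the display with the hub may not be the only trigger.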
MickeyM
n00b


Joined: 27 Feb 2025
Posts: 1

Posted: Thu Feb 27, 2025 2:14 pm    Post subject:

For a bleeding-edge kernel version, you should also accept ~amd64 for linux-firmware. Did you do that?

My system uses a Zen 4 Ryzen and runs two DP monitors on the iGPU via DP daisy chain. With kernel releases >=6.10 it has been running perfectly stable. Before that, it was horrible: amdgpu crashes and spontaneous system reboots.
Future kernel and firmware versions may help in your case.

What I can also advise: keep your DP cables, and display cables in general, away from power cords. Power cords emit EM interference, which caused my monitors to regularly crash and lose connection. Some monitors are likely more sensitive than others, but it's always a good idea to keep data cables separate from power cables. Cable management that put them in direct proximity to each other, in particular, caused this problem for me. Maybe this helps you, too.
logrusx
Advocate


Joined: 22 Feb 2018
Posts: 2768

Posted: Thu Feb 27, 2025 4:00 pm    Post subject:

AMDGPU is very volatile. I haven't had reliable wake-up from sleep since 6.1.57 or something like that. I was stuck with 6.1.91, which at least offered almost reliable wake-up from sleep when plugged in. Only recently someone pointed out on the forums that this issue was solved for them in 6.13. They solved one, but maybe created others...

Best Regards,
Georgi
All times are GMT