__name__ n00b
Joined: 06 Feb 2024 Posts: 46 Location: where the wind blows
|
Posted: Fri Mar 08, 2024 6:31 pm Post subject: PCI ethernet device issue related to AER |
|
|
On boot, dmesg fills up with the following message:
Code: | [ 72.199427] pcieport 0000:00:1c.0: AER: Multiple Corrected error received: 0000:01:00.0
[ 72.199458] r8169 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 72.199465] r8169 0000:01:00.0: device [10ec:8136] error status/mask=00000001/00006000
[ 72.199473] r8169 0000:01:00.0: [ 0] RxErr (First)
|
This message repeats constantly, producing a very large dmesg output.
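For anyone reading along, the `error status/mask` field in the log above can be decoded by hand. Below is a minimal sketch, assuming the bit layout of the PCIe AER Correctable Error Status/Mask registers as I understand it from the PCIe spec; the helper names `decode` and `parse_status_mask` are made up for illustration and are not part of any tool.

```python
import re

# Bit positions in the AER Correctable Error Status/Mask registers
# (assumed from the PCIe Base Specification; other bits are ignored here).
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode(value):
    """Return the names of all known bits set in a status or mask value."""
    return [name for bit, name in sorted(CORRECTABLE_BITS.items())
            if value & (1 << bit)]


def parse_status_mask(line):
    """Extract (status, mask) as integers from an AER dmesg line, or None."""
    m = re.search(r"error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})", line)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


line = ("[ 72.199465] r8169 0000:01:00.0: device [10ec:8136] "
        "error status/mask=00000001/00006000")
status, mask = parse_status_mask(line)
print("reported:", decode(status))
print("masked:  ", decode(mask))
```

Read this way, the status value has only bit 0 set, which matches the `RxErr` line in the log, while the mask covers two other error types and does not hide the receiver error.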
I found this trick to "fix" the issue, but it only silences the reports rather than fixing the underlying error:
https://gist.github.com/flisboac/5a0711201311b63d23b292110bb383cd
The command I used to "fix" the issue:
Code: | setpci -v -d 10ec:8136 CAP_EXP+0x8.w=0x0e
|
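To show what that write actually does, here is a small sketch that decodes the value `0x0e`. It assumes that `CAP_EXP+0x8` is the PCIe Device Control register and that its low four bits are the error reporting enables, as described in the PCIe spec; the function name `describe_devctl` is hypothetical.

```python
# Low four bits of the PCIe Device Control register
# (assumed layout; the remaining bits of the 16-bit word are not decoded here).
DEVCTL_ERROR_BITS = {
    0: "Correctable Error Reporting Enable",
    1: "Non-Fatal Error Reporting Enable",
    2: "Fatal Error Reporting Enable",
    3: "Unsupported Request Reporting Enable",
}


def describe_devctl(value):
    """Map each error reporting enable bit to True (set) or False (clear)."""
    return {name: bool(value & (1 << bit))
            for bit, name in DEVCTL_ERROR_BITS.items()}


for name, enabled in describe_devctl(0x0E).items():
    print(f"{name}: {'on' if enabled else 'off'}")
```

So the command leaves non-fatal, fatal, and unsupported request reporting on and turns correctable error reporting off, which is why the RxErr messages stop while the link errors themselves presumably continue. Note that the command writes the whole 16-bit word, so every other field in the register is set to zero as well; setpci(8) also documents a `value:mask` form that could be used to touch only bit 0.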
I don't actually use the ethernet on this laptop, so it's not critical, but the reporting in dmesg is excessive. I would like to eliminate the issue so I can better troubleshoot other problems while refining my kernel config.
Device:
Code: | 01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL810xE PCI Express Fast Ethernet controller (rev 07)
Subsystem: Dell RTL810xE PCI Express Fast Ethernet controller
Kernel driver in use: r8169
Kernel modules: r8169
|
Current firmware used for device:
Code: | rtl_nic/rtl8106e-1.fw |
I appreciate any insight on this issue.
Last edited by __name__ on Tue Mar 12, 2024 4:33 am; edited 1 time in total |
|
__name__ n00b
Joined: 06 Feb 2024 Posts: 46 Location: where the wind blows
|
Posted: Tue Mar 12, 2024 4:32 am Post subject: |
|
|
I found the first instance of the error. It happens in the first second after boot. It seems to be unrelated to the r8169 driver: at that point the device is still logged under the generic pci prefix rather than r8169. Here is the first report:
Code: | [ 0.828786] pcieport 0000:00:1c.0: AER: Multiple Corrected error received: 0000:01:00.0
[ 0.830048] pci 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 0.831305] pci 0000:01:00.0: device [10ec:8136] error status/mask=00000001/00006000
[ 0.832524] pci 0000:01:00.0: [ 0] RxErr (First)
|
Does anyone know how to further troubleshoot this issue? |
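One way to find the earliest report without scrolling through the whole log is to scan saved dmesg output. This is only a sketch: the regular expression is fitted to the line format shown in this thread, and `first_reports` is a hypothetical helper name.

```python
import re

# Matches lines like:
# [ 0.830048] pci 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=...
LINE_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+(?P<owner>\S+)\s+"
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):\s+"
    r"PCIe Bus Error: severity=(?P<severity>\w+)"
)


def first_reports(dmesg_text):
    """Return {device: (time, owner, severity)} for the earliest report per device."""
    first = {}
    for line in dmesg_text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue
        t = float(m["time"])
        bdf = m["bdf"]
        if bdf not in first or t < first[bdf][0]:
            first[bdf] = (t, m["owner"], m["severity"])
    return first


sample = """\
[ 72.199458] r8169 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[ 0.830048] pci 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
"""
for bdf, (t, owner, severity) in first_reports(sample).items():
    print(f"{bdf}: first {severity} error at {t}s, logged as '{owner}'")
```

The `owner` field is the prefix the kernel printed before the device address, so a value of `pci` instead of `r8169` suggests the error was logged before the driver was bound to the device.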
|
__name__ n00b
Joined: 06 Feb 2024 Posts: 46 Location: where the wind blows
|
Posted: Thu Mar 14, 2024 3:21 am Post subject: |
|
|
I have been searching for answers to this problem, and it seems to be a common bug on Skylake CPUs and Intel mobos with the Sunrise Point host controller. I have found Ubuntu bugs from 2015 with exactly the same issue.
As for a solution, I have just found this mailing list thread that discusses the code responsible. I will try to dive into the gentoo-sources code to see if I can find the relevant lines, but implementing any sort of fix is a bit above my skill level.
https://lore.kernel.org/all/20151229155822.GA17321@localhost/T/#u |
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 5372 Location: Bavaria
|
Posted: Thu Mar 14, 2024 10:08 am Post subject: |
|
|
In the meantime you might disable PCIe Advanced Error Reporting (AER) with a kernel command line parameter: pci=noaer |
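After rebooting, it is worth confirming that the parameter really reached the kernel. A minimal sketch, assuming the `pci=` parameter takes a comma-separated option list; the sample command line below is hypothetical, and on a live system the string would come from `/proc/cmdline`.

```python
def pci_options(cmdline):
    """Collect every option passed through pci=... on a kernel command line."""
    opts = []
    for token in cmdline.split():
        if token.startswith("pci="):
            opts.extend(o for o in token[len("pci="):].split(",") if o)
    return opts


def aer_disabled(cmdline):
    """True if 'noaer' appears among the pci= options."""
    return "noaer" in pci_options(cmdline)


# Hypothetical command line, for illustration only.
example = "BOOT_IMAGE=/vmlinuz-6.6.21-gentoo root=/dev/sda2 ro pci=noaer"
print(aer_disabled(example))
```

On the running machine, something like `aer_disabled(open("/proc/cmdline").read())` would perform the same check against the real boot parameters.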
|
__name__ n00b
Joined: 06 Feb 2024 Posts: 46 Location: where the wind blows
|
Posted: Thu Mar 14, 2024 9:15 pm Post subject: |
|
|
I did that, but I also have some new kernel configuration options that are producing different error messages in dmesg from the same device. I will have to track those down and recompile my kernel. I'm using gentoo-sources-6.6.21 as of now. |
|