Gentoo Forums
I219-LM networking failure on resume from suspend to RAM
markord
n00b


Joined: 01 Jan 2014
Posts: 4

Posted: Fri Dec 08, 2023 5:04 pm    Post subject: I219-LM networking failure on resume from suspend to RAM

While setting up a new Dell Optiplex Micro 7010 I've run into a problem: the NIC does not appear to work after resuming from suspend to RAM.

After resuming, everything looks fine (ifconfig shows enp0s31f6 and it's up), but nothing on the network can be reached.

Code:

Kernel: 6.1.57-gentoo #10 SMP PREEMPT_DYNAMIC
NIC: 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (17) I219-LM (rev 11)


Initially I had e1000e compiled into the kernel. I changed it to a module, but that did not help. Restarting the network interface, unloading and reloading the e1000e module, and even removing the device from the bus and rescanning do not restore it to a working state.

Code:

echo 1 > /sys/bus/pci/devices/0000\:00\:1f.6/remove
echo 1 > /sys/bus/pci/rescan
/etc/init.d/net.enp0s31f6 restart


The only way I can get it working again is to restart the machine: both a reboot and hibernating to disk then powering back on fix it.

After resuming from suspend to RAM the network lights illuminate and everything looks pretty much normal, but it's not possible to ping any other machine. Because the network is not working, DHCP obviously fails; for debugging purposes I've switched to a static IP, but that doesn't help.

When it's not working, the ARP table seems inconsistent: when pinging local machines or the broadcast address, some invocations of arp show the other machines' MAC addresses, while others show them as (incomplete). Even when the MAC addresses do appear in the ARP table, the machines are still unreachable.
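
To put numbers on that flapping, the neighbour states can be tallied instead of comparing repeated arp runs by eye. A minimal sketch, assuming iproute2 is installed; neigh_summary is a hypothetical helper name, and it only counts the last field of each `ip neigh` line (REACHABLE, STALE, INCOMPLETE, FAILED and so on):

```shell
#!/bin/sh
# neigh_summary: hypothetical helper, reads `ip neigh show` output on stdin
# and prints one "STATE count" line per neighbour state, sorted by state.
neigh_summary() {
    awk '
        NF { count[$NF]++ }   # the state is the last field of each entry
        END { for (s in count) printf "%s %d\n", s, count[s] }
    ' | sort
}

# Example (run while pinging local hosts in another terminal):
#   ip -4 neigh show dev enp0s31f6 | neigh_summary
```

Running it a few times in the broken state should show whether entries really move between INCOMPLETE and a resolved state, or whether they stay resolved while traffic still fails.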

The output from ethtool is the same in the working state and non working state:

Code:

Settings for enp0s31f6:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        MDI-X: off (auto)
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes


ethtool -d reports some differences between the working and non-working states, though I'm not sure whether they are meaningful:

Code:

--- dump-working   2023-12-08 09:41:14.000000000 -0000
+++ dump-notworking   2023-12-08 09:42:59.000000000 -0000
@@ -34,8 +34,8 @@ MAC Registers
       Pass MAC control frames:           don't pass
       Receive buffer size:               2048
 0x02808: RDLEN (Receive desc length)     0x00001000
-0x02810: RDH   (Receive desc head)       0x00000048
-0x02818: RDT   (Receive desc tail)       0x00000040
+0x02810: RDH   (Receive desc head)       0x00000088
+0x02818: RDT   (Receive desc tail)       0x00000080
 0x02820: RDTR  (Receive delay timer)     0x00000000
 0x00400: TCTL (Transmit ctrl register)   0x3103F0FA
       Transmitter:                       enabled
@@ -43,7 +43,7 @@ MAC Registers
       Software XOFF Transmission:        disabled
       Re-transmit on late collision:     enabled
 0x03808: TDLEN (Transmit desc length)    0x00001000
-0x03810: TDH   (Transmit desc head)      0x00000086
-0x03818: TDT   (Transmit desc tail)      0x00000086
+0x03810: TDH   (Transmit desc head)      0x00000021
+0x03818: TDT   (Transmit desc tail)      0x00000021
 0x03820: TIDV  (Transmit delay timer)    0x00000008
 PHY type:                                unknown
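
The four registers that differ are descriptor ring head and tail pointers, which advance as packets are processed, so they would be expected to differ between any two dumps taken at different times. In both dumps TDH equals TDT, which suggests the transmit ring had been drained each time. A minimal sketch for hiding those pointers so that any other register change stands out; strip_ring_pointers is a hypothetical helper, not part of ethtool:

```shell
#!/bin/sh
# strip_ring_pointers: hypothetical filter, reads an `ethtool -d` dump on
# stdin and drops the RDH/RDT/TDH/TDT lines. The trailing space in the
# pattern keeps RDTR (receive delay timer) and RDLEN/TDLEN in the output.
strip_ring_pointers() {
    grep -Ev '^[[:space:]]*0x[0-9A-Fa-f]+: (RDH|RDT|TDH|TDT) '
}

# Example:
#   ethtool -d enp0s31f6 | strip_ring_pointers > dump-working
#   (suspend, resume)
#   ethtool -d enp0s31f6 | strip_ring_pointers > dump-notworking
#   diff -u dump-working dump-notworking
```

If the filtered diff comes out empty, the register dump is unlikely to explain the failure on its own.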


I'm at a loss with this - does anyone have any ideas?
freifunk_connewitz
Apprentice


Joined: 08 Feb 2006
Posts: 236

Posted: Tue Jun 18, 2024 7:31 am

Hi markord,

did you find a solution?

I have a similar problem: after sleep/resume, the NIC stays in a sort of sleeping state, with no DHCP or any other network activity, because the NIC thinks it has no carrier.

Re-plugging the cable shows no reaction in /var/log/messages.

Rebooting helps.

What also always helps: bringing the NIC down and up again manually:
Code:

ifconfig enp0s31f6 down
ifconfig enp0s31f6 up
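
Until the root cause is found, that down/up cycle could be automated on resume. A minimal sketch, assuming suspend is driven by elogind or systemd, which run executables in their system-sleep hook directory with "pre" or "post" as the first argument; the function names, the RUN dry-run variable and BOUNCE_DELAY are hypothetical, and `ip link set` is used as the iproute2 equivalent of the ifconfig commands above:

```shell
#!/bin/sh
# Hypothetical sleep hook: bounce the NIC after resume.
IFACE="${IFACE:-enp0s31f6}"
RUN="${RUN:-}"   # set RUN=echo to print the commands instead of running them

bounce_iface() {
    $RUN ip link set dev "$1" down
    $RUN sleep "${BOUNCE_DELAY:-1}"   # short pause before bringing it back
    $RUN ip link set dev "$1" up
}

handle_sleep_event() {
    case "$1" in
        post) bounce_iface "$IFACE" ;;   # after resume
        *)    : ;;                       # nothing to do before suspend
    esac
}

# When installed as a hook, end the file with:
#   handle_sleep_event "$1"
```

A dry run such as `RUN=echo; handle_sleep_event post` prints the three commands without touching the interface, which is a cheap way to check the hook before installing it.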


No error messages in the logs (except for an occasional "Failed to disable ULP").

This behaviour started some time this year; I'm not sure, but I suspect a kernel update.

System: stable amd64, openrc
Kernel: 6.6.21
NIC: Intel I219-LM, driver: e1000e
freifunk_connewitz
Apprentice


Joined: 08 Feb 2006
Posts: 236

Posted: Sat Jul 27, 2024 5:17 pm

This still happens with kernel 6.6.38, with e1000e built as a module.

This patch to the driver did not solve it either:
https://patchwork.kernel.org/project/netdevbpf/patch/20240429171040.1152516-1-anthony.l.nguyen@intel.com/


edit: the following is obsolete:
What seemed to resolve it was recompiling the kernel and modules with ETHTOOL_NETLINK disabled:
Code:
/usr/src/linux/.config:
# CONFIG_ETHTOOL_NETLINK is not set


Maybe this also helps you?


Last edited by freifunk_connewitz on Tue Sep 03, 2024 6:24 am; edited 1 time in total
markord
n00b


Joined: 01 Jan 2014
Posts: 4

Posted: Tue Aug 06, 2024 9:25 pm

freifunk_connewitz - thanks for the response.

I tried to reproduce the issue today before making the suggested kernel configuration change; strangely, I can no longer reproduce it.

Strictly speaking I still can: when the computer comes back from sleep, the network is unresponsive for 30 to 32 seconds, but then it works without any intervention on my part. I'm not sure whether something has changed since my first post or whether I was just being impatient in December. I doubt it was impatience, though; I'm fairly sure it did not recover at all previously (it needed a hibernate and power-on to bring it back).

A ping running across a suspend to RAM and wake now looks like this:

Code:

64 bytes from 192.168.2.1: icmp_seq=16 ttl=64 time=0.590 ms
64 bytes from 192.168.2.1: icmp_seq=17 ttl=64 time=0.591 ms

64 bytes from 192.168.2.1: icmp_seq=47 ttl=64 time=0.560 ms
64 bytes from 192.168.2.1: icmp_seq=48 ttl=64 time=0.630 ms


The blank line is where I suspended and resumed. It sits there for about 30 seconds, with those roughly 30 packets not being sent, then the network comes back and actually works.
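
The size of that gap can be read straight from the sequence numbers rather than timed by hand. A minimal sketch, assuming the usual iputils ping reply format with an `icmp_seq=N` field; ping_gaps is a hypothetical helper name:

```shell
#!/bin/sh
# ping_gaps: hypothetical helper, reads ping output on stdin and reports
# every run of icmp_seq numbers for which no reply line was printed.
ping_gaps() {
    awk '
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters long; keep only the number
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (seen && seq > prev + 1)
                printf "missing icmp_seq %d-%d (%d replies)\n", prev + 1, seq - 1, seq - prev - 1
            prev = seq
            seen = 1
        }
    '
}

# Example:
#   ping 192.168.2.1 | tee ping.log     (suspend, resume, Ctrl-C)
#   ping_gaps < ping.log
```

For the four reply lines above it reports sequence numbers 18 to 46 as missing, 29 in total, which at the default one-second interval lines up with the 30-second delay described.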

Since December I've switched from netifrc to NetworkManager to manage the network. I don't believe that fixed it, though: I switched back to netifrc and still could not reproduce the problem.

The only other notable change I've made is enabling Bluetooth. I have no idea whether that is what fixed it.

I did try turning off CONFIG_ETHTOOL_NETLINK as suggested. I didn't see any difference, but that's not saying a lot since I can't reproduce the issue anymore.

Maybe having to wait 30 seconds after waking the computer indicates the issue is still there, but it's a much better situation than initially, when the network never came back, and I can live with that.

If I learn any more I'll post it here.
freifunk_connewitz
Apprentice


Joined: 08 Feb 2006
Posts: 236

Posted: Tue Sep 03, 2024 6:21 am

Markord,

Thank you for your update. Glad the issue has somehow resolved itself for you. My case, unfortunately, is back to broken. The ETHTOOL kernel configuration change did not help, and waiting changes nothing either.

Somehow the Intel NIC goes into some zombie state that can only be ended by invoking
Code:
# ifconfig enp0s31f6 down
# ifconfig enp0s31f6 up


And it is even in this state after a fresh boot, not only after resuming from suspend to RAM: when I plug in the network cable, the link indicator on the switch lights up to show there is a connection, but on my laptop nothing like "Network connection is up" shows up in /var/log/messages (as it usually does whenever I plug in the cable). In the meantime I've moved to kernel 6.6.47-gentoo.
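
One way to confirm what the kernel itself believes about the link, independent of the log, is to read the interface's sysfs attributes. A minimal sketch, assuming the standard /sys/class/net layout; link_state and the SYSFS_NET override are hypothetical names used for illustration:

```shell
#!/bin/sh
# link_state: hypothetical helper, prints the kernel's view of an interface.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

link_state() {
    dir="$SYSFS_NET/$1"
    [ -d "$dir" ] || { echo "no such interface: $1" >&2; return 1; }
    # Reading carrier fails while the interface is administratively down,
    # so fall back to "n/a" instead of aborting.
    carrier=$(cat "$dir/carrier" 2>/dev/null) || carrier="n/a"
    operstate=$(cat "$dir/operstate" 2>/dev/null) || operstate="n/a"
    echo "$1 carrier=$carrier operstate=$operstate"
}

# Example:
#   link_state enp0s31f6
```

If this reports carrier=0 while the switch shows a link, that would support the idea that the driver is missing the link-up event rather than the cable or switch being at fault.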