ese002 Apprentice
Joined: 20 Sep 2006 Posts: 155
Posted: Mon Dec 21, 2015 1:11 am Post subject: Realtek RTL8111E hangs at startup
This has been going on since probably 3.14.14, and in every version since through the current 4.1.12.
About one in five resumes from hibernation and, I think, even clean boots (though I don't do those often), the ethernet just stops responding at startup. The light on my 10/100 switch shows 10Mbit for that port and there is no response. Ping to any address reports "kernel is not very fresh" and Destination Host Unreachable.
systemctl restart network : accomplishes nothing
Unplugging and replugging the network connection accomplishes nothing.
re-hibernate and resume changes nothing.
Cold reboot generally works. Soft reboot is a bit flakey but once it actually boots, it works. The flakiness may not be related.
The actual device is a Realtek RTL8111E. The driver is r8169.
This is what I see in /var/log/messages for a fail state:
Dec 20 07:02:57 crab kernel: ------------[ cut here ]------------
Dec 20 07:02:57 crab kernel: WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:303 dev_watchdog+0x22f/0x240()
Dec 20 07:02:57 crab kernel: NETDEV WATCHDOG: enp4s0 (r8169): transmit queue 0 timed out
Dec 20 07:02:57 crab kernel: Modules linked in: i915 i2c_algo_bit snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper snd_hda_intel snd_hda_controller drm snd_hda_codec snd_hwdep r8169 mii snd_hda_core snd_pcm snd_timer snd x86_pkg_temp_thermal mac_hid efivarfs tg3 e1000 dm_mirror dm_region_hash dm_log dm_mod sr_mod cdrom sg sata_sil24 sata_sil pata_sil680 sd_mod ahci libahci pata_hpt37x
Dec 20 07:02:57 crab kernel: CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.1.12-gentoo #1
Dec 20 07:02:57 crab kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Pro3, BIOS P2.10 07/12/2013
Dec 20 07:02:57 crab kernel: ffffffff819decf8 ffff88041f283d78 ffffffff8179ed4a 0000000000000001
Dec 20 07:02:57 crab kernel: ffff88041f283dc8 ffff88041f283db8 ffffffff81095b55 ffff88040cd4ff08
Dec 20 07:02:57 crab kernel: 0000000000000000 ffff88007fb6e000 0000000000000001 0000000000000001
Dec 20 07:02:57 crab kernel: Call Trace:
Dec 20 07:02:57 crab kernel: <IRQ> [<ffffffff8179ed4a>] dump_stack+0x45/0x57
Dec 20 07:02:58 crab kernel: [<ffffffff81095b55>] warn_slowpath_common+0x85/0xc0
Dec 20 07:02:58 crab kernel: [<ffffffff81095bd1>] warn_slowpath_fmt+0x41/0x50
Dec 20 07:02:58 crab kernel: [<ffffffff81589404>] ? intel_pstate_timer_func+0x304/0x3a0
Dec 20 07:02:58 crab kernel: [<ffffffff815ee60f>] dev_watchdog+0x22f/0x240
Dec 20 07:02:58 crab kernel: [<ffffffff815ee3e0>] ? dev_graft_qdisc+0x80/0x80
Dec 20 07:02:58 crab kernel: [<ffffffff810e9259>] call_timer_fn+0x39/0x100
Dec 20 07:02:58 crab kernel: [<ffffffff815ee3e0>] ? dev_graft_qdisc+0x80/0x80
Dec 20 07:02:58 crab kernel: [<ffffffff810e9513>] run_timer_softirq+0x1f3/0x2d0
Dec 20 07:02:58 crab kernel: [<ffffffff810997dd>] __do_softirq+0xed/0x280
Dec 20 07:02:58 crab kernel: [<ffffffff81099b7d>] irq_exit+0x9d/0xb0
Dec 20 07:02:58 crab kernel: [<ffffffff8107dcd5>] smp_apic_timer_interrupt+0x45/0x60
Dec 20 07:02:58 crab kernel: [<ffffffff817a7e3b>] apic_timer_interrupt+0x6b/0x70
Dec 20 07:02:58 crab kernel: <EOI> [<ffffffff81589e83>] ? cpuidle_enter_state+0xa3/0x1e0
Dec 20 07:02:58 crab kernel: [<ffffffff81589e5c>] ? cpuidle_enter_state+0x7c/0x1e0
Dec 20 07:02:58 crab kernel: [<ffffffff81589fe2>] cpuidle_enter+0x12/0x20
Dec 20 07:02:58 crab kernel: [<ffffffff810cce18>] cpu_startup_entry+0x278/0x380
Dec 20 07:02:58 crab kernel: [<ffffffff8107bef3>] start_secondary+0x123/0x130
Dec 20 07:02:58 crab kernel: ---[ end trace f1a9ab274ef30e2f ]---
Dec 20 07:02:58 crab kernel: r8169 0000:04:00.0 enp4s0: rtl_chipcmd_cond == 1 (loop: 100, delay: 100).
Dec 20 07:02:58 crab kernel: r8169 0000:04:00.0 enp4s0: rtl_csiar_cond == 1 (loop: 100, delay: 10).
Dec 20 07:02:58 crab kernel: r8169 0000:04:00.0 enp4s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
Dec 20 07:02:58 crab kernel: r8169 0000:04:00.0 enp4s0: rtl_ephyar_cond == 1 (loop: 100, delay: 10).
Dec 20 07:02:58 crab kernel: r8169 0000:04:00.0 enp4s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
Dec 20 07:02:58 crab kernel: r8169 0000:04:00.0 enp4s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
Dec 20 07:02:58 crab kernel: r8169 0000:04:00.0 enp4s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
Dec 20 07:02:58 crab kernel: r8169 0000:04:00.0 enp4s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
Dec 20 07:02:58 crab kernel: r8169 0000:04:00.0 enp4s0: rtl_eriar_cond == 1 (loop: 100, delay: 100).
Is a viable workaround known? I get the impression that this is a kernel/driver bug.
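Since the failure only shows up now and then, counting the watchdog messages in the log is one way to see how often it really happens. A minimal sketch, assuming the log format shown above; the function name count_tx_timeouts is made up for illustration, and the log path depends on your syslog setup:

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): count the
# "NETDEV WATCHDOG ... transmit queue N timed out" messages that a given
# driver has produced in a log file.
# Usage: count_tx_timeouts LOGFILE [DRIVER]
count_tx_timeouts() {
    logfile=$1
    driver=${2:-r8169}
    # grep -c prints the number of matching lines (0 if none). It exits
    # non-zero when nothing matches, which "|| true" deliberately ignores.
    grep -c "NETDEV WATCHDOG: .* (${driver}): transmit queue [0-9]* timed out" "$logfile" || true
}
```

For example, `count_tx_timeouts /var/log/messages` prints the number of r8169 transmit timeouts recorded in that file.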
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
Posted: Mon Dec 21, 2015 3:38 am
Realtek plays fast and loose with model numbers. Sometimes the same model number has substantially different hardware. They figure it's all right because they give each mobo manufacturer a custom Windows driver.
You could try the r8168 driver (yes, r8168) from Realtek's web site and build it as an out-of-tree driver. That helped a bit for me.
You won't like what I did to solve this problem. Eventually, I got P.O.ed enough to buy an Intel PCI-E card for around $30, blacklisted r8168 and r8169, and rebuilt the kernel using e1000. Like night and day. I use up one slot but it purrs like a kitten. Now I try to avoid Realtek mobos.
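The blacklisting step mentioned above could look roughly like this. It is only a sketch: the function name and the target file name are assumptions, and blacklisting only matters when the drivers are built as modules (it stops modprobe from auto-loading them through their aliases; an explicit modprobe still works).

```shell
#!/bin/sh
# Hypothetical helper: write a modprobe blacklist file for both Realtek
# drivers named in this thread.
# Usage: write_realtek_blacklist CONF_FILE
write_realtek_blacklist() {
    conf=$1
    # printf reuses the format for each argument, so this writes one
    # "blacklist NAME" line per module.
    printf 'blacklist %s\n' r8168 r8169 > "$conf"
}
```

Run as root, something like `write_realtek_blacklist /etc/modprobe.d/realtek-blacklist.conf` would do it; the file name is arbitrary as long as it ends in .conf and sits in /etc/modprobe.d.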
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54818 Location: 56N 3W
Posted: Mon Dec 21, 2015 9:53 am
ese002,
Many r8169 cards have hardware bugs and are provided with firmware patches to (attempt to) fix them.
If your card can take firmware and you don't provide it, it will mostly still work.
dmesg will show a 60 sec pause at boot as r8169 looks around for the firmware.
This Ubuntu bug, in comment 10, suggests that there is firmware for your card. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
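To check whether the firmware the driver may ask for is actually installed, the names can be compared against the firmware directory. A sketch only: missing_firmware is a made-up name, /lib/firmware is assumed to be the firmware location, r8169 is assumed to be built as a module (so modinfo can query it), and compressed firmware files are not handled.

```shell
#!/bin/sh
# Hypothetical helper: read firmware file names (one per line) on stdin
# and report the ones that do not exist under a firmware directory.
# Usage: modinfo -F firmware r8169 | missing_firmware [FIRMWARE_DIR]
missing_firmware() {
    fwdir=${1:-/lib/firmware}
    while IFS= read -r name; do
        # Skip blank lines, then report any name with no matching file.
        [ -n "$name" ] || continue
        [ -e "$fwdir/$name" ] || printf 'missing: %s\n' "$name"
    done
}
```

No output means every listed file was found. `dmesg | grep -i firmware` shows what the driver itself reported at load time.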
ese002 Apprentice
Joined: 20 Sep 2006 Posts: 155
Posted: Wed Dec 23, 2015 6:30 pm
I merged in gentoo-firmware. Other than no longer complaining about not being able to find firmware (previously unmentioned but happening), it does not appear that anything has changed. I get the same kernel messages and this morning it hung up again coming out of hibernation.
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
Logicien Veteran
Joined: 16 Sep 2005 Posts: 1555 Location: Montréal
Posted: Thu Dec 24, 2015 11:59 am
I think you must distinguish between a normal boot and a resume from hibernation. If the network card clearly stops responding on a normal boot, changing the version of the r8169 module may correct the problem.
If the network card stops responding only after a resume from hibernation, the cause may simply be the way the card was hibernated/resumed. Not all devices support power saving.
Note that systemd will not unload/reload the r8169 module when restarting the network. You need to do that manually. Then I think your card can respond again. _________________ Paul
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
Posted: Thu Dec 24, 2015 2:54 pm
Logicien wrote: | I think you must distinguish between a normal boot and a resume from hibernation. If the network card clearly stops responding on a normal boot, changing the version of the r8169 module may correct the problem.
If the network card stops responding only after a resume from hibernation, the cause may simply be the way the card was hibernated/resumed. Not all devices support power saving.
Note that systemd will not unload/reload the r8169 module when restarting the network. You need to do that manually. Then I think your card can respond again. |
Yes! I noticed that on both Gentoo and Windows. You must restart the driver with /etc/init.d/net.eth0 (OpenRC) or however systemd does it. This applies to both the Realtek and the Intel. Realtek was giving me inconsistent boots and I thought the same with the OP.
ese002 Apprentice
Joined: 20 Sep 2006 Posts: 155
Posted: Wed Dec 30, 2015 8:45 am
This issue is not reproducible on demand, but before I went away for the holiday I did manage to catch a fail condition. If I execute the following script, nothing changes:
/usr/bin/systemctl stop network
modprobe -r r8169
modprobe r8169
/usr/bin/systemctl start network
So, unloading and reloading the driver has no effect in the fail case.
I have not yet tried the r8168 driver.
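When the reload changes nothing, it can still be useful to record what the kernel thinks of the link in the fail state and compare it with a working boot. A minimal sketch, assuming the usual sysfs attributes under /sys/class/net; link_report is a made-up name and the second argument exists only so the function can be pointed at another directory:

```shell
#!/bin/sh
# Hypothetical helper: print the kernel's view of an interface's link.
# Usage: link_report IFACE [SYSFS_NET_DIR]
link_report() {
    iface=$1
    netdir=${2:-/sys/class/net}
    for attr in operstate carrier speed; do
        # Some attributes may be unreadable (for example while the
        # interface is down); print "?" in that case.
        value=$(cat "$netdir/$iface/$attr" 2>/dev/null) || value='?'
        printf '%s %s=%s\n' "$iface" "$attr" "$value"
    done
}
```

Calling `link_report enp4s0` before and after the modprobe -r / modprobe pair shows whether the reload changed anything the kernel can see, such as the negotiated speed.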