Just a heads-up in case someone has this problem with their #Realtek RTL8168 NIC on #Linux!
Last December I discovered that one of the two NICs in a router / firewall PC would sporadically trigger the NETDEV Watchdog and then become soft-locked / unusable until a system reboot (for details see the linked toot below).
Analysis back then didn't produce any conclusive leads, but it got me to switch from the in-tree #r8169 driver to #r8168-dkms. That didn't entirely fix the issue, but at least the NIC now "only" lost carrier sporadically every few hours for 1-2 seconds and then went back to working.
I found a #solution the other day:
use #r8168-dkms drivers (r8169 can't disable EEE?)
add "options r8168 eee_enabled=0" to /etc/modprobe.d/r8168.conf (modprobe only reads files ending in .conf) or add r8168.eee_enabled=0 to your kernel parameters
if needed, rebuild your initramfs
I don't know why, but disabling Energy Efficient Ethernet (EEE) resolves the random carrier loss.
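For anyone who wants to script this, here is a minimal sketch of the modprobe step. The function name `write_r8168_eee_conf` and the file name `r8168.conf` are my own picks, not anything the driver requires, and the parameter name is the one from the steps above (check it against `modinfo -p r8168` for your driver version). The initramfs command depends on your distro, so those lines are only comments.

```shell
#!/bin/sh
# Sketch only: write the EEE option for r8168 into a modprobe.d drop-in.
# The target directory is an argument so you can try it on a scratch
# directory first; it defaults to /etc/modprobe.d (needs root).
write_r8168_eee_conf() {
    conf_dir="${1:-/etc/modprobe.d}"
    # modprobe ignores files in this directory that do not end in .conf
    printf 'options r8168 eee_enabled=0\n' > "$conf_dir/r8168.conf"
}

# Typical follow-up, run by hand as root (distro-specific, pick yours):
#   write_r8168_eee_conf
#   mkinitcpio -P          # Arch
#   update-initramfs -u    # Debian / Ubuntu
#   dracut --force         # Fedora and friends
#   reboot
```

After the reboot, `ethtool --show-eee enp3s0` may show whether EEE is really off, assuming your driver build answers that query.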
https://indiepocalypse.social/@heals/113724312920257181
Heals (@heals@indiepocalypse.social)
Hey #Linux friends out there - I could use some opinions / input on something I've been brooding over for a few days!

I have a small Intel N100 based server running various services / automations at my parent's house. The box has a double-NIC running as a transparent bridge with some filtering and other network management applied. Both NICs are identical Realtek on-board chips (10ec:8168 / sub: 10ec:0123), normally running on the in-tree #r8169 driver on kernel 6.12.6:

> r8169 0000:01:00.0 eth0: RTL8168h/8111h, XX:XX:XX:XX:XX:XX, XID 541, IRQ 142
> r8169 0000:01:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
> r8169 0000:03:00.0 eth1: jumbo features [frames: 9194 bytes, tx checksumming: ko]

One of them (eth1 / enp3s0) is regularly tossing me these errors:

> r8169 0000:03:00.0 enp3s0: rtl_txcfg_empty_cond == 0 (loop: 42, delay: 100).
> r8169 0000:03:00.0 enp3s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
> r8169 0000:03:00.0 enp3s0: NETDEV WATCHDOG: CPU: 2: transmit queue 0 timed out 5317 ms

As far as I can tell, when this happens half of the bridge silently stops working. I can reach the PC from "my side", which is connected to a router on eth0 / enp1s0, but devices on "the other side" are unreachable until I reboot.

Searching online wasn't very helpful, as the main solution other users with this issue get is "replace the NIC with something not Realtek!" - yeah, no, I can't. There are also [bug reports](https://bugzilla.kernel.org/show_bug.cgi?id=209839) on kernel.org going as far back as 2020, but no clear solutions.

I turned off ASPM on the system and eventually switched to the r8168-DKMS drivers. On #r8168 the link will go down for 1-3 seconds but then fully recover. Not a great solution, but a workaround I can live with for now.

Anyone got any ideas / similar experiences that could help shed some light on the problem?

