Author |
Message |
Vieri l33t
Joined: 18 Dec 2005 Posts: 887
|
Posted: Thu May 26, 2022 1:42 pm Post subject: network latency |
|
|
Hi,
I have 3 Gentoo VoIP servers, 1 Gentoo inner firewall and 1 Gentoo ISP gateway.
The switching backbone is a mix of Cisco, D-Link and Ubiquiti Unifi.
Some clients (e.g. VoIP phones) have networking issues: a ping from any of the Gentoo servers shows high latency (300-800 ms), and there can also be packet loss (40%).
For some reason, these issues seem to occur only when the clients are connected to 8-port or 16-port "mini" switches. These unmanaged 1 Gbps switches (with very little traffic and few hosts connected) are unfortunately necessary until the wired network is extended correctly.
The only way I can get these clients to work well (no latency, no audio drops when on a VoIP call, etc.) is to force their NIC's Ethernet negotiation to 100 Mbps.
On the other hand, these devices work fine with auto-negotiation or 1Gbps full-duplex when connected directly to ports on managed access switches.
I tried tweaking the clients (firmware config in case of VoIP phones or printers), but I'm still facing the same problem.
This behavior has been noticed on devices of all brands and models, so it seems to be a network problem, not a client firmware or hardware issue.
The "mini" unmanaged switches are also of different brands and models.
What should I be looking for? What can I try?
It's really puzzling because I don't see why I get these results.
I even tried connecting just one single device to an unmanaged switch, but I still saw the same behavior.
The access switch I connected the unmanaged switch to is a D-Link with no special config (no bpdu-guard features or the like).
Any ideas are greatly appreciated.
Regards,
Vieri
[EDIT]
Actually, it seems that there's something "poisoning" the network (but it only affects a few devices connected as I said) because I just noticed that after 3:30 pm the latency issues suddenly went away. I'm guessing that it could be because a device or group of devices shut down. I'm also guessing that the problem will surface again tomorrow morning as soon as "something" comes back on-line.
If I were to try to capture all the traffic that reaches, say, the Gentoo firewall which is the default gateway for all devices, what kind of traffic should I be searching for to try to find the source that may be causing this trouble?
[EDIT 2]
It seems that I may have found the culprit. There's a Linux Mint desktop in my default VLAN that seems to have something to do with it all. If I halt it or disconnect its NIC there are no more network issues. Rebooting it or reconnecting the NIC doesn't necessarily have an immediate effect on all the "failing" devices, but after a while they eventually start failing. Stopping or disconnecting the NIC of this Linux Mint host immediately solves my network woes.
The OS is updated.
What can I try next?
I could grab a tcpdump trace on the Gentoo default gateway, but even if I did, what would I be looking for? |
|
alamahant Advocate
Joined: 23 Mar 2019 Posts: 3879
|
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4253 Location: Bavaria
|
Posted: Thu May 26, 2022 7:12 pm Post subject: Re: network latency |
|
|
Vieri wrote: | [EDIT 2]
It seems that I may have found the culprit. There's a Linux Mint desktop in my default VLAN that seems to have something to do with it all. If I halt it or disconnect its NIC there are no more network issues. Rebooting it or reconnecting the NIC doesn't necessarily have an immediate effect on all the "failing" devices, but after a while they eventually start failing. Stopping or disconnecting the NIC of this Linux Mint host immediately solves my network woes.
[...]
but even if I did, what would I be looking for? |
Look for large numbers of multicast and/or broadcast packets ... your description shouts "broadcast storm".
(If this Mint desktop has two or more interfaces, there is another possibility: the interfaces fighting over routes.)
In either case it's a configuration problem on this Mint desktop.
(Extremely rare: it could be a different problem that I have seen only once in my lifetime. We had Token Ring adapters with exactly the same burned-in MAC address ... don't ask how long it took me to find ... I had never thought this was possible.) |
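One hedged way to act on this advice is a minimal sketch that ranks source MAC addresses by how many broadcast/multicast frames they send. It assumes the capture is taken with `tcpdump -e -n` on a single named interface (not `-i any`, whose link-layer header prints differently), so that the second field of each line is the source MAC. The function name `top_talkers` and the interface name `eth0` are placeholders, not anything from this thread.

```shell
# Rank source MAC addresses by frame count.
# Input: text output of `tcpdump -e -n` on stdin, where each line looks like
#   HH:MM:SS.ffffff SRC_MAC > DST_MAC, ethertype ..., length N: ...
top_talkers() {
    awk '$3 == ">" { count[$2]++ }
         END { for (mac in count) print count[mac], mac }' | sort -rn
}

# Example use (needs root; eth0 is a placeholder interface name):
#   tcpdump -e -n -c 5000 -i eth0 'ether broadcast or ether multicast' | top_talkers | head
```

A host that dominates this list by a wide margin while the latency problem is active would be the first thing to examine. What counts as "too many" depends on the network, so comparing against a capture taken while the Mint host is off may help.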
|
pingtoo l33t
Joined: 10 Sep 2021 Posts: 932 Location: Richmond Hill, Canada
|
Posted: Thu May 26, 2022 8:55 pm Post subject: |
|
|
My wild guess: your Linux Mint desktop has a duplicate IP address with something on your network. |
|
Vieri l33t
Joined: 18 Dec 2005 Posts: 887
|
Posted: Fri May 27, 2022 8:26 am Post subject: |
|
|
For now I can say that the Linux Mint system is more and more suspicious because I haven't had one single network issue since it was shut down. It's a standard up-to-date single-NIC DHCP desktop client (standalone, not joined to an AD domain). No special services installed.
When the network woes occur the Linux Mint host is not necessarily doing anything (at least user-triggered). All I need to do is boot the system to get into trouble (after several minutes running).
I don't think QoS is the solution, or at least it wouldn't solve the underlying issue.
I also don't think it's a duplicate IP addr. issue because:
1) it uses DHCP, so it is unlikely to be served a duplicate IP address. There are no static IP addresses in that DHCP range.
2) As soon as the Linux Mint host went off-line I checked the network for another host with the same IP address and found none (nmap). I did it again today -- still nothing.
3) If there were an IP address conflict the booting host would not get network access, and it would not poison the rest of the network. The Linux Mint machine connects fine, can browse the network just fine, and runs sshd, to which I can connect fine and stay connected for hours on end.
A broadcast storm is definitely something I'd like to look into.
Next time I boot the system I'd like to:
a) ps aux, to see all the running processes
b) netstat -na, to see all the connections
c) tcpdump -n -i any (actually, there's only one interface)
Now, in this case how do I filter for broadcast storms?
I don't think I should be looking for Layer 3 broadcasts, but then again I can't rule out anything.
Are Layer 2 broadcasts simply ARP requests?
I think I can run something like this directly on the Linux Mint system:
Code: | tcpdump -n -i myNIC broadcast |
However, once I have the trace what should I be looking for?
I will probably see ARP messages such as:
Request who-has x tell y
and other UDP packets
but where can I draw the line between normal and "evil" behavior?
I also read somewhere that "broadcast storms" can be caused by defective wiring and/or cheap unmanaged switches.
The system IS connected to an 8-port unmanaged switch, and I might also try to change the eth cable that connects this host to the switch.
Before doing so and if a broadcast storm were the issue, would a tcpdump ***on the host*** actually reveal anything? Or is it the unmanaged switch that's creating havoc (but only when the Linux Mint system is on-line because I have other systems connected to the same switch and no issues there)?
[EDIT]
I can then try:
Code: | tcpdump -n -i myNIC multicast |
Still, what should I be looking for (in terms of anomalies, of course)? |
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4253 Location: Bavaria
|
Posted: Fri May 27, 2022 8:41 am Post subject: |
|
|
Vieri wrote: | I also read somewhere that "broadcast storms" can be caused by defective wiring and/or cheap unmanaged switches.
The system IS connected to an 8-port unmanaged switch, and I might also try to change the eth cable that connects this host to the switch. |
The first thing I would try: change only the cable between the Mint machine and the switch (always make only one change at a time). If the problem remains, I would try sniffing.
Here is what I did when troubleshooting a small Ethernet segment (do not do this in a backbone segment; in a backbone you must work with a mirror port on the switch):
1. Replaced the switch with an old hub (so that all the traffic appears on all ports) AND
2. Reconnected all stations from the switch to the hub AND
3. Connected my (super fast and expensive company) notebook running Sniffer (later I used Ethereal, now named "Wireshark") to a free port on the hub, with no capture filter (so I got all packets). Later you can filter on the MAC address of the Mint station.
This was useful for also seeing all Layer 2 packets. (Don't forget: if it really is a broadcast storm you will not capture every packet, but the packets you do capture will still show that it was a storm.)
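The capture-everything-then-filter approach from step 3 might look like this with tcpdump in place of Wireshark. This is a sketch only: the function names, `eth0`, `storm.pcap` and the MAC address are placeholders, and the `TCPDUMP` variable exists just so the command can be swapped out for a dry run.

```shell
# Command to run; override TCPDUMP to point at another binary or a dry-run stub.
TCPDUMP=${TCPDUMP:-tcpdump}

# Step 1: capture every frame, unfiltered, into a pcap file (needs root).
# usage: capture_all IFACE FILE
capture_all() {
    "$TCPDUMP" -n -i "$1" -s 0 -w "$2"
}

# Step 2: replay the file offline, printing link-level headers (-e) and
# keeping only frames sent or received by one MAC address.
# usage: frames_for_mac FILE MAC
frames_for_mac() {
    "$TCPDUMP" -n -e -r "$1" ether host "$2"
}

# Example use (placeholders throughout):
#   capture_all eth0 storm.pcap          # stop with Ctrl-C once the problem shows
#   frames_for_mac storm.pcap 00:11:22:33:44:55
```

On a switch port that is not a mirror port, such a capture will mostly contain broadcast, multicast and flooded frames, which is why the hub or a mirror port matters for seeing everything.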
Last edited by pietinger on Fri May 27, 2022 8:49 am; edited 3 times in total |
|
Vieri l33t
Joined: 18 Dec 2005 Posts: 887
|
Posted: Fri May 27, 2022 8:48 am Post subject: |
|
|
pietinger wrote: | but you see with the rest of packets that it was a storm). |
How do you "see" that it's a storm?
For the sheer amount? |
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4253 Location: Bavaria
|
Posted: Fri May 27, 2022 8:51 am Post subject: |
|
|
Vieri wrote: | How do you "see" that it's a storm?
For the sheer amount? |
Yes. It is the number of packets AND you will see that it's always the SAME traffic (e.g. ARP ping-pong; or traffic from other VLANs because of a wrong VLAN ID; or many reconnects, or ...). |
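Both signs, sheer volume and constant repetition, can be roughly quantified from tcpdump's text output. The sketch below assumes the default `HH:MM:SS.ffffff` timestamp as the first field of each line; `frames_per_second`, `most_repeated`, `eth0` and `storm.pcap` are hypothetical names chosen for illustration.

```shell
# Frames per second: bucket the leading HH:MM:SS.ffffff timestamp by second.
# Prints "HH:MM:SS COUNT", one line per second, in time order.
frames_per_second() {
    awk '{ n[substr($1, 1, 8)]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Most repeated frames: drop the timestamp, then count identical lines.
# Prints "COUNT LINE", most frequent first.
most_repeated() {
    awk '{ $1 = ""; sub(/^ /, ""); c[$0]++ }
         END { for (l in c) print c[l], l }' | sort -rn
}

# Example use (placeholders throughout):
#   tcpdump -e -n -r storm.pcap | frames_per_second
#   tcpdump -e -n -r storm.pcap | most_repeated | head
```

A sudden jump in the per-second count together with a handful of lines that account for most of the capture would match the pattern described above. There is no universal threshold, so a baseline taken while the network behaves is the useful comparison.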
|
pietinger Moderator
Joined: 17 Oct 2006 Posts: 4253 Location: Bavaria
|
Posted: Fri May 27, 2022 8:59 am Post subject: |
|
|
P.S.: Vieri, please see the edit I made to my post before the last one. |
|