Auz n00b
Joined: 01 Jul 2006 Posts: 18
|
Posted: Wed Sep 07, 2011 1:46 pm Post subject: Routing Problem. Arp Flux? Flux-related? Or the opposite? |
|
|
I get to use Gentoo at work (Linux 2.6.39-gentoo-r3 #1 SMP Mon Aug 8 14:51:45 BST 2011), but since one of the suggested fixes for this problem I've got so far is "I can install Windows 7 instead..." I could do with a little outside help.
The network is set up as follows: a private network in the office with a checkpoint firewall to the outside world, plus a VPN to our colo. My problem is that my box keeps sending traffic for one particular machine at the colo via the firewall instead of over the VPN.
The route is supposedly set correctly:
Code: | Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.17.140.97   0.0.0.0         UG    203    0        0 eth1
127.0.0.0       127.0.0.1       255.0.0.0       UG    0      0        0 lo
169.254.0.0     0.0.0.0         255.255.0.0     U     204    0        0 vboxnet0
172.17.140.0    0.0.0.0         255.255.252.0   U     203    0        0 eth1 |
A traceroute to most machines at the colo takes the right path:
Code: | traceroute to 10.60.2.40 (10.60.2.40), 30 hops max, 60 byte packets
1 172.17.140.96 0.235 ms 0.225 ms 0.219 ms
2 10.99.2.2 2.104 ms 2.102 ms 2.096 ms
3 10.60.2.40 7.234 ms 7.236 ms 7.230 ms |
But one in particular goes the wrong way:
Code: | traceroute to 10.60.2.58 (10.60.2.58), 30 hops max, 60 byte packets
1 172.17.140.100 0.254 ms 0.325 ms 0.393 ms
2 80.169.33.169 2.020 ms 2.018 ms 2.649 ms
3 80.169.31.173 5.673 ms 6.039 ms 6.038 ms
4 80.169.31.173 6.032 ms !H * * |
The only thing I can find to suspect is ARP flux, as .96 and .97 share the same MAC address:
Code: | ? (172.17.140.96) at 00:13:72:40:0f:36 [ether] on eth1
? (172.17.140.97) at 00:13:72:40:0f:36 [ether] on eth1 |
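For anyone wanting to check their own ARP cache for the same situation, here is a minimal sketch that groups entries by MAC address and reports any MAC answering for more than one IP. It assumes the `arp` output layout shown above (other versions may format lines differently), and the names `shared_macs` and `ARP_LINE` are just illustrative.

```python
import re
from collections import defaultdict

# Sample lines copied from the arp output in this post.
ARP_OUTPUT = """\
? (172.17.140.96) at 00:13:72:40:0f:36 [ether] on eth1
? (172.17.140.97) at 00:13:72:40:0f:36 [ether] on eth1
"""

# Matches the "host (ip) at mac [type] on iface" layout shown above.
ARP_LINE = re.compile(
    r"\((?P<ip>[\d.]+)\) at (?P<mac>[0-9a-fA-F:]{17}) .* on (?P<iface>\S+)"
)


def shared_macs(arp_text):
    """Return {(iface, mac): [ips]} for every MAC claimed by more than one IP."""
    seen = defaultdict(list)
    for line in arp_text.splitlines():
        m = ARP_LINE.search(line)
        if m:
            seen[(m["iface"], m["mac"].lower())].append(m["ip"])
    return {key: ips for key, ips in seen.items() if len(ips) > 1}


if __name__ == "__main__":
    for (iface, mac), ips in shared_macs(ARP_OUTPUT).items():
        print(f"{mac} on {iface} answers for: {', '.join(ips)}")
```

A shared MAC is not proof of a fault by itself; one device can legitimately hold several addresses, so this only flags entries worth a closer look.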
There are times when I can reach the machine in question... usually after a reboot and possibly once after I cleared the arp cache, but only briefly before the issue re-asserts itself.
Any help or direction to go on this would be appreciated... |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54805 Location: 56N 3W
|
Posted: Wed Sep 07, 2011 6:14 pm Post subject: |
|
|
Auz,
Code: | The only thing I can find to suspect is Arp Flux, as .96 and .97 share the same MAC address |
That's badly broken. It is a requirement of basic networking that MAC addresses on any network must be unique.
That applies to the internet too.
That it's not your problem is shown by Code: | traceroute to 10.60.2.58 (10.60.2.58), 30 hops max, 60 byte packets
1 172.17.140.100 0.254 ms 0.325 ms 0.393 ms
2 80.169.33.169 2.020 ms 2.018 ms 2.649 ms
3 80.169.31.173 5.673 ms 6.039 ms 6.038 ms
4 80.169.31.173 6.032 ms !H * * |
The packet is sent to the correct next hop; from there it's out of your hands, as you no longer influence the route. _________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
Auz n00b
Joined: 01 Jul 2006 Posts: 18
|
Posted: Wed Sep 07, 2011 7:08 pm Post subject: |
|
|
Thanks for the quick reply...
Quote: | Thats badly broken. It is a requirement of basic networking that Mac addresses on any network must be unique. |
I'd agree... there's apparently some rationale they have for it, and nobody else is running into problems with it, so persuading them to change might be fun.
Meanwhile...
Quote: | That its not your problem is shown by
Code: | traceroute to 10.60.2.58 (10.60.2.58), 30 hops max, 60 byte packets
1 172.17.140.100 0.254 ms 0.325 ms 0.393 ms
2 80.169.33.169 2.020 ms 2.018 ms 2.649 ms
3 80.169.31.173 5.673 ms 6.039 ms 6.038 ms
4 80.169.31.173 6.032 ms !H * * |
The packet is set to the correct, 'next hop', now its out of your hands as you no longer influence the route. |
It's the first hop that's wrong. It should go to 172.17.140.96 and then, if the target is outside, on to .100, e.g. for bbc.co.uk:
Code: | traceroute to bbc.co.uk (212.58.241.131), 30 hops max, 60 byte packets
1 172.17.140.96 0.207 ms 0.199 ms 0.201 ms
2 172.17.140.100 0.351 ms 0.434 ms 0.510 ms
3 80.169.33.169 4.569 ms 4.581 ms 4.590 ms
4 80.169.31.173 6.568 ms 6.576 ms 8.289 ms
5 195.66.224.103 10.530 ms 10.529 ms 10.541 ms
6 212.58.238.129 10.556 ms 10.488 ms 10.528 ms
7 212.58.241.131 10.482 ms 10.415 ms 10.356 ms |
And when things are working (as now, after a reboot):
Code: | traceroute to 10.60.2.58 (10.60.2.58), 30 hops max, 60 byte packets
1 172.17.140.96 0.185 ms 0.188 ms 0.185 ms
2 10.99.2.2 1.986 ms * *
3 10.60.2.58 2.082 ms 2.186 ms 2.174 ms |
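To catch the moment the path flips, the comparison above could be scripted roughly like this: pull the hop addresses out of numeric traceroute output and check whether the first hop is the expected gateway. This is only a sketch; it assumes the output layout shown in this thread, and `hops` and `first_hop_ok` are made-up helper names.

```python
import re

# Traces copied from the posts in this thread.
GOOD_TRACE = """\
traceroute to 10.60.2.58 (10.60.2.58), 30 hops max, 60 byte packets
1 172.17.140.96 0.185 ms 0.188 ms 0.185 ms
2 10.99.2.2 1.986 ms * *
3 10.60.2.58 2.082 ms 2.186 ms 2.174 ms
"""
BAD_TRACE = """\
traceroute to 10.60.2.58 (10.60.2.58), 30 hops max, 60 byte packets
1 172.17.140.100 0.254 ms 0.325 ms 0.393 ms
2 80.169.33.169 2.020 ms 2.018 ms 2.649 ms
3 80.169.31.173 5.673 ms 6.039 ms 6.038 ms
4 80.169.31.173 6.032 ms !H * *
"""

# A hop line starts with the hop number followed by an IPv4 address.
# Hops that only show "* * *" carry no address and are skipped.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(\d+\.\d+\.\d+\.\d+)\b")


def hops(trace_text):
    """Return the hop addresses, in order, from numeric traceroute output."""
    result = []
    for line in trace_text.splitlines():
        m = HOP_LINE.match(line)
        if m:
            result.append(m.group(2))
    return result


def first_hop_ok(trace_text, expected="172.17.140.96"):
    """True if the first responding hop is the expected gateway."""
    path = hops(trace_text)
    return bool(path) and path[0] == expected


if __name__ == "__main__":
    for name, trace in (("good", GOOD_TRACE), ("bad", BAD_TRACE)):
        print(name, hops(trace), "first hop ok:", first_hop_ok(trace))
```

Run from cron or a loop against live `traceroute -n` output, something like this would at least timestamp when the first hop changes from .96 to .100.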
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54805 Location: 56N 3W
|
Posted: Wed Sep 07, 2011 7:41 pm Post subject: |
|
|
Auz,
I should have read your post more carefully. Sorry about that.
Code: | Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         172.17.140.97   0.0.0.0         UG    203    0        0 eth1
127.0.0.0       127.0.0.1       255.0.0.0       UG    0      0        0 lo
169.254.0.0     0.0.0.0         255.255.0.0     U     204    0        0 vboxnet0
172.17.140.0    0.0.0.0         255.255.252.0   U     203    0        0 eth1 |
Code: | traceroute to 10.60.2.58 (10.60.2.58), 30 hops max, 60 byte packets
1 172.17.140.100 |
How does 172.17.140.100 get to be a next hop to anywhere for you?
It's not one of the gateways listed in your routing table.
For 172.17.140.0/22 you don't need a gateway
For 127.0.0.0/8 the gateway is 127.0.0.1 (localhost)
For 169.254.0.0/16 you don't use a gateway
Everything else goes to 172.17.140.97 as your next hop. 172.17.140.100 is not mentioned.
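The reading of the table above can be sketched as a longest-prefix match. This is a simplified model of the lookup, not the kernel's actual code: it ignores metrics and anything outside the main table, and `ROUTES` and `lookup` are illustrative names. The entries are taken from the posted routing table, with the default route written as `0.0.0.0/0`.

```python
import ipaddress

# (destination/netmask, gateway or None when on-link, interface),
# copied from the posted routing table.
ROUTES = [
    ("0.0.0.0/0", "172.17.140.97", "eth1"),
    ("127.0.0.0/255.0.0.0", "127.0.0.1", "lo"),
    ("169.254.0.0/255.255.0.0", None, "vboxnet0"),
    ("172.17.140.0/255.255.252.0", None, "eth1"),
]


def lookup(dest, routes=ROUTES):
    """Return (network, gateway, iface) of the most specific matching route."""
    addr = ipaddress.ip_address(dest)
    best = None
    for net_text, gateway, iface in routes:
        net = ipaddress.ip_network(net_text)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, iface)
    return best


if __name__ == "__main__":
    for dest in ("10.60.2.58", "10.60.2.40", "172.17.140.100", "212.58.241.131"):
        net, gateway, iface = lookup(dest)
        via = gateway if gateway else "on-link"
        print(f"{dest}: matches {net}, next hop {via}, dev {iface}")
```

Under this model both colo hosts fall through to the default route via 172.17.140.97, and 172.17.140.100 is only ever reached directly as a neighbour on 172.17.140.0/22, never used as a gateway.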
I wonder how that gets into your routing table, or gets used as a next hop if it's not there? |
Auz n00b
Joined: 01 Jul 2006 Posts: 18
|
Posted: Wed Sep 07, 2011 9:21 pm Post subject: |
|
|
Quote: | I wonder how that gets into your routing table, or gets used as a next hop if its not there ? |
I don't know... any suggestions as to where to look? I've run tcpdump, but I'm not sure what to look for. |
NeddySeagoon Administrator
Joined: 05 Jul 2003 Posts: 54805 Location: 56N 3W
|
Posted: Wed Sep 07, 2011 10:26 pm Post subject: |
|
|
Auz,
You could try a dirty hack.
Your system should never communicate with 172.17.140.100 directly.
Add a host route to direct traffic for 172.17.140.100 to 172.17.140.97.
It's a bit odd that the two next hops towards the internet are in the same subnet, and in the subnet you are on, as is shown by
Code: | traceroute to bbc.co.uk (212.58.241.131), 30 hops max, 60 byte packets
1 172.17.140.96 0.207 ms 0.199 ms 0.201 ms
2 172.17.140.100 0.351 ms 0.434 ms 0.510 ms |
172.17.140.96 is your odd gateway that shares a MAC address with 172.17.140.97, which is in your routing table.
172.17.140.100 is a gateway to the outside world.
Better yet might be a static route to 10.0.0.0/8 (or whatever netmask you need for your application) via 172.17.140.96. |
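As a rough illustration of what the two suggested routes would change, here is a small longest-prefix-match model with the extra entries added. It only models the routing table itself, so it cannot show whatever is currently steering packets to .100; `next_hop` and the list names are hypothetical. With iproute2 the static route would presumably be added with something like `ip route add 10.0.0.0/8 via 172.17.140.96 dev eth1`, untested on this network.

```python
import ipaddress


def next_hop(dest, routes):
    """Longest-prefix match: gateway of the most specific matching route.

    Returns None when the best match is an on-link route (no gateway).
    """
    addr = ipaddress.ip_address(dest)
    matches = [
        (ipaddress.ip_network(net), gateway)
        for net, gateway in routes
        if addr in ipaddress.ip_network(net)
    ]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Relevant routes from the posted table (None means on-link).
current = [
    ("0.0.0.0/0", "172.17.140.97"),
    ("172.17.140.0/22", None),
]

# Suggestion 1: host route sending traffic for .100 via .97.
with_host_route = current + [("172.17.140.100/32", "172.17.140.97")]

# Suggestion 2: static route for the colo range via .96.
with_static_route = current + [("10.0.0.0/8", "172.17.140.96")]

if __name__ == "__main__":
    print("colo host, current table: ", next_hop("10.60.2.58", current))
    print("colo host, static route:  ", next_hop("10.60.2.58", with_static_route))
    print("bbc.co.uk, static route:  ", next_hop("212.58.241.131", with_static_route))
    print(".100, current table:      ", next_hop("172.17.140.100", current))
    print(".100, host route:         ", next_hop("172.17.140.100", with_host_route))
```

The /8 entry is more specific than the default route, so colo traffic would prefer .96 while everything else keeps using .97.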