VPN problems

Matthew Radey webmaster at freejazz.org
Tue Apr 5 23:00:46 UTC 2005


Greetings. I've been trying for some time now to get a net-to-net VPN to
work, but I'm running into some kind of packet loss or misdirection on
my gateways. Here is the setup:


  eth0       eth0     eth1          eth0     eth1       eth0
10.1.1.2---10.1.1.1--A.B.C.D======E.F.G.H--10.2.1.1---10.2.1.2
 client1        gateway1              gateway2         client2


So let's say I try to ping client2 from client1. With tcpdump I can
watch the packets go through the external interface of gateway1. Each
ping seems to produce six packets:

06:21:57.322443 IP (tos 0x0, ttl  63, id 29281, offset 0, flags [DF],
proto 51, length: 180) A.B.C.D > E.F.G.H:
AH(spi=0x059f0f7a,sumlen=16,seq=0x3a): IP (tos 0x0, ttl  64, id 15411,
offset 0, flags [DF], proto 50, length: 136) A.B.C.D > E.F.G.H:
ESP(spi=0x0a3f13cf,seq=0x3a)
06:21:57.322443 IP (tos 0x0, ttl  64, id 15411, offset 0, flags [DF],
proto 50, length: 136) A.B.C.D > E.F.G.H: ESP(spi=0x0a3f13cf,seq=0x3a)
06:21:57.322443 IP (tos 0x0, ttl  63, id 57, offset 0, flags [DF], proto
1, length: 84) 10.95.244.2 > 10.95.211.2: icmp 64: echo request seq 57
06:21:57.342832 IP (tos 0x0, ttl  64, id 227, offset 0, flags [DF],
proto 17, length: 544) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
06:21:57.385329 IP (tos 0x0, ttl  63, id 114, offset 0, flags [DF],
proto 17, length: 360) A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
06:21:57.386213 IP (tos 0x0, ttl  64, id 228, offset 0, flags [DF],
proto 17, length: 88) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]

More often it's only four packets, though, where the packets of length
136 and 84 are not there.

On gateway2, each ping seems to produce only three packets, so it's
missing the ones with a length of 180, 136, and 84:

13:03:31.355526 IP (tos 0x0, ttl  63, id 1949, offset 0, flags [DF],
proto 17, length: 544) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
13:03:31.399320 IP (tos 0x0, ttl  64, id 1166, offset 0, flags [DF],
proto 17, length: 360) A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
13:03:31.401131 IP (tos 0x0, ttl  63, id 1950, offset 0, flags [DF],
proto 17, length: 88) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
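A comparison like this is easier to repeat with a small script than by eye. The following is only a sketch, assuming the captures are pasted as text in the same tcpdump -v layout shown above, where each record starts with a timestamp and wraps onto continuation lines. The names outer_lengths and missing_lengths are made up for illustration. Only the first "length:" of each record is counted, so the ESP packet that tcpdump prints nested inside the AH record is not counted twice.

```python
import re
from collections import Counter

# A record starts with a timestamp; wrapped continuation lines do not.
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d+ ")
LENGTH = re.compile(r"length: (\d+)")

GATEWAY1 = """\
06:21:57.322443 IP (tos 0x0, ttl  63, id 29281, offset 0, flags [DF],
proto 51, length: 180) A.B.C.D > E.F.G.H:
AH(spi=0x059f0f7a,sumlen=16,seq=0x3a): IP (tos 0x0, ttl  64, id 15411,
offset 0, flags [DF], proto 50, length: 136) A.B.C.D > E.F.G.H:
ESP(spi=0x0a3f13cf,seq=0x3a)
06:21:57.322443 IP (tos 0x0, ttl  64, id 15411, offset 0, flags [DF],
proto 50, length: 136) A.B.C.D > E.F.G.H: ESP(spi=0x0a3f13cf,seq=0x3a)
06:21:57.322443 IP (tos 0x0, ttl  63, id 57, offset 0, flags [DF], proto
1, length: 84) 10.95.244.2 > 10.95.211.2: icmp 64: echo request seq 57
06:21:57.342832 IP (tos 0x0, ttl  64, id 227, offset 0, flags [DF],
proto 17, length: 544) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
06:21:57.385329 IP (tos 0x0, ttl  63, id 114, offset 0, flags [DF],
proto 17, length: 360) A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
06:21:57.386213 IP (tos 0x0, ttl  64, id 228, offset 0, flags [DF],
proto 17, length: 88) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
"""

GATEWAY2 = """\
13:03:31.355526 IP (tos 0x0, ttl  63, id 1949, offset 0, flags [DF],
proto 17, length: 544) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
13:03:31.399320 IP (tos 0x0, ttl  64, id 1166, offset 0, flags [DF],
proto 17, length: 360) A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
13:03:31.401131 IP (tos 0x0, ttl  63, id 1950, offset 0, flags [DF],
proto 17, length: 88) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid
: phase 2/others ? oakley-quick[E]: [encrypted hash]
"""


def outer_lengths(capture):
    """Count records by their first (outermost) IP length."""
    records = []
    for line in capture.splitlines():
        if TIMESTAMP.match(line):
            records.append(line)                  # start of a new record
        elif records:
            records[-1] += " " + line.strip()     # rejoin a wrapped line
    counts = Counter()
    for record in records:
        match = LENGTH.search(record)             # first match is the outer header
        if match:
            counts[int(match.group(1))] += 1
    return counts


def missing_lengths(sender, receiver):
    """Lengths present in the sender's capture but absent from the receiver's."""
    seen = outer_lengths(receiver)
    return sorted(n for n in outer_lengths(sender) if n not in seen)


print(missing_lengths(GATEWAY1, GATEWAY2))  # [84, 136, 180]
```

Matching on length alone is crude, since two unrelated packets can share a length, but it is enough to confirm which of the six packets never show up on the far side.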

On client2, all appears to be well. It receives a ping and responds to
it. But client1 never receives that response, and sniffing eth0 (the
internal interface) on gateway1 shows the outgoing ping but no ping
response being sent back to client1.

Now I reverse it, trying to ping client1 from client2. gateway2 behaves
the same way, still missing the 180, 136, and 84 packets:

06:13:11.805316 IP (tos 0x0, ttl  64, id 97, offset 0, flags [DF], proto
17, length: 544) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid :
phase 2/others I oakley-quick[E]: [encrypted hash]
06:13:11.847381 IP (tos 0x0, ttl  63, id 49, offset 0, flags [DF], proto
17, length: 360) A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid :
phase 2/others R oakley-quick[E]: [encrypted hash]
06:13:11.848253 IP (tos 0x0, ttl  64, id 98, offset 0, flags [DF], proto
17, length: 88) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid :
phase 2/others I oakley-quick[E]: [encrypted hash]

...and now gateway1 behaves the same way too:

14:08:27.937838 IP (tos 0x0, ttl  63, id 81, offset 0, flags [DF], proto
17, length: 544) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid :
phase 2/others I oakley-quick[E]: [encrypted hash]
14:08:27.980288 IP (tos 0x0, ttl  64, id 41, offset 0, flags [DF], proto
17, length: 360) A.B.C.D.isakmp > E.F.G.H.isakmp: isakmp 1.0 msgid :
phase 2/others R oakley-quick[E]: [encrypted hash]
14:08:27.981904 IP (tos 0x0, ttl  63, id 82, offset 0, flags [DF], proto
17, length: 88) E.F.G.H.isakmp > A.B.C.D.isakmp: isakmp 1.0 msgid :
phase 2/others I oakley-quick[E]: [encrypted hash]

No ping is received by client1 in this case, so the tunnel is even less
fully formed than it is in the original scenario.

One interesting thing I noticed is in the kernel routing tables. gateway1
uses eth1 as the external interface, and gateway2 uses eth0, but the
/etc/sysconfig/network-scripts/ifup script seems to assign the APIPA
(169.254.0.0/16) route to whichever interface it brings up last,
regardless of whether that's the internal or external interface:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
A.B.C.0         *               255.255.255.0   U     0      0        0 eth1
10.1.1.0        *               255.255.255.0   U     0      0        0 eth0
169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         I.J.K.L         0.0.0.0         UG    0      0        0 eth1

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
E.F.G.0         *               255.255.255.0   U     0      0        0 eth0
10.2.1.0        *               255.255.255.0   U     0      0        0 eth1
169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         M.N.O.P         0.0.0.0         UG    0      0        0 eth0

Also there's no entry for the lo interface in either table, which seems
odd to me since it shows up when I run ifconfig. How either of these
would affect my VPN I'm not sure, though I do see messages like these in
syslog that make me wonder:

racoon: INFO: 127.0.0.1[500] used as isakmp port (fd=10)

I tried setting the APIPA destination so that it was the external
interface on both machines with '/sbin/ip route replace 169.254.0.0/16
dev eth0' on E.F.G.H. If I repeat the client1 to client2 ping after
this, the only change is that gateway2 is no longer missing the 180
length packet. If I change the APIPA destination to the internal
interface on both machines and repeat the client1 to client2 ping,
gateway2 shows all six of the original packets, but poor gateway1 now
only shows packets with a 180 length. Doing a client2 to client1 ping
with the internal interfaces as the APIPA destination shows both
gateway1 and gateway2 missing the 180, 136, and 84 packets, and client1
never receives the ping.
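Checking which interface holds the APIPA route on each gateway can also be scripted. This is a rough sketch, assuming the route output is pasted as text and that the external interface is the one carrying the default route, which is true of both tables above. The function names route_iface and apipa_on_external are hypothetical.

```python
GATEWAY1_ROUTES = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
A.B.C.0         *               255.255.255.0   U     0      0        0 eth1
10.1.1.0        *               255.255.255.0   U     0      0        0 eth0
169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         I.J.K.L         0.0.0.0         UG    0      0        0 eth1
"""

GATEWAY2_ROUTES = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
E.F.G.0         *               255.255.255.0   U     0      0        0 eth0
10.2.1.0        *               255.255.255.0   U     0      0        0 eth1
169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         M.N.O.P         0.0.0.0         UG    0      0        0 eth0
"""


def route_iface(table, destination):
    """Return the Iface column of the first row for destination, else None."""
    for line in table.splitlines():
        fields = line.split()
        # Rows have eight columns and the last one is Iface. The header is
        # skipped because its first field never equals a destination.
        if len(fields) == 8 and fields[0] == destination:
            return fields[-1]
    return None


def apipa_on_external(table):
    """True if the 169.254.0.0 route uses the default-route interface."""
    # Assumption: the interface carrying the default route is the external one.
    return route_iface(table, "169.254.0.0") == route_iface(table, "default")


print(apipa_on_external(GATEWAY1_ROUTES))  # True
print(apipa_on_external(GATEWAY2_ROUTES))  # False
```

Run against the two tables above, it reports that gateway1 has the APIPA route on its external interface while gateway2 has it on its internal one.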

Other notes:

1. Both gateway boxes are running ipsec-tools-0.3.3-6 and kernel
2.6.9-5.0.3.EL.

2. I've been able to repeat this whole scenario by putting in a third
gateway and client at a totally different location and having it try to
VPN to either of the other gateways.

3. I've added rules to iptables to make it log packets before dropping
them, but it's not showing anything being dropped.

4. The racoon configuration on both gateways has been carefully checked
to be sure keys, algorithms, etc. are in sync.

5. NAT is working for the clients. They can reach the Internet through
the gateways.

I'm about ready to throw up my hands. What am I missing?


Matthew



