rc39 unlikely output for ESP flows
Carter Bullard
carter at qosient.com
Tue Feb 27 23:42:22 EST 2007
The loss data looks like it isn't being run through ntohs() properly,
so the little-endian byte order is probably biting us here.
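As a rough illustration of what a missed byte-order conversion does on a little-endian host, here is a minimal Python sketch. The counter value 3 is made up for the example, and the sketch is not meant to reproduce the actual sloss numbers in the trace; it only shows how a 16-bit field read without the swap comes out wrong.

```python
import struct

# Hypothetical 16-bit counter, packed in network (big-endian) byte order.
wire_bytes = struct.pack("!H", 3)

# Reading it as network order is the equivalent of applying ntohs().
converted = struct.unpack("!H", wire_bytes)[0]

# Reading the same bytes in little-endian order, with no swap applied,
# is what a little-endian host sees if the conversion is skipped.
unconverted = struct.unpack("<H", wire_bytes)[0]

print(converted, unconverted)
```

On a big-endian machine the two reads would agree, which is why this kind of bug tends to show up only on hosts like an i386.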
The INT will be there if there is no return traffic. I agree it
should probably be a CON, but the dir will be "->" because
the flow is not always bi-directional. You have a few ESP
flows that are bi-directional. That just means that the SPI
was the same for both directions of the ESP flow.
A '*' in a field means that the number/string is bigger than the
width used to print the value. For ESP traffic, the dport is
the ESP SPI, which is a 32-bit random number. The dport
is printed as a decimal number so you'll have to translate to
get the hex value.
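To make the translation concrete, here is a small Python sketch using the two SPIs from the quoted message below. The helper name spi_as_printed and the five-digit cutoff are assumptions chosen to match the output shown, not argus internals.

```python
def spi_as_printed(spi: int, shown_digits: int = 5) -> str:
    """Format an SPI as a decimal string, cut off with '*' when too wide.

    shown_digits is an assumed value picked to match the ra output
    in this thread; it is not taken from the argus source.
    """
    decimal = str(spi)
    if len(decimal) <= shown_digits:
        return decimal
    return decimal[:shown_digits] + "*"


# The two SPIs quoted in the original message.
for spi in (0x17E3E6F7, 0x0BFE0E52):
    print(f"{spi:#010x} -> {spi} -> {spi_as_printed(spi)}")
```

Going the other way, hex(400811767) gives back 0x17e3e6f7, but only when the full decimal value is available, so the field has to be printed wide enough that it is not truncated.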
Argus is a good tool for loss analysis, so let's fix the values so that
they are useful. If you can share the packet files, I'll fix the loss bugs.
Carter
Christoph Badura wrote:
>Hey Carter,
>
>last week I was trying to get a grip on ESP flows with rc.39.
>I captured packet traces with tcpdump, ran "argus -r trace.cap -w trace.argus"
>over them and looked at the results with ra() and racluster().
>This was all done on an i386 laptop, i.e. a little-endian machine, should it
>matter.
>
>I got some funny looking output.
>
>typical records from "ra -n -s +sloss +dloss -r trace.argus" are:
>
> 16:50:22.072174 F esp 217.115.67.22 <-> 194.127.190.2.20119* 1758 32 216196 25600 CON 31748770 0
> 16:50:22.053544 F esp 194.127.190.2 -> 217.115.67.22.40081* 5653 0 4297406 0 INT 12723374 0
>
>No, this wasn't initial, there were thousands of packets before that, no
>IKE SA renegotiations and, of course, the SPIs didn't change. I think
>those should all be listed as "CON" and "<->".
>
> 16:50:24.295125 udp 217.115.67.22.500 <-> 194.127.190.2.500 1 1 110 110 CON 0 0
> 16:50:27.468929 F esp 217.115.67.22 <-> 194.127.190.2.20119* 1223 20 158834 16000 CON 23920001 0
> 16:50:27.601715 F esp 194.127.190.2 -> 217.115.67.22.40081* 4770 0 3756048 0 INT 5410065 0
> 16:50:29.338213 udp 217.115.67.22.500 <-> 194.127.190.2.500 1 1 110 110 CON 0 0
> 16:50:32.469992 F esp 217.115.67.22 <-> 194.127.190.2.20119* 2349 42 288150 33600 CON 50176290 0
> 16:50:32.602202 F esp 194.127.190.2 -> 217.115.67.22.40081* 8576 0 6765172 0 INT 9145791 0
> 16:50:34.399414 udp 217.115.67.22.500 <-> 194.127.190.2.500 1 1 110 110 CON 0 0
> 16:50:37.472861 F esp 217.115.67.22 <-> 194.127.190.2.20119* 513 18 64342 14400 CON 11698440 0
> 16:50:37.603317 F esp 194.127.190.2 -> 217.115.67.22.40081* 1551 0 1217694 0 INT 2259263 0
> 16:50:39.449373 udp 217.115.67.22.500 <- 194.127.190.2.500 0 1 0 110 RSP 0 0
>
>What is the significance of "20119*" and "40081*"? I think that was
>discussed on the list a while ago, but I can't find it anymore. The
>relation to the SPIs (0x17e3e6f7 and 0x0bfe0e52) isn't obvious to me.
>
>Note the completely bogus sloss values.
>
>"racluster -n -s +sloss +dloss -r trace.argus" gives:
>
> 16:49:21.850751 F esp 217.115.67.22 <-> 194.127.190.2.20119* 21668 2836 2912456 2268544 CON 257404441 0
> 16:49:21.853440 esp 194.127.190.2 -> 217.115.67.22.40081* 76255 0 58849978 0 INT 67834978 0
> 16:49:22.858836 udp 217.115.67.22.500 <-> 194.127.190.2.500 11 16 1210 1760 CON 0 0
>
>This should be one ESP and one UDP flow, I think. But the behaviour is
>probably consistent with what can be expected from the output of ra().
>
>Sorry, I haven't had time to check out rc.40 yet. Hopefully tomorrow
>evening.
>
>There's one question I have, though. As you can see from the F flag,
>these ESP flows have fragments. Lots of fragments, actually. They are
>from 1480 byte TCP segments being fragmented on the tunnel entry point.
>What I am currently interested in is trying to figure out how I can use
>Argus (or a different tool) to monitor the packet loss rate of VPN
>tunnels and TCP connections in general. So, assuming I have identified a
>number of argus records with non-empty loss fields (perhaps by splitting
>them into time buckets with rabins()), is there a way to sort of "drill down"
>and get a view of the flows at the IP level?
>
>The idea behind this is that I think it is easier to get a grip on badly
>performing flows with a tool like argus than with collecting performance
>and packet drop data from really large numbers of routers and switches.
>Many of them will be beyond the local organization's border routers and
>hence not pollable! And still I'd want to know if my traffic flows going
>off-site are behaving normally or badly because I might have SLAs
>involving them.
>
>I guess your suggestion earlier this year to experiment with monitoring
>the performance of traffic flow for on-line games goes in the same direction.
>
>Maybe this should be moved to a separate thread.
>
>--chris
>
>
>