rc39 unlikely output for ESP flows

Christoph Badura bad at bsd.de
Tue Feb 27 17:44:29 EST 2007


Hey Carter,

Last week I was trying to get a grip on ESP flows with rc.39.
I captured packet traces with tcpdump, ran "argus -r trace.cap -w trace.argus"
over them and looked at the results with ra() and racluster().
This was all done on an i386 laptop, i.e. a little-endian machine, should it
matter.

I got some funny looking output.

Typical records from "ra -n -s +sloss +dloss -r trace.argus" are:

   16:50:22.072174       F     esp      217.115.67.22          <->      194.127.190.2.20119*     1758       32       216196        25600   CON   31748770 0
   16:50:22.053544       F     esp      194.127.190.2           ->      217.115.67.22.40081*     5653        0      4297406            0   INT   12723374 0

No, this wasn't the initial exchange: there were thousands of packets before
that, no IKE SA renegotiations and, of course, the SPIs didn't change.  I think
those should all be listed as "CON" and "<->".

   16:50:24.295125             udp      217.115.67.22.500      <->      194.127.190.2.500           1        1          110          110   CON          0 0
   16:50:27.468929       F     esp      217.115.67.22          <->      194.127.190.2.20119*     1223       20       158834        16000   CON   23920001 0
   16:50:27.601715       F     esp      194.127.190.2           ->      217.115.67.22.40081*     4770        0      3756048            0   INT    5410065 0
   16:50:29.338213             udp      217.115.67.22.500      <->      194.127.190.2.500           1        1          110          110   CON          0 0
   16:50:32.469992       F     esp      217.115.67.22          <->      194.127.190.2.20119*     2349       42       288150        33600   CON   50176290 0
   16:50:32.602202       F     esp      194.127.190.2           ->      217.115.67.22.40081*     8576        0      6765172            0   INT    9145791 0
   16:50:34.399414             udp      217.115.67.22.500      <->      194.127.190.2.500           1        1          110          110   CON          0 0
   16:50:37.472861       F     esp      217.115.67.22          <->      194.127.190.2.20119*      513       18        64342        14400   CON   11698440 0
   16:50:37.603317       F     esp      194.127.190.2           ->      217.115.67.22.40081*     1551        0      1217694            0   INT    2259263 0
   16:50:39.449373             udp      217.115.67.22.500      <-       194.127.190.2.500           0        1            0          110   RSP          0 0

What is the significance of "20119*" and "40081*"?  I think that was
discussed on the list a while ago, but I can't find it anymore.  The
relation to the SPIs (0x17e3e6f7 and 0x0bfe0e52) isn't obvious to me.
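
To make the comparison concrete, here is a minimal sketch that prints a few
16-bit views of each SPI next to the two displayed values.  The candidate
views (high half, low half, and their byte-swapped variants) are assumptions
chosen for illustration, not something taken from the argus source, and
views16() is a hypothetical helper name.

```python
def views16(spi):
    """Return a few 16-bit views of a 32-bit SPI (illustrative guesses only)."""
    hi = (spi >> 16) & 0xFFFF
    lo = spi & 0xFFFF

    def swap(v):
        # Swap the two bytes of a 16-bit value.
        return ((v & 0xFF) << 8) | (v >> 8)

    return {
        "high": hi,
        "low": lo,
        "high_swapped": swap(hi),
        "low_swapped": swap(lo),
    }


SPIS = (0x17E3E6F7, 0x0BFE0E52)   # SPIs quoted in the text
DISPLAYED = (20119, 40081)        # values ra printed after the address

for spi in SPIS:
    v = views16(spi)
    matches = [name for name, val in v.items() if val in DISPLAYED]
    print(f"{spi:#010x}", v, "matches:", matches or "none")
```

None of these views equals either displayed value, which fits with the
relation not being obvious.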

Note the completely bogus sloss values.
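
A quick way to see how far off they are is to compare sloss with the source
packet count of the same record.  The following is a minimal sketch, assuming
the column order shown above, i.e. that the last seven fields of each line are
src pkts, dst pkts, src bytes, dst bytes, state, sloss and dloss.  The helper
names parse_tail() and implied_loss_fraction() are hypothetical and not part
of argus.

```python
def parse_tail(line):
    """Return the trailing counters of one ra record as a dict."""
    fields = line.split()
    # Parse from the end: the flags column may be empty, so the number of
    # leading fields varies between records.
    spkts, dpkts, sbytes, dbytes, state, sloss, dloss = fields[-7:]
    return {
        "spkts": int(spkts), "dpkts": int(dpkts),
        "sbytes": int(sbytes), "dbytes": int(dbytes),
        "state": state, "sloss": int(sloss), "dloss": int(dloss),
    }


def implied_loss_fraction(rec):
    """Fraction of source packets lost if sloss were a real packet count."""
    sent = rec["spkts"] + rec["sloss"]
    return rec["sloss"] / sent if sent else 0.0


records = [
    "16:50:22.072174 F esp 217.115.67.22 <-> 194.127.190.2.20119* "
    "1758 32 216196 25600 CON 31748770 0",
    "16:50:24.295125 udp 217.115.67.22.500 <-> 194.127.190.2.500 "
    "1 1 110 110 CON 0 0",
]

for line in records:
    rec = parse_tail(line)
    print(rec["spkts"], rec["sloss"], f"{implied_loss_fraction(rec):.4%}")
```

For the first ESP record this would mean that more than 99.9% of the source's
packets were lost, which does not look plausible for a tunnel that was
otherwise carrying traffic.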

"racluster -n -s +sloss +dloss -r trace.argus" gives:

   16:49:21.850751       F     esp      217.115.67.22          <->      194.127.190.2.20119*    21668     2836      2912456      2268544   CON  257404441 0
   16:49:21.853440             esp      194.127.190.2           ->      217.115.67.22.40081*    76255        0     58849978            0   INT   67834978 0
   16:49:22.858836             udp      217.115.67.22.500      <->      194.127.190.2.500          11       16         1210         1760   CON          0 0

This should be one ESP and one UDP flow, I think.  But the behaviour is
probably consistent with what can be expected from the output of ra().

Sorry, I haven't had time to check out rc.40 yet.  Hopefully tomorrow
evening.

There's one question I have, though.  As you can see from the F flag,
these ESP flows have fragments.  Lots of fragments, actually.  They are
from 1480 byte TCP segments being fragmented on the tunnel entry point.
What I am currently interested in is trying to figure out how I can use
Argus (or a different tool) to monitor the packet loss rate of VPN
tunnels and TCP connections in general. So, assuming I have identified a
number of argus records with non-empty loss fields (perhaps by splitting
them into time buckets with rabins()), is there a way to sort of "drill down"
and get a view of the flows at the IP level?
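
The time-bucket idea can be sketched in plain Python as follows.  This only
illustrates the bucketing and is not how rabins() works internally; the
tuples are copied by hand from the ra output above (start time, src pkts,
sloss for one ESP flow), the 10 second bucket width is an arbitrary
assumption, and bucket() is a hypothetical helper name.

```python
from collections import defaultdict


def start_seconds(ts):
    """Convert an HH:MM:SS.ffffff timestamp to seconds since midnight."""
    h, m, s = ts.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)


def bucket(records, width=10):
    """Sum packets and loss per fixed-width time bucket, keyed by bucket start."""
    bins = defaultdict(lambda: {"pkts": 0, "loss": 0})
    for ts, pkts, loss in records:
        key = int(start_seconds(ts) // width) * width
        bins[key]["pkts"] += pkts
        bins[key]["loss"] += loss
    return dict(bins)


# (start time, src pkts, sloss) for the 217.115.67.22 <-> 194.127.190.2 flow
records = [
    ("16:50:22.072174", 1758, 31748770),
    ("16:50:27.468929", 1223, 23920001),
    ("16:50:32.469992", 2349, 50176290),
    ("16:50:37.472861", 513, 11698440),
]

for start, totals in sorted(bucket(records).items()):
    print(start, totals)
```

Buckets whose loss total stands out would be the ones to drill into.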

The idea behind this is that I think it is easier to get a grip on badly
performing flows with a tool like argus than by collecting performance
and packet drop data from really large numbers of routers and switches.
Many of them will be beyond the local organization's border routers and
hence not pollable!  And still I'd want to know if my traffic flows going
off-site are behaving normally or badly because I might have SLAs
involving them.

I guess your suggestion earlier this year to experiment with monitoring
the performance of traffic flow for on-line games goes in the same direction.

Maybe this should be moved to a separate thread.

--chris


