Argus Flow Timeout Issues

Thu Nov 15 09:31:56 EST 2001

Hey Wozz,
This is a good time to review what Argus is doing for
timeouts.  I haven't looked at this in a long time, 
and if we find some issues and we can come to some
decisions, I'll tune the constants so we can get some
better behavior.

First a little history.  Argus-1.5 had multiple
timeout queues, about 5 in all.  These were used by
different protocols, because each protocol has its own
timeout domains.  All IP fragments of a single datagram
should come flying by in less than a few milliseconds, an
ICMP ping response should show up in less than a second,
DNS queries time out in the 60 second range, but a TCP
connection could be idle for hours and still be connected,
ready to pass traffic.

In Argus-2.0 we have only one queue, and information in
each flow control block to do the various timeouts that
we need.  This allows us to have as much flexibility as
possible with regard to how flows are reported and timed
out.

There are two basic timers on each flow in Argus-2.0.
The first is the status interval reporting timer for existing
flows.  When a flow has been active for a while, we should
generate a status report; we don't want to wait forever to
find out that Argus has seen a flow.  The Far Status
Timer is a global value and is configured on the command line.
It defaults to 60 seconds, so by default you will get a status
report on long lived flows every 60 seconds.  I like to use
values like 1 - 5 seconds in small workgroup networks and
15 - 30 in very large networks.  Argus supports Far Status
Timers of less than a second, and I've used values like 0.001
to do TCP bursting analysis on streaming media packet data.
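
As a rough illustration of the status timer, here is a minimal Python
sketch.  It is not the Argus source: the Flow class, the function names
and the rule that a flow must have seen packets since its last report
are all assumptions made for the example.  Only the 60 second default
comes from the text above.

```python
from dataclasses import dataclass

# Hypothetical names throughout; this is not the Argus source.
DEFAULT_STATUS_INTERVAL = 60.0  # seconds, the default quoted above


@dataclass
class Flow:
    key: str
    last_report: float  # time the flow was created or last reported
    packets_since_report: int = 0


def due_for_status(flow, now, interval=DEFAULT_STATUS_INTERVAL):
    """True if the flow saw traffic and the status interval has elapsed.

    Requiring traffic since the last report is an assumption of this sketch.
    """
    return flow.packets_since_report > 0 and (now - flow.last_report) >= interval


def report_status(flow, now):
    """Build a status record and restart the interval for this flow."""
    record = (flow.key, flow.packets_since_report, now)
    flow.last_report = now
    flow.packets_since_report = 0
    return record
```

Passing interval=0.001 to due_for_status models the sub-second
setting mentioned above; the check itself does not change.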

The second timeout is the flow idle timeout.  We set this
timeout value at the creation of the flow based on protocol
type.  Because there is an independent value for each
flow, we can have different values and we can change
the timeout during the life of the flow.  Argus uses all
of these strategies.  These are the idle timeout values
in argus-2.0.4:

          IP fragments -   5 seconds

            IGMP flows - 300 seconds
             ARP flows - 300 seconds
      Unknown protocol - 300 seconds

     Initial TCP flows -  15 seconds
     Initial UDP flows -  15 seconds
     Initial ESP flows -  15 seconds
    Initial ICMP flows -  15 seconds

 All established flows - 300 seconds

            TCP closed -  10 seconds

The logic behind this is that for protocols where we have
some sense of the state and the behavior, we should deal
with them efficiently, so they get short idle timers.
For flows that we expect to be persistent, but sporadic,
we should have long idle timeouts.
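
To make the table concrete, here is a small Python sketch of a
per-flow idle timeout that changes with the state of the flow.  The
numbers are copied from the argus-2.0.4 table above; the protocol and
state strings, the function names and the order of the checks are
assumptions for illustration, not how Argus itself is written.

```python
# Idle timeouts in seconds, copied from the argus-2.0.4 table above.
# All names and the string labels are hypothetical.
FRAGMENT_TIMEOUT = 5
INITIAL_TIMEOUT = 15
ESTABLISHED_TIMEOUT = 300
TCP_CLOSED_TIMEOUT = 10
DEFAULT_TIMEOUT = 300  # IGMP, ARP and unknown protocols

INITIAL_PROTOCOLS = {"tcp", "udp", "esp", "icmp"}


def idle_timeout(proto, state="initial"):
    """Return the idle timeout for a flow given its protocol and state."""
    if proto == "frag":
        return FRAGMENT_TIMEOUT
    if proto == "tcp" and state == "closed":
        return TCP_CLOSED_TIMEOUT
    if state == "established":
        return ESTABLISHED_TIMEOUT
    if proto in INITIAL_PROTOCOLS:
        return INITIAL_TIMEOUT
    return DEFAULT_TIMEOUT


def is_idle_expired(last_seen, now, proto, state="initial"):
    """True if the flow has been silent for at least its idle timeout."""
    return (now - last_seen) >= idle_timeout(proto, state)
```

Because the timeout is looked up from the current state on each check,
a TCP flow in this sketch moves from 15 seconds to 300 seconds once it
is established, and down to 10 seconds once it is closed.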

Now the above values may not be the most efficient,
so let's discuss.

Carter

Carter Bullard
QoSient, LLC
300 E. 56th Street, Suite 18K
New York, New York  10022

carter at qosient.com
Phone +1 212 588-9133
Fax   +1 212 588-9134
http://qosient.com

> -----Original Message-----
> From: Wozz [mailto:wozz+argus at wookie.net] 
> Sent: Wednesday, November 14, 2001 8:57 PM
> To: Carter Bullard
> Cc: argus-info at lists.andrew.cmu.edu
> Subject: Re: Ok, really, the last one.....ragator question
> 
> 
> On Wed, Nov 14, 2001 at 07:22:32PM -0500, Carter Bullard wrote:
> > 
> > Hey Wozz,
> >    The issue is the '?' in the direction indicator.
> > Argus is absolutely stateful, and it is saying
> > that it doesn't know precisely who the source or the destination is 
> > (because it didn't see a SYN or a SYN_ACK before it saw the FIN).  
> > With the '?' it's saying that the src and dst assignments may not be 
> > reliable.
> > 
> >    What probably happened is that argus timed the original flow out,
> > before the stray FIN came in, and because there is no flow cache, it
> > treats the lone FIN without any context.
> > 
> >    There is a solution.  First pass your traffic through ragator with
> > no configuration.  It will correct the '?'. If there was a flow that
> > this FIN belongs to, and it gets loaded into ragator before the FIN 
> > record is loaded, then it will discover the correct direction and 
> > merge the FIN report into the parent flow.
> > 
> 
> Ah ha!
> 
> I get it now.  It appears to work correctly now without all the extra
> flows.   What determines when a flow gets timed out?  Is that in argus
> or in my ragator config?  The command line I'm using now is:
> 
> ragator -r * -w - host a.b.c.d | ragator -w - -f fmodel.conf -r - | rasort -s startime -n -r -
> 
> Is there any way to shorten this?  Is there some way to make
> ragator do its default aggregation first, then the defined
> flows, without running ragator twice?
> 
> 
>