another ra oddity...

Carter Bullard cbullard at
Wed Oct 6 12:59:19 EDT 1999

Hey Russell,
Yeah, it basically aggregates all the Argus records and
then, once reading is complete, sorts the records by
start time.

The sorting algorithm is an insertion sort, using a
bin hash with chains.  I create a bin for each second in the
range of time that spans the collected input, and then sort
by microsecond within that bin, using a straight insertion.
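
As a rough illustration of the bin-per-second scheme described above, here is a minimal Python sketch. It is not the raconnections() source: the Record type, its field names, and sort_by_start_time() are hypothetical names made up for this example, and real Argus records carry many more fields.

```python
from collections import namedtuple

# Hypothetical record type, for illustration only.
Record = namedtuple("Record", ["start_sec", "start_usec", "label"])

def sort_by_start_time(records):
    """Sort records by start time using one bin (chain) per second."""
    records = list(records)
    if not records:
        return []
    first = min(r.start_sec for r in records)
    last = max(r.start_sec for r in records)
    # One bin for each second in the range of time spanned by the input.
    bins = [[] for _ in range(last - first + 1)]
    for rec in records:
        chain = bins[rec.start_sec - first]
        # Straight insertion by microsecond: walk back from the tail
        # until the correct slot in the chain is found.
        i = len(chain)
        while i > 0 and chain[i - 1].start_usec > rec.start_usec:
            i -= 1
        chain.insert(i, rec)
    # Reading the bins in order yields records sorted by start time.
    return [rec for chain in bins for rec in chain]
```

Note that the number of bins grows with the time span of the input (86400 for a full day), on top of the storage for the records themselves.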

I would suspect that you just have too many output records
for your 32MB.  No real need to run raconnections() on the
entire daily log, since only a very small percentage of flows
will span hourly boundaries.  

You get the greatest amount of aggregation when you have a small
detail interval, say '-d 5', and a relatively large collection interval,
such as your hourly output files.  Since the mean duration of
argus IP flow output is usually under 10 seconds, you don't get much
straight reduction, except when the -d option is set.  With -d
set, you'll reduce TCP reporting load to 25%, since there will
be 4 argus records, minimum, per TCP connection, and to 50% for
UDP request/response traffic, say for DNS.
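
The 25% and 50% figures are just the ratio of merged records to input records. A small Python sketch of that arithmetic, assuming each flow collapses to exactly one merged record (reporting_load() and the flow identifiers are hypothetical, used only for illustration):

```python
from collections import Counter

def reporting_load(record_flow_ids):
    """Fraction of the input record count left after merging.

    record_flow_ids holds one flow identifier per input record.
    Assumes every flow is merged into exactly one output record.
    """
    if not record_flow_ids:
        raise ValueError("no records")
    flows = Counter(record_flow_ids)
    return len(flows) / len(record_flow_ids)

# One TCP connection reported as 4 records merges down to 1.
tcp_load = reporting_load(["tcp-1"] * 4)
# One DNS request/response pair reported as 2 records merges down to 1.
udp_load = reporting_load(["dns-1"] * 2)
```

With 4 records per TCP connection, tcp_load comes out as 0.25, and with 2 records per UDP exchange, udp_load comes out as 0.5.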

For those few services whose flows do span hourly boundaries
(X-Windows, SNMP associations, management pings, Telnet, etc.),
you can pick the specific records out of the file by selectively
aggregating just those records, using filters on raconnections().


> -----Original Message-----
> From: Russell Fulton [mailto:r.fulton at]
> Sent: Wednesday, October 06, 1999 12:18 AM
> To: Bullard, Carter [NYPAR:DS46-I:EXCH]
> Subject: Re: RE: another ra oddity...
> >    Not much you can do here.  raconnections() will match them up
> > correctly when it merges them together.
> > 
> BTW I find that raconnections uses a lot of memory (this too is 
> unavoidable I guess).  I tried to process a whole day's logs from 24 
> hourly files and it runs my 32MB Linux box into the ground; raconnections 
> got up to 20MB and everything just sat.
> Yes, I am using the raconnections from the latest distribution.
> At the moment I just compress my hourly log files.  I'll try running 
> them through raconnections and piping the output to gzip.  The memory 
> usage for an hour's data is manageable (peaks at 10MB) and we get a 
> 15-20% size reduction...
> Cheers, Russell.
