Argus flow aggregation options

Carter Bullard via Argus-info argus-info at lists.andrew.cmu.edu
Fri Jan 8 11:20:36 EST 2016



Hey Fariba,
Look at rabins.1. It is designed to provide time, size and count 'bins' for record aggregation.
Check out the man page and if you have any questions, send more email.

rabins.1 is a generic argus data aggregator that is scoped by time, file size or record count.
For your example, you would specify that you want to generate 10s bins:

   rabins -M time 10s

rabins.1 will input flow data, and aggregate the records into a binned cache.  At the end of the scope criteria (time, input bytes or record count), rabins.1 will output its scope flow cache.
Default aggregation rules will generate the type of records you are interested in.

To generate uni-directional records and process uni-directional flow records, you will want to add the “-M uni” command line directive, as with all ra* programs.
 

rabins.1 takes in flow records and snips them to fit in the bins.  All metrics and objects have splitting methods.  Metrics are generally distributed across the bins.  So if you have a flow record with a duration of 1h, and you split it into 1s bins, you will get potentially 3600 records with all the metrics uniformly distributed into the new records.  There are exceptions to this, but when you think about it, you’ll get the idea.  If a resulting record has packet counts, then it will be output (see below).  A lot of work went into the algorithms, so it should generate expected results.  If you see something odd, send us email.
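
To make the splitting idea concrete, here is a minimal Python sketch of distributing a packet count uniformly across time bins. It is only an illustration and not the rabins.1 algorithm: the function name split_into_bins and the tuple layout are made up for this example, and the real code handles many more metrics as well as the exceptions mentioned above.

```python
def split_into_bins(start, end, pkts, bin_size):
    """Split one flow (times in seconds, pkts packets) into time bins.

    Hypothetical helper: each bin gets a share of pkts proportional to
    the time the flow spends inside that bin.
    Returns a list of (bin_start, bin_end, pkts_share) tuples.
    """
    duration = end - start
    bin_start = (start // bin_size) * bin_size
    if duration <= 0:
        # Zero-duration flow: everything lands in a single bin.
        return [(bin_start, bin_start + bin_size, float(pkts))]
    out = []
    while bin_start < end:
        bin_end = bin_start + bin_size
        overlap = min(end, bin_end) - max(start, bin_start)
        out.append((bin_start, bin_end, pkts * overlap / duration))
        bin_start = bin_end
    return out


# A 1h flow with 7200 packets, split into 1s bins.
pieces = split_into_bins(0, 3600, 7200, 1)
print(len(pieces), pieces[0])
```

A flow that only partly overlaps its first and last bins gets smaller shares there, for example split_into_bins(5, 25, 20, 10) puts 5, 10 and 5 packets into three 10s bins.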

If you are binning realtime streaming data (from a live feed), the thing to watch out for is the “-B <secs>” option, which you use to define a time buffer.  If your argus data source is configured to generate 5 second flow status records (which we recommend), then the -B option should be at least 5s.  If you’re collecting from a bunch of argi, and their clocks are off a bit, you will want to increase your time buffer to make sure all the data shows up before you close and write out the bin.  So, you can appreciate the need to synchronize your sensors.  Radium can correct for sensor time drift, if you have serious problems.
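
As a rough way to think about the time buffer, the Python sketch below decides which open bins are safe to write out. It assumes, purely for illustration, that a bin can be closed once the newest record time seen is at least the buffer length past the end of the bin; the names closable_bins and hold are hypothetical and this is not how rabins.1 is implemented.

```python
def closable_bins(open_bins, latest_seen, bin_size, hold):
    """Return the start times of bins that are safe to write out.

    Assumption for this sketch: a bin [start, start + bin_size) may be
    closed once the newest record time seen is at least `hold` seconds
    past the end of the bin.
    """
    return sorted(s for s in open_bins if latest_seen >= s + bin_size + hold)


# 10s bins, 5s buffer, newest record seen at t=26.
print(closable_bins({0, 10, 20}, latest_seen=26, bin_size=10, hold=5))
```

With sensors whose clocks disagree by a few seconds, a larger hold value keeps a bin open long enough for the late records to arrive.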

rabins.1 has a bunch of options; of particular interest are ‘zero’ and ‘hard’.

Normally, if rabins.1 doesn’t have any data to report for a bin, it doesn’t output anything.
With the “-M zero” mode, rabins.1 will generate an empty argus record for the time bin, which is important for graphing, etc.  So in the example above, where we generate potentially 3600 records for a 1 hour flow that is processed into 1s bins, with the “-M zero” option you should get all 3600 records.
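
The effect of zero-filling on a series you want to graph can be sketched in a few lines of Python. This is an illustration of the idea only; zero_fill and its arguments are hypothetical names, and real argus records carry far more than a packet count.

```python
def zero_fill(bins, first, last, bin_size):
    """Fill gaps in a binned series with zero counts.

    bins maps a bin start time to a packet count; the result is a list of
    (bin_start, pkts) covering first..last inclusive, with 0 for any bin
    that had no data.
    """
    return [(t, bins.get(t, 0)) for t in range(first, last + 1, bin_size)]


# Data only in the bins starting at 0s and 30s; 10s bins.
print(zero_fill({0: 4, 30: 1}, 0, 30, 10))
```

Without the fill, a plot would draw a line straight from the 0s bin to the 30s bin and hide the idle period in between.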

When rabins.1 reports on a time bin, the flow record timestamps will all be contained within the bin time range.  For some applications, you may want the timestamps to reflect the bin time boundaries.
The “-M hard” mode will change all the output timestamps to be the bin time boundaries, which can be statistically important.
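
The timestamp change can be pictured with this small Python sketch, which snaps a record's start and last times to the boundaries of the bin that contains it. The function harden is a hypothetical name for illustration and assumes the record has already been snipped to fit inside one bin.

```python
def harden(stime, ltime, bin_size):
    """Snap a record's start/last times to its bin boundaries.

    Assumes the record already lies within a single bin; raises
    ValueError otherwise.
    """
    bin_start = (stime // bin_size) * bin_size
    bin_end = bin_start + bin_size
    if ltime < stime or ltime > bin_end:
        raise ValueError("record does not fit in one bin")
    return bin_start, bin_end


# A record from 12.3s to 17.8s in 10s bins.
print(harden(12.3, 17.8, 10))
```

After this step every record in a bin reports the same duration, which is why it matters for rate calculations.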

OK, so give rabins.1 a try, and if you have any issues, send more email,

Carter



On Jan 8, 2016, at 9:41 AM, Fariba Haddadi via Argus-info <argus-info at lists.andrew.cmu.edu> wrote:

> Hi All,
>  
> I was wondering if Argus has an option to aggregate the flows within a time window (time interval). I mean, for example, every 10 sec, instead of having multiple flows with the same “SrcIP, SrcPort, DstIP, DstPort, proto” combination, only one aggregated flow is exported. If there is such an option, can you tell me how I can use it in uni-directional and bi-directional mode?
>  
>  
> Thanks,
> -Fariba
