racluster and trans

Carter Bullard carter at qosient.com
Wed Jul 21 12:04:03 EDT 2010


Hey Rafael,
When you think you have a bug, send an argus datafile that demonstrates the problem,
and I can probably determine whether it really is a bug and fix it in a short period
of time.

OK, a few things.  All of the ra* aggregators have a mechanism to "correct" the
direction of a particular flow record.  Because you are tracking a direction-dependent
attribute, "saddr", you may be seeing the results of racluster() "correcting" a record's
direction.  If you don't want this type of correction, you need to specify that in a
racluster.conf file, but generally, correcting for direction is a very good thing.

See ./support/Config/racluster.conf and check out the racluster.1 manpage.
You will want to set this variable.

   RACLUSTER_AUTO_CORRECTION=no

Now that doesn't mean there isn't a bug; it just means that we have to account for that
possibility.  Look at the actual records that racluster() generates to see if you
understand that output, and if there are still problems, then send email.
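
To illustrate how direction correction can change a per-saddr tally, here is a minimal
sketch in Python.  It is not racluster()'s code: the "looks_reversed" flag, the tally()
helper and the addresses are all hypothetical, and the sketch simply assumes that a
correction swaps saddr and daddr before the record is counted.

```python
from collections import Counter

# (saddr, daddr, looks_reversed): the flag is a hypothetical input standing in
# for whatever test the aggregator applies to decide a record is backwards.
flows = [
    ("192.0.2.10", "192.0.2.1", False),
    ("192.0.2.10", "192.0.2.1", False),
    ("192.0.2.1", "192.0.2.10", True),
]

def tally(flows, auto_correct):
    """Count records per saddr, optionally swapping reversed records first."""
    counts = Counter()
    for saddr, daddr, looks_reversed in flows:
        if auto_correct and looks_reversed:
            saddr, daddr = daddr, saddr  # assumed effect of a direction correction
        counts[saddr] += 1
    return dict(counts)

print(tally(flows, auto_correct=True))
print(tally(flows, auto_correct=False))
```

With correction on, all three records land under one saddr; with it off, the third
record is counted under its own saddr.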


OK, with respect to your racluster->rabins inconsistency.  What number do you think
you are generating?  Number of unique flows per srcid per 5 seconds?  You will need
to change the call to rabins() in order to get that number.

All ra* aggregators insert an ARGUS_AGR_DSR information element into the records
they generate.  That is the structure that contains the 'N' (trans), 'mean',
'max', 'min', 'stddev' metrics for the aggregation.  When you run aggregation twice, the
next aggregator simply adds to any existing "agr" dsr.  This is important for lots of
reasons, but creates errors for some analytics.
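
A minimal sketch of that accumulation, in Python.  The Agr class and merge() helper are
hypothetical stand-ins, not argus structures, and only the sample count 'N' is modeled.

```python
from dataclasses import dataclass

@dataclass
class Agr:
    """Toy stand-in for the "agr" dsr; only N (trans) is modeled."""
    n: int = 1

def merge(records):
    """Aggregate records into one; existing N values are summed, not reset."""
    return Agr(n=sum(r.n for r in records))

raw = [Agr() for _ in range(4)]                # four unaggregated records, N = 1 each
first_pass = [merge(raw[:3]), merge(raw[3:])]  # first aggregation: two output records
second_pass = merge(first_pass)                # second aggregation reads two records

print([r.n for r in first_pass])
print(second_pass.n)
```

The second pass reads two records but reports N = 4, because it adds to the N values
already carried by its input.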

Rabins(), while it is an aggregator, also chops argus records along time boundaries.  In
your case, if a record spans a 5 second time boundary, rabins() will cut the argus record
into two records, and it will distribute the metrics as best it can.  For packet counts and
byte counts it is easy: it distributes the values based on the duration of the record.  But
there are no rules for how to distribute the values in the ARGUS_AGR_DSR.  What currently
happens is we copy the AGR, unmodified, into both records.  Based on the type of statistic,
this is the right thing to do in many cases.  However in your case, where you are counting,
you will over-count, because the numbers are duplicated for some records.
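
The effect can be sketched in Python as follows.  This is an illustration under stated
assumptions, not rabins() itself: Rec and split() are hypothetical names, the packet
share is rounded naively, and only 'N' from the AGR is modeled.

```python
from dataclasses import dataclass

BIN = 5.0  # bin size in seconds

@dataclass
class Rec:
    start: float
    end: float
    pkts: int
    n: int  # 'N' (trans) carried in the AGR

def split(rec, bin_size=BIN):
    """Cut rec at bin boundaries: pkts shared by duration, n copied unmodified."""
    dur = rec.end - rec.start
    if dur <= 0:
        return [rec]
    pieces = []
    t = rec.start
    while t < rec.end:
        edge = min(rec.end, (t // bin_size + 1) * bin_size)  # next boundary or end
        share = (edge - t) / dur
        pieces.append(Rec(t, edge, round(rec.pkts * share), rec.n))
        t = edge
    return pieces

pieces = split(Rec(start=3.0, end=7.0, pkts=8, n=2))
print(sum(p.pkts for p in pieces))  # packet total is preserved
print(sum(p.n for p in pieces))     # N is counted once per piece
```

The record crosses the boundary at 5 seconds, so it becomes two pieces.  The packets
still sum to 8, but summing N over the pieces gives 4 rather than 2.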

What I can do is modify the 'N' of the ARGUS_AGR_DSR statistic, to distribute the
number of samples for the statistic.  This may fix the inconsistency, and still preserve
the value of the statistic.  However, that will not generate the statistic you are actually
interested in.

In your example, it looks like you want to count the number of unique flows per srcid every
5 seconds?  You need to remove the "agr" dsr from the input data of your call to rabins().

   rabins -M dsrs="-agr"  -m srcid -M hard time 5s -r test.cluster -s stime trans

That should get you the metric you're after.
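
A rough Python sketch of why dropping the "agr" dsr changes the number.  The record
values and the per_bin() helper are made up for illustration, and the sketch assumes a
record without an "agr" dsr contributes 1 to trans.

```python
from collections import Counter

BIN = 5  # seconds

# (start time in seconds, N from a previous aggregation); hypothetical values
records = [(1, 3), (2, 1), (6, 4), (8, 2)]

def per_bin(records, keep_agr):
    """Sum trans per 5 second bin, with or without the inherited N."""
    counts = Counter()
    for start, n in records:
        bin_start = (start // BIN) * BIN
        counts[bin_start] += n if keep_agr else 1  # assumed: no agr means N = 1
    return dict(counts)

print(per_bin(records, keep_agr=True))
print(per_bin(records, keep_agr=False))
```

With the "agr" kept, each bin reports the inherited sample counts; with it removed, each
bin reports the number of flow records that fall in it.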

Carter



On Jul 21, 2010, at 11:04 AM, Rafael Barbosa wrote:

>  Hi,
> 
> I have been having some problems with inconsistent output from ragraph plotting Trans. I get different graphs when comparing the results from the "original" file with the ones reduced with racluster.
> 
> I dug a bit and found this old bug that might be related (http://thread.gmane.org/gmane.network.argus/6686/focus=6741):
> 
> Second, it seems racluster isn't adding up the trans field correctly, here is an example
> 
> ra -r file.argus -s saddr trans
>       27.8.77.166      1
>       27.8.77.166      1
>       18.9.27.219      1
>       18.9.27.219      1
>      18.86.96.147      1
>      18.86.96.147      1
>     19.32.203.136      1
>     19.32.203.136      1
> 
> racluster -r file.argus -m saddr -s saddr trans
>     19.32.203.136      4
>      18.86.96.147      3
>       18.9.27.219      4
>       27.8.77.166      3
> 
> This is what I get when I run something similar in one of my files:
> 
> ra -r file.argus -s saddr trans | sort
>         10.16.4.11      1
>         10.16.4.12      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.21      1
>         10.16.4.22      1
>         10.16.4.53      1
>         10.16.4.53      1
>         10.16.4.54      1
>         10.16.4.54      1
>         10.16.4.55      1
>         10.16.4.71      1
>         10.16.4.71      1
>        10.16.5.249      1
> racluster -r file.argus -m saddr -s saddr trans | sort
>         10.16.4.11      1
>         10.16.4.12      1
>         10.16.4.21     13
>         10.16.4.22      1
>         10.16.4.53      1
>         10.16.4.54      2
>         10.16.4.55      1
>         10.16.4.71      2
>        10.16.5.249      1
> 
> The count for 10.16.4.53 should be 2. I think there is a bug in racluster when calculating trans. Here is another weird result:
> ra -r big.file -N 100 -w test
> racluster -r test -w test.cluster
> rabins -m srcid -M hard time 5s -r test -s stime trans
>    14:37:15.000000     62
>    14:37:20.000000     72
>    14:37:25.000000     19
> rabins -m srcid -M hard time 5s -r test.cluster -s stime trans
>    14:37:15.000000     81
>    14:37:20.000000     76
>    14:37:25.000000     36
> 
> I get the same result if I use rasplit and later on racluster, instead of rabins.
> 
> Thanks,
> Rafael


