racluster and trans

Rafael Barbosa rrbarbosa at gmail.com
Fri Aug 20 09:55:57 EDT 2010


OK, I actually changed that some time ago in the version of ragraph I use, so
that it plots total transactions instead of the average.

Thanks,
Rafael

On Fri, Aug 20, 2010 at 3:40 PM, Carter Bullard <carter at qosient.com> wrote:

> Hey Rafael,
> I should change the scale of the graph.
>
> Carter
>
> On Aug 20, 2010, at 9:35 AM, Rafael Barbosa wrote:
>
> Hi Carter,
>
> Maybe I misunderstood what you meant, but I replicated the test:
>
> racluster -r test -w test.cluster
> ragraph trans -M 5s -r test.cluster -w test.cluster.png
> ragraph trans -M 5s -r test -w test.png
>
> with versions 3.0.2 (with ragraph altered to plot Concurrent Transactions
> instead of average) and 3.0.3.17 and I see the same results. Were your
> changes intended to change the results of this test?
>
> Rafael
>
> On Mon, Jul 26, 2010 at 10:35 PM, Carter Bullard <carter at qosient.com> wrote:
>
>> Hey Rafael,
>> I found a number of bugs as a result of your report, and have fixed all
>> that I found.  These involved rabins() and ragraph(), and so these fixes
>> will affect your effort.
>>
>> I modified how ragraph() processes "trans".  We will now graph the actual
>> number of Concurrent Transactions per time period, rather than the AVERAGE
>> of the trans value.  We were graphing connections per second, rather than
>> the total number of transactions during a time interval.  This seems more
>> appropriate, and if there is a problem, please send email.
>>
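To make the difference between the two ways of graphing "trans" concrete, here
is a toy sketch in Python. It is not ragraph's code: the record layout
(start time in seconds, trans) and the helper name trans_per_bin are
assumptions made only for illustration.

```python
from collections import defaultdict

def trans_per_bin(records, bin_size=5):
    """Group (stime_seconds, trans) records into fixed-size time bins.

    Returns {bin_start: (total_trans, average_trans)} so both values
    can be compared for the same bin.
    """
    grouped = defaultdict(list)
    for stime, trans in records:
        bin_start = int(stime // bin_size) * bin_size
        grouped[bin_start].append(trans)
    return {b: (sum(v), sum(v) / len(v)) for b, v in sorted(grouped.items())}

# Hypothetical input: three single-transaction records in the first bin,
# one single record and one pre-aggregated record (trans=3) in the second.
records = [(0, 1), (1, 1), (4, 1), (6, 1), (7, 3)]

for bin_start, (total, average) in trans_per_bin(records).items():
    print(bin_start, "total:", total, "average:", average)
```

For unaggregated input every record has trans = 1, so the average stays at 1
no matter how busy the bin is, while the total follows the number of records.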
>> Please try the latest argus-clients on the server.
>>    http://qosient.com/argus/dev/argus-clients-3.0.3.17.tar.gz
>>
>> This will fix processing files that have been previously aggregated, but
>> as a general rule you should use the "-M dsrs='-agr'" option when the
>> files have been pre-processed.
>>
>> Thanks!!!!
>> Carter
>>
>> On Jul 22, 2010, at 4:58 AM, Rafael Barbosa wrote:
>>
>> Hi,
>>
>> In the mailing list I found a similar problem to what I was observing, but
>> now I see that after "correcting" the flow direction, the test I reported
>> does not make sense. Indeed one of the 'saddr' becomes a 'daddr', so no
>> problem there.
>>
>> After your explanation regarding the aggregation metrics (N, mean, etc) I
>> understand the problem with aggregating/splitting flows. I will try to take
>> it into account when getting some statistics from the files. Regarding your
>> question:
>>
>>> In your example, looks like you want to count the number of unique flows
>>> per srcid every 5 seconds?
>>
>> The test I did in reality was:
>> racluster -r test -w test.cluster
>> ragraph trans -M 5s -r test.cluster -w test.cluster.png
>> ragraph trans -M 5s -r test -w test.png
>>
>> Comparing the graphs, I see completely different results, so I tried to
>> reproduce the results using rabins (it's easier to send its output to this
>> list): report the number of flows per 5s bin.
>>
>> The proposed solution still does not reproduce the original results:
>>
>> rabins -M dsrs="-agr" -m srcid -M hard time 5s -r test/test.cluster -s stime trans
>>    14:37:15.000000     62
>>    14:37:20.000000     57
>>    14:37:25.000000     19
>>
>> For now I will avoid running rabins/racluster on files that are already
>> aggregated.
>>
>> Rafael
>>
>> On Wed, Jul 21, 2010 at 6:04 PM, Carter Bullard <carter at qosient.com> wrote:
>>
>>> Hey Rafael,
>>> When you think you have a bug, if you can send an argus datafile that
>>> demonstrates the problem, I can probably determine if it really is a bug
>>> and fix it in a short period of time.
>>>
>>> OK, a few things.  All of the ra* aggregators have a mechanism to "correct"
>>> the direction of a particular flow record.  Because you are tracking a
>>> direction-dependent attribute, "saddr", you may be seeing the results of
>>> racluster() "correcting" a record's direction.  If you don't want this type
>>> of correction, you need to specify that in a racluster.conf file, but
>>> generally, correcting for direction is a very good thing.
>>>
>>> See ./support/Config/racluster.conf and check out the racluster.1 manpage.
>>> You will want to set this variable.
>>>
>>>    RACLUSTER_AUTO_CORRECTION=no
>>>
>>> Now that doesn't mean there isn't a bug, it just means that we have to
>>> account for that possibility.  Look at the actual records that racluster()
>>> generates, to see if you understand that output, and if there are still
>>> problems, then send email.
>>>
>>>
>>> OK, with respect to your racluster->rabins inconsistency.  What number do
>>> you think you are generating?  Number of unique flows per srcid per 5
>>> seconds?  You will need to change the call to rabins() in order to get
>>> that number.
>>>
>>> All ra* aggregators insert an ARGUS_AGR_DSR information element into the
>>> records.  That is the structure that contains the 'N' (trans), 'mean',
>>> 'max', 'min', 'stddev' metrics for the aggregation.  When you run
>>> aggregation twice, the next aggregator simply adds to any existing "agr"
>>> dsr.  This is important for lots of reasons, but creates errors for some
>>> analytics.
>>>
>>> Rabins(), while it is an aggregator, also chops argus records along time
>>> lines.  In your case, if a record spans a 5 second time boundary, rabins()
>>> will cut the argus record into two records, and it will distribute the
>>> metrics, as it can.  For packet counts and byte counts it is easy: it
>>> distributes the values based on the duration of the record.  But there are
>>> no rules for how to distribute the values in the ARGUS_AGR_DSR.  What
>>> currently happens is we copy the AGR, unmodified, into both records.  Based
>>> on the type of statistic, this is the right thing to do in many cases.
>>> However in your case, where you are counting, you will get over counting,
>>> due to the duplication of numbers for some records.
>>>
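The over counting described above can be reproduced with a small toy model in
Python. This is only a sketch of the idea, not the rabins() implementation:
the record tuples (start, end, trans), the sample values, and the helper names
bins_spanned and binned_trans_copied are all assumptions for illustration.

```python
import math
from collections import Counter

BIN = 5  # seconds, mirroring "-M hard time 5s"

def bins_spanned(start, end, bin_size=BIN):
    """Indices of the time bins touched by a record covering [start, end)."""
    first = math.floor(start / bin_size)
    # A record ending exactly on a boundary does not reach into the next bin;
    # a zero-length record still occupies the bin it starts in.
    last = max(first, math.ceil(end / bin_size) - 1)
    return range(first, last + 1)

def binned_trans_copied(records, bin_size=BIN):
    """Sum trans per bin when each piece of a split record keeps the full N."""
    totals = Counter()
    for start, end, trans in records:
        for b in bins_spanned(start, end, bin_size):
            totals[b * bin_size] += trans  # N copied unmodified into each piece
    return totals

# Hypothetical records: (start_seconds, end_seconds, trans).
# The last one is pre-aggregated (trans=2) and spans three bins.
records = [(1, 2, 1), (3, 7, 1), (6, 9, 1), (4, 12, 2)]

copied = binned_trans_copied(records)
true_total = sum(trans for _, _, trans in records)

print(dict(sorted(copied.items())))       # {0: 4, 5: 4, 10: 2}
print(sum(copied.values()), true_total)   # 10 5
```

In this toy data the per-bin sums add up to 10 although the input only holds 5
transactions, because every record that crosses a boundary is counted once per
bin it touches.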
>>> What I can do is modify the 'N' of the ARGUS_AGR_DSR statistic, to
>>> distribute the number of samples for the statistic.  This may fix the
>>> inconsistency, and still preserve the value of the statistic.  However,
>>> that will not generate the statistic you are actually interested in.
>>>
>>> In your example, looks like you want to count the number of unique flows
>>> per srcid every 5 seconds?  You need to remove the "agr" dsr for the input
>>> data of your call to rabins().
>>>
>>>    rabins -M dsrs="-agr"  -m srcid -M hard time 5s -r test.cluster -s stime trans
>>>
>>> That should get you the metric you're after.
>>>
>>> Carter
>>>
>>>
>>>
>>> On Jul 21, 2010, at 11:04 AM, Rafael Barbosa wrote:
>>>
>>>  Hi,
>>>
>>> I have been having some problems with inconsistent output from ragraph
>>> when plotting Trans. I get different graphs when comparing the results
>>> from the "original" file with the ones from the file reduced with racluster.
>>>
>>> I dug a bit and found this old bug that might be related
>>> (http://thread.gmane.org/gmane.network.argus/6686/focus=6741):
>>>
>>>> Second, it seems racluster isn't adding up the trans field correctly,
>>>> here is an example
>>>>
>>>> ra -r file.argus -s saddr trans
>>>>       27.8.77.166      1
>>>>       27.8.77.166      1
>>>>       18.9.27.219      1
>>>>       18.9.27.219      1
>>>>      18.86.96.147      1
>>>>      18.86.96.147      1
>>>>     19.32.203.136      1
>>>>     19.32.203.136      1
>>>>
>>>> racluster -r file.argus -m saddr -s saddr trans
>>>>     19.32.203.136      4
>>>>      18.86.96.147      3
>>>>       18.9.27.219      4
>>>>       27.8.77.166      3
>>>
>>>
>>> This is what I get when I run something similar in one of my files:
>>>
>>> ra -r file.argus -s saddr trans | sort
>>>         10.16.4.11      1
>>>         10.16.4.12      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.21      1
>>>         10.16.4.22      1
>>>         10.16.4.53      1
>>>         10.16.4.53      1
>>>         10.16.4.54      1
>>>         10.16.4.54      1
>>>         10.16.4.55      1
>>>         10.16.4.71      1
>>>         10.16.4.71      1
>>>        10.16.5.249      1
>>> racluster -r file.argus -m saddr -s saddr trans | sort
>>>         10.16.4.11      1
>>>         10.16.4.12      1
>>>         10.16.4.21     13
>>>         10.16.4.22      1
>>>         10.16.4.53      1
>>>         10.16.4.54      2
>>>         10.16.4.55      1
>>>         10.16.4.71      2
>>>        10.16.5.249      1
>>>
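As a cross-check on the totals one would expect, the trans column of the ra
listing above can be summed per saddr. A minimal Python sketch, assuming every
ra row simply adds its trans value to its own saddr (that is, no direction
correction swaps source and destination); sum_trans_by_saddr is a hypothetical
helper, not part of argus-clients.

```python
from collections import Counter

# The "ra -r file.argus -s saddr trans | sort" rows from the listing above.
RA_OUTPUT = """\
10.16.4.11 1
10.16.4.12 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.21 1
10.16.4.22 1
10.16.4.53 1
10.16.4.53 1
10.16.4.54 1
10.16.4.54 1
10.16.4.55 1
10.16.4.71 1
10.16.4.71 1
10.16.5.249 1
"""

def sum_trans_by_saddr(text):
    """Add up the trans column for each saddr in two-column ra output."""
    totals = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        saddr, trans = fields
        totals[saddr] += int(trans)
    return totals

totals = sum_trans_by_saddr(RA_OUTPUT)
for saddr in sorted(totals):
    print(f"{saddr:>18} {totals[saddr]:6d}")
```

Under that assumption the sum for 10.16.4.53 is 2 and the overall total is 24,
one more than the 23 transactions in the racluster output above.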
>>> The count for 10.16.4.53 should be 2. I think there is a bug in racluster
>>> when calculating trans. Here is another weird result:
>>> ra -r big.file -N 100 -w test
>>> racluster -r test -w test.cluster
>>> rabins -m srcid -M hard time 5s -r test -s stime trans
>>>    14:37:15.000000     62
>>>    14:37:20.000000     72
>>>    14:37:25.000000     19
>>> rabins -m srcid -M hard time 5s -r test.cluster -s stime trans
>>>    14:37:15.000000     81
>>>    14:37:20.000000     76
>>>    14:37:25.000000     36
>>>
>>> I get the same result if I use rasplit and later on racluster, instead of
>>> rabins.
>>>
>>> Thanks,
>>> Rafael
>>>
>>>
>>>
>>>
>> <test>
>>
>>
>>  Carter Bullard
>> CEO/President
>> QoSient, LLC
>> 150 E 57th Street Suite 12D
>> New York, New York  10022
>>
>> +1 212 588-9133 Phone
>> +1 212 588-9134 Fax
>>
>>
>>
>>
>