racluster and trans
Carter Bullard
carter at qosient.com
Fri Aug 20 09:40:07 EDT 2010
Hey Rafael,
I should change the scale of the graph.
Carter
On Aug 20, 2010, at 9:35 AM, Rafael Barbosa wrote:
> Hi Carter,
>
> Maybe I misunderstood what you meant, but I replicated the test:
>
> racluster -r test -w test.cluster
> ragraph trans -M 5s -r test.cluster -w test.cluster.png
> ragraph trans -M 5s -r test -w test.png
>
> with versions 3.0.2 (with ragraph altered to plot Concurrent Transactions instead of the average) and 3.0.3.17, and I see the same results. Were your changes intended to change the results of this test?
>
> Rafael
>
> On Mon, Jul 26, 2010 at 10:35 PM, Carter Bullard <carter at qosient.com> wrote:
> Hey Rafael,
> I found a number of bugs as a result of your report, and have fixed all that I found.
> These involved rabins() and ragraph(), and so these fixes will affect your effort.
>
> I modified how ragraph() processes "trans". We will now graph the actual
> number of Concurrent Transactions per time period, rather than the AVERAGE
> of the trans value. We were graphing connections per second, rather than the
> total number of transactions during a time interval. This seems more appropriate,
> and if there is a problem, please send email.
>
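To illustrate the distinction described above, here is a minimal sketch that bins records into 5-second intervals and reports both the total and the mean of the trans values per bin. It is not ragraph's implementation; the record tuples and the `per_bin` helper are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical flow records: (start time in seconds, trans value).
records = [(0.5, 1), (1.2, 1), (3.9, 2), (6.0, 1), (7.5, 3)]

BIN = 5  # seconds, matching the "-M 5s" used in the thread


def per_bin(records, bin_size):
    """Return {bin_start: (total_trans, mean_trans)} for each time bin."""
    grouped = defaultdict(list)
    for start, trans in records:
        # Map the start time onto the beginning of its bin.
        grouped[int(start // bin_size) * bin_size].append(trans)
    return {b: (sum(v), sum(v) / len(v)) for b, v in sorted(grouped.items())}


result = per_bin(records, BIN)
for bin_start, (total, mean) in result.items():
    print(bin_start, total, mean)
```

The two columns can diverge considerably: a bin holding many records with small trans values has a large total but a small mean, which is why the choice of statistic changes the shape of the graph.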
> Please try the latest argus-clients on the server.
> http://qosient.com/argus/dev/argus-clients-3.0.3.17.tar.gz
>
> This will fix processing files that have been previously aggregated, but you should
> use the "-M dsrs='-agr'" option when the files have been pre-processed as a general
> rule.
>
> Thanks!!!!
> Carter
>
> On Jul 22, 2010, at 4:58 AM, Rafael Barbosa wrote:
>
>> Hi,
>>
>> On the mailing list I found a problem similar to the one I was observing, but now I see that after "correcting" the flow direction, the test I reported does not make sense. Indeed, one of the 'saddr' values becomes a 'daddr', so there is no problem there.
>>
>> After your explanation regarding the aggregation metrics (N, mean, etc.), I understand the problem with aggregating/splitting flows. I will try to take it into account when getting statistics from the files. Regarding your question:
>>
>> In your example, looks like you want to count the number of unique flows per srcid every
>> 5 seconds?
>>
>> The test I did in reality was:
>> racluster -r test -w test.cluster
>> ragraph trans -M 5s -r test.cluster -w test.cluster.png
>> ragraph trans -M 5s -r test -w test.png
>>
>> Comparing the graphs, I see completely different results, so I tried to reproduce the results using rabins (it's easier to send its output to this list): report the number of flows per 5s bin.
>>
>> The proposed solution still does not reproduce the original results:
>>
>> rabins -M dsrs="-agr" -m srcid -M hard time 5s -r test/test.cluster -s stime trans
>> 14:37:15.000000 62
>> 14:37:20.000000 57
>> 14:37:25.000000 19
>>
>> For now I will avoid running rabins/racluster on files that are already aggregated.
>>
>> Rafael
>>
>> On Wed, Jul 21, 2010 at 6:04 PM, Carter Bullard <carter at qosient.com> wrote:
>> Hey Rafael,
>> When you think you have a bug, if you can send an argus datafile that demonstrates
>> the problem, I can probably determine if it really is a bug and fix it in a short period
>> of time.
>>
>> OK, A few things. All of the ra* aggregators have a mechanism to "correct" the
>> direction of a particular flow record. Because you are tracking a direction dependent
>> attribute, "saddr", you may be seeing the results of racluster() "correcting" a record's
>> direction. If you don't want this type of correction, you need to specify that in a
>> racluster.conf file, but generally, correcting for direction is a very good thing.
>>
>> See ./support/Config/racluster.conf and check out the racluster.1 manpage.
>> You will want to set this variable.
>>
>> RACLUSTER_AUTO_CORRECTION=no
>>
>> Now that doesn't mean there isn't a bug, just means that we have to account for that
>> possibility. Look at the actual records that racluster() generates, to see if you
>> understand that output, and if there are still problems, then send email.
>>
>>
>> OK, with respect to your racluster->rabins inconsistency. What number do you think
>> you are generating? Number of unique flows per srcid per 5 seconds? You will need
>> to change the call to rabins() in order to get that number.
>>
>> All ra* aggregators insert an ARGUS_AGR_DSR information
>> element into the records. That is the structure that contains the 'N' (trans), 'mean',
>> 'max', 'min', 'stddev' metrics for the aggregation. When you run aggregation twice, the
>> next aggregator simply adds to any existing "agr" dsr. This is important for lots of
>> reasons, but creates errors for some analytics.
>>
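A rough sketch of what "adding to an existing agr" can look like, assuming the aggregation block holds only a sample count, mean, minimum and maximum. The `Agr` class and `merge` function are hypothetical stand-ins, not the real ARGUS_AGR_DSR layout, and stddev is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Agr:
    """Hypothetical stand-in for the aggregation metrics of one record."""
    n: int          # number of samples ('N', reported as trans)
    mean: float
    minimum: float
    maximum: float


def merge(a, b):
    """Combine two aggregation blocks the way a second aggregation pass might."""
    n = a.n + b.n
    mean = (a.mean * a.n + b.mean * b.n) / n  # weighted by sample count
    return Agr(n, mean, min(a.minimum, b.minimum), max(a.maximum, b.maximum))


# Two records that were already aggregated once.
first_pass = [Agr(3, 2.0, 1.0, 3.0), Agr(2, 5.0, 4.0, 6.0)]

merged = merge(first_pass[0], first_pass[1])
print(merged.n)          # sample count carried over from the first pass
print(len(first_pass))   # number of records, ignoring any earlier agr
```

In this toy model the merged count is 5 while the number of input records is 2. Which of the two is wanted depends on the analytic, and that choice is what dropping the "agr" dsr with -M dsrs="-agr" appears to control.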
>> While rabins() is an aggregator, it also chops argus records along time lines. In your
>> case, if a record spans a 5 second time boundary, rabins() will cut the argus record into
>> two records, and it will distribute the metrics as best it can. For packet counts and byte counts,
>> it is easy: it distributes the values based on the duration of the record. But there are no
>> rules for how to distribute the values in the ARGUS_AGR_DSR. What currently happens
>> is we copy the AGR, unmodified, into both records. Based on the type of statistic, this
>> is the right thing to do in many cases. However in your case, where you are counting,
>> you will get over counting, due to the duplication of numbers for some records.
>>
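The over-counting described above can be reproduced with a toy model. This is only a sketch: the `bins_touched` helper and the sample records are hypothetical, and the real rabins() logic is more involved. Each record is assigned to every 5-second bin it overlaps, and its trans value is copied unmodified into each piece.

```python
import math
from collections import defaultdict

BIN = 5  # seconds


def bins_touched(start, end, bin_size=BIN):
    """Return the start times of every bin that the interval [start, end] overlaps."""
    first = int(start // bin_size)
    # A record ending exactly on a boundary does not spill into the next bin.
    last = max(first, math.ceil(end / bin_size) - 1)
    return [i * bin_size for i in range(first, last + 1)]


# Hypothetical aggregated records: (start, end, trans).
records = [(1.0, 3.0, 1), (4.0, 7.0, 1), (6.0, 8.0, 1), (9.0, 12.0, 2)]

counted = defaultdict(int)
for start, end, trans in records:
    for b in bins_touched(start, end):
        counted[b] += trans  # the same trans value is copied into every piece

true_total = sum(trans for _, _, trans in records)
print(dict(counted), sum(counted.values()), true_total)
```

Two of the four records cross a bin boundary, so their trans values are counted twice and the per-bin sum exceeds the true total.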
>> What I can do is modify the 'N' of the ARGUS_AGR_DSR statistic, to distribute the
>> number of samples for the statistic. This may fix the inconsistency, and still preserve
>> the value of the statistic. However, that will not generate the statistic you are actually
>> interested in.
>>
>> In your example, looks like you want to count the number of unique flows per srcid every
>> 5 seconds? You need to remove the "agr" dsr for the input data of your call to rabins().
>>
>> rabins -M dsrs="-agr" -m srcid -M hard time 5s -r test.cluster -s stime trans
>>
>> That should get you the metric you're after.
>>
>> Carter
>>
>>
>>
>> On Jul 21, 2010, at 11:04 AM, Rafael Barbosa wrote:
>>
>>> Hi,
>>>
>>> I have been having some problems with inconsistent output from ragraph when plotting trans. I get different graphs when comparing the results from the "original" file with the ones reduced with racluster.
>>>
>>> I dug a bit and found this old bug report that might be related (http://thread.gmane.org/gmane.network.argus/6686/focus=6741):
>>>
>>> Second, it seems racluster isn't adding up the trans field correctly, here is an example
>>>
>>> ra -r file.argus -s saddr trans
>>> 27.8.77.166 1
>>> 27.8.77.166 1
>>> 18.9.27.219 1
>>> 18.9.27.219 1
>>> 18.86.96.147 1
>>> 18.86.96.147 1
>>> 19.32.203.136 1
>>> 19.32.203.136 1
>>>
>>> racluster -r file.argus -m saddr -s saddr trans
>>> 19.32.203.136 4
>>> 18.86.96.147 3
>>> 18.9.27.219 4
>>> 27.8.77.166 3
>>>
>>> This is what I get when I run something similar in one of my files:
>>>
>>> ra -r file.argus -s saddr trans | sort
>>> 10.16.4.11 1
>>> 10.16.4.12 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.21 1
>>> 10.16.4.22 1
>>> 10.16.4.53 1
>>> 10.16.4.53 1
>>> 10.16.4.54 1
>>> 10.16.4.54 1
>>> 10.16.4.55 1
>>> 10.16.4.71 1
>>> 10.16.4.71 1
>>> 10.16.5.249 1
>>>
>>> racluster -r file.argus -m saddr -s saddr trans | sort
>>> 10.16.4.11 1
>>> 10.16.4.12 1
>>> 10.16.4.21 13
>>> 10.16.4.22 1
>>> 10.16.4.53 1
>>> 10.16.4.54 2
>>> 10.16.4.55 1
>>> 10.16.4.71 2
>>> 10.16.5.249 1
>>>
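As a cross-check of the two listings above, the following sketch sums trans per saddr directly from the ra output and compares the result with what racluster reported. It assumes a plain group-by with no direction correction, which is not necessarily what racluster does (the replies in this thread explain that direction correction can turn a 'saddr' into a 'daddr').

```python
from collections import Counter

# saddr values copied from the "ra ... | sort" listing; each record has trans == 1.
ra_rows = (
    ["10.16.4.11", "10.16.4.12"]
    + ["10.16.4.21"] * 13
    + ["10.16.4.22"]
    + ["10.16.4.53"] * 2
    + ["10.16.4.54"] * 2
    + ["10.16.4.55"]
    + ["10.16.4.71"] * 2
    + ["10.16.5.249"]
)

expected = Counter()
for saddr in ra_rows:
    expected[saddr] += 1  # add each record's trans value (1) to its saddr

# trans values copied from the "racluster -m saddr" listing.
reported = {
    "10.16.4.11": 1, "10.16.4.12": 1, "10.16.4.21": 13, "10.16.4.22": 1,
    "10.16.4.53": 1, "10.16.4.54": 2, "10.16.4.55": 1, "10.16.4.71": 2,
    "10.16.5.249": 1,
}

# Map each disagreeing address to (naive expectation, racluster value).
mismatches = {
    addr: (expected[addr], reported.get(addr))
    for addr in expected
    if expected[addr] != reported.get(addr)
}
print(mismatches)
```

Under this naive model only 10.16.4.53 disagrees: the group-by expects 2 and racluster reports 1.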
>>> The count for 10.16.4.53 should be 2. I think there is a bug in racluster when calculating trans. Here is another weird result:
>>> ra -r big.file -N 100 -w test
>>> racluster -r test -w test.cluster
>>> rabins -m srcid -M hard time 5s -r test -s stime trans
>>> 14:37:15.000000 62
>>> 14:37:20.000000 72
>>> 14:37:25.000000 19
>>> rabins -m srcid -M hard time 5s -r test.cluster -s stime trans
>>> 14:37:15.000000 81
>>> 14:37:20.000000 76
>>> 14:37:25.000000 36
>>>
>>> I get the same result if I use rasplit and later on racluster, instead of rabins.
>>>
>>> Thanks,
>>> Rafael
>>
>>
>>
>>
>> <test>
>
>
> Carter Bullard
> CEO/President
> QoSient, LLC
> 150 E 57th Street Suite 12D
> New York, New York 10022
>
> +1 212 588-9133 Phone
> +1 212 588-9134 Fax
>
>
>
>