appbyte ratio
Carter Bullard
carter at qosient.com
Fri May 3 14:39:45 EDT 2013
Hey John,
So when dealing with the ratio ( [s | d]appbytes / [s | d]bytes ), we do end
up with some issues we have to deal with. It may not seem intuitive, but we
will have conditions where we end up with ( 0 / X ) and ( 0 / 0 ) as the actual
values for the metric, and ( 0 / X ) is a completely different state than ( 0 / 0 ).
While every flow record has to have at least some bytes in it, we can
easily have ( bytes == 0 ) in one of the directions, so it is a condition
we need to convey. Could we return -1 for ( 0 / 0 ) to discriminate that
condition?
In dealing with all the zeros that we may get in this new metric, a few
situations shouldn't exist. At least we know that when the denominator
of ( appbytes / bytes ) is zero, the numerator had better also be zero,
or something is definitely wrong ;O)
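To make the cases concrete, here is a minimal Python sketch of the handling described above. It is not the argus client code: the function name appbyte_ratio, the constant UNDEFINED_RATIO, and the choice to raise an exception on inconsistent input are all assumptions made for illustration. The only parts taken from the discussion are the -1 return for ( 0 / 0 ) and the rule that a zero denominator implies a zero numerator.

```python
# Hypothetical sketch; names and error handling are illustrative assumptions,
# not the actual argus implementation.

UNDEFINED_RATIO = -1.0  # proposed sentinel for the ( 0 / 0 ) case


def appbyte_ratio(appbytes: int, total_bytes: int) -> float:
    """Return appbytes / total_bytes for one direction of a flow.

    ( 0 / 0 ) -> UNDEFINED_RATIO: nothing was sent in this direction.
    ( 0 / X ) -> 0.0: packets were sent, but they carried no payload.
    """
    if total_bytes == 0:
        if appbytes != 0:
            # Denominator is zero but numerator is not: the record is inconsistent.
            raise ValueError("appbytes is non-zero while bytes is zero")
        return UNDEFINED_RATIO
    return appbytes / total_bytes


if __name__ == "__main__":
    # One call per state the ratio can be in.
    for app, total in [(0, 0), (0, 120), (60, 120)]:
        print(app, total, appbyte_ratio(app, total))
```

With this shape, a consumer can tell "no traffic in this direction" (-1) apart from "traffic but no payload" (0) without looking at the raw byte counts.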
Carter
On May 2, 2013, at 1:14 AM, John Gerth <gerth at graphics.stanford.edu> wrote:
> I'm a big fan of the appbyte metric and have created and used their ratio in the past.
>
> One interesting question that comes up is what to do with the 0's. It's important because
> knowing that one or both sides didn't send any payload can be significant (not to
> mention what to do when 0 is in the denominator).
>
> /J
>
> --
> John Gerth gerth at graphics.stanford.edu Gates 378 (650) 725-3273
>
> On 5/1/13 5:23 AM, Carter Bullard wrote:
>> Hey Jesse,
>> How about we make a new field; " [ s | d ]abr " for the [ src or dst ] appbyte ratio ? I'll do that today.
>>
>> Not sure what is happening with the multiple addresses showing up. That would seem to be a bug. Can you share some data so I can try to recreate the
>> problem ?
>>
>> Carter
>>
>> On Apr 30, 2013, at 10:44 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>>
>>> Hi Carter,
>>>
>>> I've been working through this example; this is a very interesting approach in that you're boiling host network patterns into a single number that
>>> you can watch over time to indicate a change in the host...This sort of distillation seems like a big win, once you're instrumented to track it! ...
>>>
>>> On that subject, I had some difficulties while trying to blindly implement the commands you gave and wanted to send back some notes and questions to
>>> the list...
>>>
>>> * The text states you need "-M rmon" in the first racluster, but the example doesn't include it; I found it should be:
>>>
>>> racluster -R argus_dir/ -M rmon -m saddr proto sport -w argus.out - 'ipv4'
>>>
>>> * I found I could calculate the ratio of sappbytes/dappbytes (and create a 'label') using awk like:
>>>
>>> awk '{if( $8 + 0 != 0) {LABEL="Balanced";RATIO=$7/$8; if ( RATIO > 1.5) {LABEL="Producer"}; if (RATIO < 0.95) {LABEL="Consumer"}; print $0,RATIO"\t"LABEL}}' ra_text_output_file
>>>
>>> However my example is based on the fields in my rarc file, and thus this method isn't very elegant...and will also miss any records that are missing
>>> a field...It would seem that this metric would be easy to calculate with the clients themselves and would give the added benefit of allowing for
>>> ralabel'ing to be used on the metric (much more portable and useful I think)...I think this is a feature request... :)
>>>
>>> * I wanted to start iterating through various test cases on my data, varying time ranges and networks that I examined. I found that I can get very
>>> 'off' results based on how I try to filter which networks I want...for instance:
>>>
>>> This example will lead to hosts showing up multiple times in the final output
>>> # /usr/local/bin/racluster -r ${HOUR}* -M rmon -m saddr proto sport -w ${TMP1} - 'ipv4 and src net 10.10.10.0/24'
>>> # /usr/local/bin/racluster -r ${TMP1} -m saddr -w - | /usr/local/bin/rasort -r - -m sappbytes -s stime dur saddr proto sport sappbytes dappbytes
>>>
>>> This example appears to be fine in the final output
>>> # /usr/local/bin/racluster -r ${HOUR}* -M rmon -m saddr proto sport -w ${TMP1} - 'ipv4 and net 10.10.10.0/24'
>>> # /usr/local/bin/racluster -r ${TMP1} -m saddr -w - | /usr/local/bin/rasort -r - -m sappbytes -s stime dur saddr proto sport sappbytes dappbytes
>>>
>>> I think I have a misunderstanding about how racluster and filters interact; can you explain why the 'src' part in the first example would cause
>>> multiple entries for individual hosts in the final output?
>>>
>>> Thank you for sharing your knowledge and experience to this community!
>>>
>>> Cheers,
>>>
>>> Jesse
>>>