appbyte ratio

Carter Bullard carter at qosient.com
Fri May 3 14:39:45 EDT 2013


Hey John,
So when dealing with the ratio ( [s | d]appbytes / [s | d]bytes ), we do end
up with some issues we have to deal with.  It may not seem intuitive, but we
will have conditions where we end up with ( 0 / X ) and ( 0 / 0 ) as the actual
values for the metric, and ( 0 / X ) is a completely different state than ( 0 / 0 ).
While every flow record has to have at least some bytes in it, we can
easily have ( bytes == 0 ) in one of the directions, so it is a condition
we need to convey.  Could we return -1 for ( 0 / 0 ) to discriminate that
condition?

In dealing with all the zeros that we may get in this new metric, a few
situations shouldn't exist.  At least we know that when the denominator
of ( appbytes / bytes ) is zero, the numerator had better also be zero,
or something is definitely wrong ;O)
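
To make the three states concrete, here is a minimal sketch in Python.  The
function name appbyte_ratio is hypothetical, and the -1 return value is only
the convention floated in this thread, not something the argus clients
are known to implement.  It maps ( 0 / 0 ) to -1, ( 0 / X ) to 0, and treats
a nonzero numerator over a zero denominator as an error.

```python
def appbyte_ratio(appbytes: int, total_bytes: int) -> float:
    """Ratio of application bytes to total bytes for one flow direction.

    Hypothetical helper: -1.0 is a sentinel for the ( 0 / 0 ) case, as
    floated in this thread; it is an assumption, not argus behavior.
    """
    if total_bytes == 0:
        if appbytes != 0:
            # Payload bytes without any bytes at all cannot happen.
            raise ValueError("appbytes is nonzero but bytes is zero")
        return -1.0  # ( 0 / 0 ): nothing was sent in this direction
    # ( 0 / X ) falls through here and yields 0.0: traffic, but no payload.
    return appbytes / total_bytes


if __name__ == "__main__":
    for counters in [(0, 0), (0, 120), (300, 400)]:
        print(counters, appbyte_ratio(*counters))
```

Raising on the impossible case, instead of folding it into the sentinel,
keeps a corrupt record from looking like an idle direction.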

Carter

On May 2, 2013, at 1:14 AM, John Gerth <gerth at graphics.stanford.edu> wrote:

> I'm a big fan of the appbyte metric and have created and used their ratio in the past.
> 
> One interesting question that comes up is what to do with the 0's. It's important because
> knowing that one or both sides didn't send any payload can be significant (not to
> mention what to do when 0 is in the denominator).
> 
> /J
> 
> --
> John Gerth      gerth at graphics.stanford.edu  Gates 378   (650) 725-3273
> 
> On 5/1/13 5:23 AM, Carter Bullard wrote:
>> Hey Jesse,
>> How about we make a new field;  " [ s | d ]abr " for the [ src or dst ] appbyte ratio ?  I'll do that today.
>> 
>> Not sure what is happening with the multiple addresses showing up. That would seem to be a bug.  Can you share some data so I can try to recreate the
>> problem ?
>> 
>> Carter
>> 
>> On Apr 30, 2013, at 10:44 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>> 
>>> Hi Carter,
>>> 
>>> I've been working through this example; this is a very interesting approach in that you're boiling host network patterns into a single number that
>>> you can watch over time to indicate a change in the host...This sort of distillation seems like a big win, once you're instrumented to track it! ...
>>> 
>>> On that subject, I had some difficulties while trying to blindly implement the commands you gave and wanted to send back some notes and questions to
>>> the list...
>>> 
>>> * The text states you need "-M rmon" in the first racluster, but the example doesn't include it; I found it should be:
>>> 
>>> racluster -R argus_dir/ -M rmon -m saddr proto sport -w argus.out - 'ipv4'
>>> 
>>> * I found I could calculate the ratio of sappbytes/dappbytes (and create a 'label') using awk like:
>>> 
>>> awk '{if( $8 + 0 != 0) {LABEL="Balanced";RATIO=$7/$8; if ( RATIO > 1.5) {LABEL="Producer"}; if (RATIO < 0.95) {LABEL="Consumer"}; print $0,RATIO"\t"LABEL}}' ra_text_output_file
>>> 
>>> However my example is based on the fields in my rarc file, and thus this method isn't very elegant...and will also miss any records that are missing
>>> a field...It would seem that this metric would be easy to calculate with the clients themselves and would give the added benefit of allowing for
>>> ralabel'ing to be used on the metric (much more portable and useful I think)...I think this is a feature request... :)
>>> 
>>> * I wanted to start iterating through various test cases on my data, varying time ranges and networks that I examined. I found that I can get very
>>> 'off' results based on how I try to filter which networks I want...for instance:
>>> 
>>> This example will lead to hosts showing up multiple times in the final output
>>> # /usr/local/bin/racluster -r ${HOUR}* -M rmon -m saddr proto sport -w ${TMP1} - 'ipv4 and src net 10.10.10.0/24'
>>> # /usr/local/bin/racluster -r ${TMP1} -m saddr -w - | /usr/local/bin/rasort -r - -m sappbytes -s stime dur saddr proto sport sappbytes dappbytes
>>> 
>>> This example appears to be fine in the final output
>>> # /usr/local/bin/racluster -r ${HOUR}* -M rmon -m saddr proto sport -w ${TMP1} - 'ipv4 and net 10.10.10.0/24'
>>> # /usr/local/bin/racluster -r ${TMP1} -m saddr -w - | /usr/local/bin/rasort -r - -m sappbytes -s stime dur saddr proto sport sappbytes dappbytes
>>> 
>>> I think I have a misunderstanding about how racluster and filters interact; can you explain why the 'src' part in the first example would cause
>>> multiple entries for individual hosts in the final output?
>>> 
>>> Thank you for sharing your knowledge and experience to this community!
>>> 
>>> Cheers,
>>> 
>>> Jesse
>>> 
