argus and 95th percentile

Carter Bullard carter at qosient.com
Thu Apr 2 22:00:46 EDT 2009


Hey Rodney,
If you want to do interface statistics, then ethernet addresses are an
absolute must for nailing down the direction of the packets, bytes, etc.

You are trying to calculate a Layer 2 statistic, so you need a Layer 2
identifier.

If you have a simple IP address layout, you can generate the statistic
by aggregating at Layer 3, but to convert the numbers to interface stats,
you have to decide which "side" of the link you want the stats for, and
remove the Layer 3 address oriented numbers that are on the "other side".

Let's assume that subnet 1.2.3.0/24 is your network and it's on the
"right side" of the monitor, and all the rest of the world is on the
"left side".  This incantation will work:

    ra -M rmon -r file -w - - ip | rabins -r - -M hard time 5m -m srcid - src net 1.2.3.0/24

So what are we doing here?  We use "ra -M rmon" to remove the concept of
Src and Dst.  This causes all the IP addresses to end up in the "SrcAddr"
field, with all the metrics adjusted accordingly.  As a result, all the
pkts and bytes are evenly distributed between the "Src" and "Dst" metrics.

By filtering out all the IP addresses that relate to the activity on the
"left side", we end up with aggregate statistics for all the objects that
are on the "right side", so we end up with In and Out stats for the
"right side" boundary.

The SrcLoad is where all the objects on the right were the Source, so
this will be traffic "Out of" your subnet, and the DstLoad will be
traffic "Into" your subnet.
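
As an illustration of the idea above, here is a minimal Python sketch. It is not argus code. It assumes each flow is a dict with hypothetical keys (time, saddr, daddr, sbytes, dbytes), that the rmon step amounts to emitting one record per endpoint, and that load is bits per second averaged over the 5 minute bin. The names rmon_records and subnet_loads are invented for this example.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

BIN_SECONDS = 300  # 5 minute bins, as in "time 5m"


def rmon_records(flow):
    """Split one bidirectional flow into two single-address records.

    Each record carries the bytes that address sent and received,
    so there is no longer a Src/Dst pair, only "this address".
    """
    yield {"time": flow["time"], "addr": flow["saddr"],
           "sent": flow["sbytes"], "rcvd": flow["dbytes"]}
    yield {"time": flow["time"], "addr": flow["daddr"],
           "sent": flow["dbytes"], "rcvd": flow["sbytes"]}


def subnet_loads(flows, subnet):
    """Return {bin_start: (out_bps, in_bps)} for addresses inside subnet."""
    net = ip_network(subnet)
    bins = defaultdict(lambda: [0, 0])  # bin_start -> [sent, rcvd] bytes
    for flow in flows:
        for rec in rmon_records(flow):
            if ip_address(rec["addr"]) not in net:
                continue  # drop the "left side" addresses
            start = rec["time"] - rec["time"] % BIN_SECONDS
            bins[start][0] += rec["sent"]
            bins[start][1] += rec["rcvd"]
    # Convert byte totals to average bits per second over the bin.
    return {start: (sent * 8 / BIN_SECONDS, rcvd * 8 / BIN_SECONDS)
            for start, (sent, rcvd) in sorted(bins.items())}


if __name__ == "__main__":
    sample = [
        {"time": 10, "saddr": "1.2.3.4", "daddr": "8.8.8.8",
         "sbytes": 3000, "dbytes": 60000},
        {"time": 200, "saddr": "9.9.9.9", "daddr": "1.2.3.7",
         "sbytes": 1500, "dbytes": 4500},
    ]
    print(subnet_loads(sample, "1.2.3.0/24"))
```

In this sketch, running with no effective filter (for example "0.0.0.0/0") counts every byte once as sent and once as received, so the two totals come out equal. That may be why unfiltered rmon output shows nearly identical SrcLoad and DstLoad values.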

Might seem weird, but this is more of a relational algebraic thing than
anything else, so you can't really fight it ;o)

Carter


On Apr 2, 2009, at 6:41 AM, Rodney McKee wrote:

> Carter,
>
> Finally getting time to have a closer look over these great detailed  
> instructions.
> I have a feeling from what you are saying here and in other emails I  
> REALLY should be enabling ARGUS_GENERATE_MAC_DATA on all my  
> collectors as smac etc looks to be required in several occurrences  
> when using rmon.
> I'm guessing because I don't have the mac addr my stats are looking  
> pretty much the same for inbound and outbound load.
>
> $ rasort -r /tmp/data.out -m load -s stime dur saddr srcid smac sload:16 dload:16
>       StartTime        Dur   SrcAddr       SrcId  SrcMac          SrcLoad          DstLoad
> 15:05:00.000000 300.000000   0.0.0.0   127.0.0.1          84344472.000000  84344480.000000
> 15:50:00.000000 300.000000   0.0.0.0   127.0.0.1          83736384.000000  83736384.000000
> 14:10:00.000000 300.000000   0.0.0.0   127.0.0.1          82034344.000000  82034344.000000
> 14:55:00.000000 300.000000   0.0.0.0   127.0.0.1          75430448.000000  75430440.000000
> 15:55:00.000000 300.000000   0.0.0.0   127.0.0.1          73696456.000000  73696456.000000
>
> I'm rather new to the study of flow statistics. Do you have any
> suggestions for further reading?
>
> You mentioned in an earlier email:
>
> So, do you want to do it for the whole link?  The totals seen for a  
> specific
> ethernet address every 5 minutes would be the best way to calculate  
> the metrics.
> Here is a run with a little data I just grabbed.
>
> In this instance I'm after outbound data for the entire link, I'm  
> simply running argus locally on our perimeter firewall and  
> collecting statistics on our external interface.
>
>
>
> ----- "Carter Bullard" <carter at qosient.com> wrote:
> > Hey Rodney,
> I found a bug in the code for rabins() when testing out this new  
> version
> of the clients that affects the examples that I sent earlier. Please  
> grab
> the latest code before trying out these new features!!!!
>
> >
>    ftp://qosient.com/dev/argus-3.0/argus-clients-3.0.2.beta.4.tar.gz
>
> >
> I added 95th percentile reporting to rahisto(), which is our  
> frequency distribution
> tool.  This tool calculates mean, stddev, max, min, and the median  
> (50th
> percentile), and so it was very easy to add 95th percentile to the  
> report.
>
> >
> You feed rahisto() the output of your 5 minute rabins() aggregations,
> and it will give you a little stats report of the specific variable,  
> and
> a frequency distribution of where the data falls.  If you don't know  
> what
> the range is, run it with just a small number of bins, and rahisto()  
> will start
> to show you where the data lies in its range.
>
> >
> Using the examples I used before:
>
> >
>        rabins -M rmon hard time 5m -m smac -r hourly.file -w /tmp/data.out
>
> >
> Now run rahisto() instead of rasort(), like this, to generate your
> OutBound data for the specific ether address ('sload'):
>
> >
>    rahisto -r /tmp/data.out -H sload 10 - ether src host 0:a0:c5:e1:7a:fa
>  N = 31      mean =  80407.974516  stddev =  67795.873860  max = 174059.203125  min = 172.133331
>            median =  82742.617188     95% = 173895.046875
>  Class           Interval                Freq    Rel.Freq     Cum.Freq
>      1   0.000000e+00-1.740600e+04         12    38.7097%     38.7097%
>      2   1.740600e+04-3.481200e+04          0     0.0000%     38.7097%
>      3   3.481200e+04-5.221800e+04          3     9.6774%     48.3871%
>      4   5.221800e+04-6.962400e+04          1     3.2258%     51.6129%
>      5   6.962400e+04-8.703000e+04          1     3.2258%     54.8387%
>      6   8.703000e+04-1.044360e+05          0     0.0000%     54.8387%
>      7   1.044360e+05-1.218420e+05          1     3.2258%     58.0645%
>      8   1.218420e+05-1.392480e+05          3     9.6774%     67.7419%
>      9   1.392480e+05-1.566540e+05          4    12.9032%     80.6452%
>     10   1.566540e+05-1.740600e+05          6    19.3548%    100.0000%
>
> >
> And, run rahisto() this way to generate your InBound data for the  
> specific ether address ('dload') :
>
> >
>    rahisto -r /tmp/data.out -H dload 10 - ether src host 0:a0:c5:e1:7a:fa
>  N = 31      mean = 2520065.831098  stddev = 2286667.779977  max = 5742157.000000  min = 335.946655
>            median = 2935971.500000     95% = 5711441.500000
>  Class           Interval                Freq    Rel.Freq     Cum.Freq
>      1   0.000000e+00-5.742160e+05         13    41.9355%     41.9355%
>      2   5.742160e+05-1.148432e+06          1     3.2258%     45.1613%
>      3   1.148432e+06-1.722648e+06          1     3.2258%     48.3871%
>      4   1.722648e+06-2.296864e+06          1     3.2258%     51.6129%
>      5   2.296864e+06-2.871080e+06          0     0.0000%     51.6129%
>      6   2.871080e+06-3.445296e+06          2     6.4516%     58.0645%
>      7   3.445296e+06-4.019512e+06          1     3.2258%     61.2903%
>      8   4.019512e+06-4.593728e+06          3     9.6774%     70.9677%
>      9   4.593728e+06-5.167944e+06          3     9.6774%     80.6452%
>     10   5.167944e+06-5.742160e+06          6    19.3548%    100.0000%
>
> >
>
> >
> The numbers are slightly different from the last time, because of  
> the bug in rabins().
>
> >
> So you see, the data I'm using is multi-modally distributed, and
> while very low in samples, it does suggest an SLA for two tiers, one
> above 1M bps and one below 1M bps.  You can calculate a 95th percentile
> for the two regions by adjusting the range on the histogram option
> field like this (just do dload for this example):
>
> >
> Traffic Below 1M bps
>  rahisto -r /tmp/rabins.5m.out -H dload 10:0-1M - ip and ether src host 0:a0:c5:e1:7a:fa
>  N = 14      mean = 220514.849217  stddev = 168862.596105  max = 612549.625000  min = 335.946655
>            median = 157696.734375     95% = 612549.625000
>  Class           Interval                Freq    Rel.Freq     Cum.Freq
>      1   0.000000e+00-1.000000e+05          2    14.2857%     14.2857%
>      2   1.000000e+05-2.000000e+05          7    50.0000%     64.2857%
>      3   2.000000e+05-3.000000e+05          1     7.1429%     71.4286%
>      4   3.000000e+05-4.000000e+05          2    14.2857%     85.7143%
>      5   4.000000e+05-5.000000e+05          1     7.1429%     92.8571%
>      6   5.000000e+05-6.000000e+05          0     0.0000%     92.8571%
>      7   6.000000e+05-7.000000e+05          1     7.1429%    100.0000%
>      8   7.000000e+05-8.000000e+05          0     0.0000%    100.0000%
>      9   8.000000e+05-9.000000e+05          0     0.0000%    100.0000%
>     10   9.000000e+05-1.000000e+06          0     0.0000%    100.0000%
>
> >
> Traffic Above 1M bps
>  rahisto -r /tmp/rabins.5m.out -H dload 10:1-6M - ip and ether src host 0:a0:c5:e1:7a:fa
>  N = 17      mean = 4413813.698529  stddev = 1253167.017151  max = 5742157.000000  min = 1717778.125000
>            median = 4992350.500000     95% = 5742157.000000
>  Class           Interval                Freq    Rel.Freq     Cum.Freq
>      1   1.000000e+06-1.500000e+06          0     0.0000%      0.0000%
>      2   1.500000e+06-2.000000e+06          2    11.7647%     11.7647%
>      3   2.000000e+06-2.500000e+06          0     0.0000%     11.7647%
>      4   2.500000e+06-3.000000e+06          1     5.8824%     17.6471%
>      5   3.000000e+06-3.500000e+06          1     5.8824%     23.5294%
>      6   3.500000e+06-4.000000e+06          1     5.8824%     29.4118%
>      7   4.000000e+06-4.500000e+06          1     5.8824%     35.2941%
>      8   4.500000e+06-5.000000e+06          4    23.5294%     58.8235%
>      9   5.000000e+06-5.500000e+06          3    17.6471%     76.4706%
>     10   5.500000e+06-6.000000e+06          4    23.5294%    100.0000%
>
> >
> Hope this is helpful,
>
> >
> Carter
>
> >
> On Mar 23, 2009, at 3:16 PM, Rodney McKee wrote:
>
> > Carter,
> >
> > The value I'm after is based on 5 minute samples of user uploads
> > (inbound traffic) that are sorted highest to lowest; the value at
> > the 95th percentile point is then used for our volume calculation.
> > Is there a way to pull the 5 minute samples like "ragraph -M 5m"
> > using the text "ra" tools?
> > I just see that rrdgraph is able to do it, but I'm not sure yet how
> > to use it.
> >
> > More info on the billing scheme here:
> > http://en.wikipedia.org/wiki/Burstable_billing
> >
> >
>
> -- 
>
>
>
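
For the 95th percentile itself, here is a minimal Python sketch of the calculation described in the quoted thread: take the 5 minute load samples, sort them, and read off the value at the 95% point. It uses the nearest-rank method, which is an assumption. The thread does not say which method rahisto() uses, although nearest-rank does agree with the quoted reports where the 95% value equals the max for N = 14 and N = 17. The function names are hypothetical, and whether rahisto() treats the ends of a range such as 0-1M as inclusive is also not stated, so the half-open range below is a guess.

```python
import math


def percentile_nearest_rank(samples, pct=95):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def in_range(samples, low, high):
    """Keep samples with low <= value < high, e.g. one tier of a two-tier SLA."""
    return [s for s in samples if low <= s < high]


if __name__ == "__main__":
    # Made-up 5 minute load samples in bits per second, for illustration only.
    loads = [120_000, 250_000, 90_000, 4_800_000, 5_200_000, 610_000]
    print(percentile_nearest_rank(loads))
    print(percentile_nearest_rank(in_range(loads, 0, 1_000_000)))
```

With 31 samples this picks the 30th value in ascending order, so a single extreme 5 minute bin does not set the billing figure.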

Carter Bullard
CEO/President
QoSient, LLC
150 E 57th Street Suite 12D
New York, New York  10022

+1 212 588-9133 Phone
+1 212 588-9134 Fax


