Why is rabins() "ramping up" counts?

Carter Bullard carter at qosient.com
Wed Jul 31 23:23:55 EDT 2013


Matt,

The reason you are having problems is the " -B 5s " option.

Don't use it when reading data from a file.  It's not intended
for this use, and while it won't hurt anything, it represents
a lack of understanding of the option.

When you use it with a live argus data source, it must be
greater than the ARGUS_FLOW_STATUS_INTERVAL to have its effect.

Let's try to understand what this option is doing.  The " -B secs "
option is saying how long you have to hold a bin in memory
in order to guarantee that all the data arrives for that time
bin.

With these options " -M time 5s -B 5s ", you will process
only 2 bins at a time: the current bin, whose range is
[now - (now + 5s)] and the hold buffer, which is [(now - 5s) - now].
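
As a rough illustration of the two ranges above, here is a small Python sketch that lists the bins held in memory for a given bin size and hold time. It is a simplified model, not rabins() code: the function name bins_in_memory is hypothetical, and it assumes the hold time is rounded down to a whole number of bins.

```python
def bins_in_memory(now, bin_size, hold):
    """Return (start, end) ranges, oldest first, for the hold bins plus the current bin.

    Assumption: the hold time covers hold // bin_size whole bins before 'now'.
    """
    hold_bins = hold // bin_size
    return [(now + k * bin_size, now + (k + 1) * bin_size)
            for k in range(-hold_bins, 1)]


# With " -M time 5s -B 5s " and now = 0, two bins are in memory:
# the hold buffer [-5, 0] and the current bin [0, 5].
print(bins_in_memory(0, 5, 5))
```

With hold=65 the same model keeps 13 hold bins plus the current one.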

With an ARGUS_FLOW_STATUS_INTERVAL=60s, you will receive data
whose time range could be [(now - 60s) - now].  rabins(), when
carving up the record into 5 second chunks, will generate, at
most 12 records whose ranges are:
   1. [((now - 60s)+0*5s) - ((now - 60s)+1*5s)]
   2. [((now - 60s)+1*5s) - ((now - 60s)+2*5s)]
   3. [((now - 60s)+2*5s) - ((now - 60s)+3*5s)]
…
  12. [((now - 60s)+11*5s) - ((now - 60s)+12*5s)]

With a " -B 5s " option, only the 12th record will have a slot
to aggregate into.  So you will be throwing away 11/12 of
all your data.
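
The arithmetic can be checked with a short Python sketch. This is only a model of the reasoning above, not the rabins() implementation; the helper names split_into_bins and retained are made up for illustration, and "now" is an arbitrary reference time in seconds.

```python
def split_into_bins(start, end, bin_size):
    """Carve the interval [start, end] into consecutive bin_size chunks."""
    chunks = []
    t = start
    while t < end:
        chunks.append((t, min(t + bin_size, end)))
        t += bin_size
    return chunks


def retained(chunks, now, hold):
    """Keep only chunks that start no earlier than the oldest bin still held."""
    oldest = now - hold
    return [c for c in chunks if c[0] >= oldest]


now = 1000  # arbitrary reference time
chunks = split_into_bins(now - 60, now, 5)   # one 60s status record
kept_5 = retained(chunks, now, hold=5)       # " -B 5s "
kept_65 = retained(chunks, now, hold=65)     # " -B 65s "

print(len(chunks), len(kept_5), len(kept_65))
```

In this model the 60 second record yields 12 chunks, only the last of which survives a 5 second hold, while a 65 second hold keeps all 12.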

Increase your " -B secs " option to at least
   (max(ARGUS_FLOW_STATUS_INTERVAL) + someDelay)

So I would use at a minimum " -B 65s ".
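
For several sensors with different status intervals, the rule above can be written as a one-line calculation. A minimal Python sketch, assuming the intervals are known in seconds and that 5 seconds is an acceptable value for someDelay (the function name minimum_hold_seconds is hypothetical):

```python
def minimum_hold_seconds(status_intervals, delay):
    """Smallest hold time: the longest status interval plus some delay."""
    return max(status_intervals) + delay


# A single source with ARGUS_FLOW_STATUS_INTERVAL=60s and a 5s delay.
print(minimum_hold_seconds([60], 5))
```

The result is the number of seconds to pass to the " -B secs " option.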


OK, let's try to keep the emails short in the future,
if it's feasible to do so.  One topic at a time…

Carter



On Jul 30, 2013, at 4:19 PM, Matt Brown <matthewbrown at gmail.com> wrote:

> Hello all,
> 
> 
> Does rabins() "ramp up to normal" over N bins?
> 
> 
> I'd like to start working on calculating moving averages to help
> identify performance outliers (like "spikes" in `loss` or `rate`).
> 
> For this purpose, I believe grabbing data from the output of rabins()
> would serve me well.
> 
> 
> For example, if I take historic argus data and run it through the
> following rabins() invocation, I see some odd things that can only be
> noted as "ramping up":
> 
> 
> for f in $(ls -m1 ~/working/*) ; do (
> rabins -M hard time 5s -B 5s -r $f -m saddr -s ltime rate \
>   - port 5432 and src host 192.168.10.22
> ) >> ~/aggregated_rate ; done
> 
> 
> The first few and the last few resulting records per file seem to not
> be reporting correctly.
> 
> For example, these dudes at 192.168.10.22 utilize a postgres DB
> replication package called bucardo.  During idle time, bucardo sends
> heartbeat info, and appears to be holding at about 47-49 packets per
> second (rate).
> 
> However, I am seeing the following in my rabins() resultant data (note
> the presence of the field label header, which marks the start of a new
> rabins() run from the above for loop):
> 
> 2013-07-25 00:59:25.000000    47.400000
> 2013-07-25 00:59:30.000000    47.400000
> 2013-07-25 00:59:35.000000    48.000000
> 2013-07-25 00:59:40.000000    48.000000
> 2013-07-25 00:59:45.000000    40.600000
> 2013-07-25 00:59:50.000000    21.400000
> 2013-07-25 00:59:55.000000    15.400000
> 2013-07-25 01:00:00.000000     5.000000
> 2013-07-25 01:00:05.000000     0.000000
>                 LastTime         Rate
> 2013-07-25 01:00:05.000000     0.200000
> 2013-07-25 01:00:10.000000     0.600000
> 2013-07-25 01:00:15.000000     0.400000
> 2013-07-25 01:00:35.000000     0.400000
> 2013-07-25 01:00:40.000000     1.000000
> 2013-07-25 01:00:45.000000     6.200000
> 2013-07-25 01:00:50.000000    25.400000
> 2013-07-25 01:00:55.000000    32.400000
> 2013-07-25 01:01:00.000000    41.800000
> 2013-07-25 01:01:05.000000    47.600000
> 2013-07-25 01:01:10.000000    48.600000
> 
> [The source files were written with rastream().]
> 
> 
> It is well worth noting that if I start an rabins() reading from the
> argus() socket with the following invocation, the same sort of thing
> occurs:
> # rabins -M hard time 5s -B 5s -S 127.0.0.1:561 -m saddr -s ltime rate
> - port 5432 and src host 192.168.10.22
>                 LastTime         Rate
> 2013-07-30 15:42:55.000000     1.400000
> 2013-07-30 15:43:00.000000     0.600000
> 2013-07-30 15:43:05.000000    33.800000
> 2013-07-30 15:43:10.000000    47.400000
> 2013-07-30 15:43:15.000000    58.600000
> 2013-07-30 15:43:20.000000    87.600000
> 2013-07-30 15:43:25.000000    96.200000
> 2013-07-30 15:43:30.000000    96.000000
> 2013-07-30 15:43:35.000000   134.200000
> 2013-07-30 15:43:40.000000   137.200000
> 2013-07-30 15:43:45.000000   137.400000
> 2013-07-30 15:43:50.000000   136.600000
> 2013-07-30 15:43:55.000000   139.800000
> 2013-07-30 15:44:00.000000   136.200000 <-- `rate` averages about here
> going forward
> 
> 
> It doesn't matter which field I use; the same thing occurs:
> # rabins -M hard time 5s -B 5s -S 127.0.0.1:561 -m saddr -s ltime load
> - port 5432 and src host 192.168.10.22
>                 LastTime     Load
> 2013-07-30 15:50:15.000000 1461.19*
> 2013-07-30 15:50:20.000000 42524.7*
> 2013-07-30 15:50:25.000000 54329.5*
> 2013-07-30 15:50:30.000000 55244.8*
> 2013-07-30 15:50:35.000000 90164.8*
> 2013-07-30 15:50:40.000000 92539.1*
> 2013-07-30 15:50:45.000000 94827.1*
> 2013-07-30 15:50:50.000000 95292.7*
> 2013-07-30 15:50:55.000000 96286.3*
> 2013-07-30 15:51:00.000000 94857.6*
> 2013-07-30 15:51:05.000000 130699.*
> 2013-07-30 15:51:10.000000 149979.*
> 2013-07-30 15:51:15.000000 149320.*
> [killed]
> # rabins -M hard time 5s -B 5s -S 127.0.0.1:561 -m saddr -s
> ltime load - port 5432 and src host 192.168.2.22
>                 LastTime     Load
> 2013-07-30 15:52:35.000000 33894.4*
> 2013-07-30 15:52:40.000000 3134.84*
> 2013-07-30 15:52:45.000000 39262.4*
> 2013-07-30 15:52:50.000000 40024.0*
> 2013-07-30 15:52:55.000000 41188.7*
> 2013-07-30 15:53:00.000000 40259.2*
> 2013-07-30 15:53:05.000000 75057.6*
> 2013-07-30 15:53:10.000000 97160.0*
> 2013-07-30 15:53:15.000000 106520.*
> 2013-07-30 15:53:20.000000 138504.*
> 2013-07-30 15:53:25.000000 153835.*
> 2013-07-30 15:53:30.000000 152892.*
> 2013-07-30 15:53:35.000000 154017.* <-- `load` averages here going forward
> 
> This happens whether or not I perform field aggregation (`-m saddr`).
> 
> 
> Why is this happening?
> 
> 
> This seems like it will really screw up calculating moving averages
> (figuring out spikes, etc.) from the rabins() resultant data.
> 
> 
> Thanks!
> 
> Matt
> 
