Why is rabins() "ramping up" counts?

Tue Jul 30 16:19:41 EDT 2013

Hello all,

Does rabins() "ramp up to normal" over N bins?

I'd like to start working on calculating moving averages to help
identify performance outliers (like "spikes" in `loss` or `rate`).

For this purpose, I believe grabbing data from the output of rabins()
would serve me well.
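
As a rough sketch of the post-processing in mind here, assuming the two-column `ltime rate` output shown further down: the names `parse_bins` and `flag_spikes`, the window size, and the threshold factor are all illustrative assumptions, not anything provided by argus.

```python
from collections import deque


def parse_bins(lines):
    """Yield (timestamp, value) pairs from 'date time value' lines.

    Header rows such as 'LastTime Rate' and blank lines are skipped.
    A trailing '*' on the value (as seen in the load output) is stripped.
    """
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue
        try:
            value = float(parts[2].rstrip("*"))
        except ValueError:
            continue
        yield parts[0] + " " + parts[1], value


def flag_spikes(samples, window=6, factor=1.5):
    """Flag values that exceed `factor` times the mean of the previous `window` values."""
    recent = deque(maxlen=window)
    flagged = []
    for stamp, value in samples:
        if len(recent) == window:
            mean = sum(recent) / window
            if mean > 0 and value > factor * mean:
                flagged.append((stamp, value, mean))
        recent.append(value)
    return flagged
```

Calling `flag_spikes(parse_bins(open(path)))` on a file of rabins() output would return the bins that stand out against the trailing average. Any bins that under-report at the start of a run would pull that average down and distort the comparison.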

For example, if I take historical argus data and run it through the
following rabins() invocation, I see some odd behavior that I can only
describe as "ramping up":

for f in ~/working/* ; do
  rabins -M hard time 5s -B 5s -r "$f" -m saddr -s ltime rate \
    - port 5432 and src host 192.168.10.22
done >> ~/aggregated_rate

The first few and the last few records from each file do not seem to
be reported correctly.

For example, these dudes at 192.168.10.22 use a postgres DB
replication package called bucardo.  During idle time, bucardo sends
heartbeat info, and the traffic holds at about 47-49 packets per
second (rate).

However, I am seeing the following in the rabins() output (note the
presence of the field label header, which marks the start of a new
rabins() run from the for loop above):

2013-07-25 00:59:25.000000    47.400000
2013-07-25 00:59:30.000000    47.400000
2013-07-25 00:59:35.000000    48.000000
2013-07-25 00:59:40.000000    48.000000
2013-07-25 00:59:45.000000    40.600000
2013-07-25 00:59:50.000000    21.400000
2013-07-25 00:59:55.000000    15.400000
2013-07-25 01:00:00.000000     5.000000
2013-07-25 01:00:05.000000     0.000000
                 LastTime         Rate
2013-07-25 01:00:05.000000     0.200000
2013-07-25 01:00:10.000000     0.600000
2013-07-25 01:00:15.000000     0.400000
2013-07-25 01:00:35.000000     0.400000
2013-07-25 01:00:40.000000     1.000000
2013-07-25 01:00:45.000000     6.200000
2013-07-25 01:00:50.000000    25.400000
2013-07-25 01:00:55.000000    32.400000
2013-07-25 01:01:00.000000    41.800000
2013-07-25 01:01:05.000000    47.600000
2013-07-25 01:01:10.000000    48.600000

[The source files were written with rastream().]
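
To look at the edge bins of each run separately, the concatenated output can be split wherever a field label header appears. This is a minimal sketch that assumes the header row starts with `LastTime`, as in the sample above; `split_runs` is a hypothetical helper name.

```python
def split_runs(lines):
    """Split concatenated rabins output into one list of (timestamp, value) per run.

    A new run starts at every header row whose first field is 'LastTime'.
    Data seen before any header is collected into a run of its own.
    """
    runs = []
    current = None
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "LastTime":
            current = []
            runs.append(current)
            continue
        if len(parts) < 3:
            continue
        if current is None:
            current = []
            runs.append(current)
        stamp = parts[0] + " " + parts[1]
        current.append((stamp, float(parts[2].rstrip("*"))))
    return runs
```

Printing the first and last few entries of each returned run makes it easy to compare the edge bins of one file against the steady-state values in the middle.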

It is well worth noting that if I start an rabins() reading from the
argus() socket with the following invocation, the same sort of thing
occurs:
# rabins -M hard time 5s -B 5s -S 127.0.0.1:561 -m saddr -s ltime rate \
    - port 5432 and src host 192.168.10.22
                 LastTime         Rate
2013-07-30 15:42:55.000000     1.400000
2013-07-30 15:43:00.000000     0.600000
2013-07-30 15:43:05.000000    33.800000
2013-07-30 15:43:10.000000    47.400000
2013-07-30 15:43:15.000000    58.600000
2013-07-30 15:43:20.000000    87.600000
2013-07-30 15:43:25.000000    96.200000
2013-07-30 15:43:30.000000    96.000000
2013-07-30 15:43:35.000000   134.200000
2013-07-30 15:43:40.000000   137.200000
2013-07-30 15:43:45.000000   137.400000
2013-07-30 15:43:50.000000   136.600000
2013-07-30 15:43:55.000000   139.800000
2013-07-30 15:44:00.000000   136.200000 <-- `rate` averages about here going forward

It doesn't matter which field I use; the same thing occurs:
# rabins -M hard time 5s -B 5s -S 127.0.0.1:561 -m saddr -s ltime load \
    - port 5432 and src host 192.168.10.22
                 LastTime     Load
2013-07-30 15:50:15.000000 1461.19*
2013-07-30 15:50:20.000000 42524.7*
2013-07-30 15:50:25.000000 54329.5*
2013-07-30 15:50:30.000000 55244.8*
2013-07-30 15:50:35.000000 90164.8*
2013-07-30 15:50:40.000000 92539.1*
2013-07-30 15:50:45.000000 94827.1*
2013-07-30 15:50:50.000000 95292.7*
2013-07-30 15:50:55.000000 96286.3*
2013-07-30 15:51:00.000000 94857.6*
2013-07-30 15:51:05.000000 130699.*
2013-07-30 15:51:10.000000 149979.*
2013-07-30 15:51:15.000000 149320.*
[killed]
# rabins -M hard time 5s -B 5s -S 127.0.0.1:561 -m saddr -s ltime load \
    - port 5432 and src host 192.168.2.22
                 LastTime     Load
2013-07-30 15:52:35.000000 33894.4*
2013-07-30 15:52:40.000000 3134.84*
2013-07-30 15:52:45.000000 39262.4*
2013-07-30 15:52:50.000000 40024.0*
2013-07-30 15:52:55.000000 41188.7*
2013-07-30 15:53:00.000000 40259.2*
2013-07-30 15:53:05.000000 75057.6*
2013-07-30 15:53:10.000000 97160.0*
2013-07-30 15:53:15.000000 106520.*
2013-07-30 15:53:20.000000 138504.*
2013-07-30 15:53:25.000000 153835.*
2013-07-30 15:53:30.000000 152892.*
2013-07-30 15:53:35.000000 154017.* <-- `load` averages here going forward

This happens whether or not I perform field aggregation (`-m saddr`).

Why is this happening?

This seems like it will really screw up calculating moving averages
(figuring out spikes, etc.) from the rabins() resultant data.
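
If the edge bins do turn out to be unreliable, one blunt workaround would be to drop a few bins from each end of every run before averaging. The sketch below assumes that approach; `trim_edges` and `n_edge` are hypothetical names, and since the ramp in the file-based sample spans several 5-second bins, a suitable `n_edge` would have to be chosen by inspecting the data.

```python
def trim_edges(run, n_edge):
    """Return the run without its first and last n_edge bins.

    Returns an empty list when the run is too short to have a middle.
    """
    if n_edge <= 0:
        return list(run)
    if len(run) <= 2 * n_edge:
        return []
    return run[n_edge:-n_edge]


def mean(values):
    """Arithmetic mean, or None for an empty sequence."""
    return sum(values) / len(values) if values else None
```

This only hides the symptom and throws away real bins along with any bad ones, so it is no substitute for understanding why the counts ramp.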

Thanks!

Matt