Why is rabins() "ramping up" counts?
Matt Brown
matthewbrown at gmail.com
Tue Jul 30 16:19:41 EDT 2013
Hello all,
Does rabins() "ramp up to normal" over N bins?
I'd like to start working on calculating moving averages to help
identify performance outliers (like "spikes" in `loss` or `rate`).
For this purpose, I believe grabbing data from the output of rabins()
would serve me well.
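To make the goal concrete, here is a minimal sketch of the kind of moving-average check I have in mind. It is not part of argus; the function name `flag_spikes` and the `window` and `threshold` parameters are hypothetical, and the rule (flag a bin when it sits more than a few standard deviations away from the mean of the preceding bins) is just one plausible choice.

```python
from collections import deque
from statistics import mean, pstdev


def flag_spikes(values, window=5, threshold=3.0):
    """Yield (index, value, moving_average, is_spike) for each value.

    A value is flagged when it differs from the mean of the previous
    `window` values by more than `threshold` standard deviations.
    Until `window` values have been seen, nothing is flagged.
    """
    recent = deque(maxlen=window)
    for i, v in enumerate(values):
        if len(recent) == window:
            avg = mean(recent)
            sd = pstdev(recent)
            # With zero spread there is no scale to compare against,
            # so this sketch simply does not flag anything.
            is_spike = sd > 0 and abs(v - avg) > threshold * sd
        else:
            avg, is_spike = None, False
        yield i, v, avg, is_spike
        recent.append(v)


if __name__ == "__main__":
    rates = [47.4, 47.4, 48.0, 48.0, 47.6, 137.0]
    for i, v, avg, spike in flag_spikes(rates):
        print(i, v, avg, "SPIKE" if spike else "")
```

A check like this treats every bin as equally trustworthy, which is why bins that under-report at the start or end of a run matter.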
For example, if I run historic argus data through the following
rabins() invocation, I see odd behavior that I can only describe as
"ramping up":
# Run rabins() once per source file and append all output to one file.
for f in ~/working/* ; do
rabins -M hard time 5s -B 5s -r "$f" -m saddr -s ltime rate \
- port 5432 and src host 192.168.10.22
done >> ~/aggregated_rate
The first few and the last few records from each file do not seem to
be reported correctly.
For example, these dudes at 192.168.10.22 utilize a postgres DB
replication package called bucardo. During idle time, bucardo sends
heartbeat info, and appears to be holding at about 47-49 packets per
second (rate).
However, I am seeing the following in my rabins() output (note that
the field label header marks the start of a new rabins() run from the
for loop above):
2013-07-25 00:59:25.000000 47.400000
2013-07-25 00:59:30.000000 47.400000
2013-07-25 00:59:35.000000 48.000000
2013-07-25 00:59:40.000000 48.000000
2013-07-25 00:59:45.000000 40.600000
2013-07-25 00:59:50.000000 21.400000
2013-07-25 00:59:55.000000 15.400000
2013-07-25 01:00:00.000000 5.000000
2013-07-25 01:00:05.000000 0.000000
LastTime Rate
2013-07-25 01:00:05.000000 0.200000
2013-07-25 01:00:10.000000 0.600000
2013-07-25 01:00:15.000000 0.400000
2013-07-25 01:00:35.000000 0.400000
2013-07-25 01:00:40.000000 1.000000
2013-07-25 01:00:45.000000 6.200000
2013-07-25 01:00:50.000000 25.400000
2013-07-25 01:00:55.000000 32.400000
2013-07-25 01:01:00.000000 41.800000
2013-07-25 01:01:05.000000 47.600000
2013-07-25 01:01:10.000000 48.600000
[The source files were written with rastream().]
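Since each header line marks a new rabins() run, the aggregated file can be split back into per-run series before any averaging. Below is a minimal sketch, assuming the two-column `ltime rate` layout shown above (date, time, value) and a header whose first field is `LastTime`. The function name `split_runs` is hypothetical, and the sketch does not handle the `*`-suffixed values that show up in `load` output.

```python
from datetime import datetime


def split_runs(lines, header_prefix="LastTime"):
    """Split concatenated rabins() text output into one list per run.

    Each run is a list of (timestamp, value) tuples. A line whose first
    field equals `header_prefix` starts a new run.
    """
    runs = []
    current = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == header_prefix:
            current = []
            runs.append(current)
            continue
        if len(fields) < 3:
            continue  # not a "date time value" record
        if current is None:
            # Data seen before any header: treat it as its own run.
            current = []
            runs.append(current)
        ts = datetime.strptime(fields[0] + " " + fields[1],
                               "%Y-%m-%d %H:%M:%S.%f")
        current.append((ts, float(fields[2])))
    return runs


if __name__ == "__main__":
    sample = [
        "2013-07-25 01:00:00.000000 5.000000",
        "2013-07-25 01:00:05.000000 0.000000",
        "LastTime Rate",
        "2013-07-25 01:00:05.000000 0.200000",
        "2013-07-25 01:00:10.000000 0.600000",
    ]
    for n, run in enumerate(split_runs(sample)):
        print(n, [(t.isoformat(), v) for t, v in run])
```

Splitting this way makes the overlap visible: the 01:00:05 bin appears at the end of one run and again at the start of the next.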
It is worth noting that if I start rabins() reading from the argus()
socket with the following invocation, the same sort of thing occurs:
# rabins -M hard time 5s -B 5s -S 127.0.0.1:561 -m saddr -s ltime rate
- port 5432 and src host 192.168.10.22
LastTime Rate
2013-07-30 15:42:55.000000 1.400000
2013-07-30 15:43:00.000000 0.600000
2013-07-30 15:43:05.000000 33.800000
2013-07-30 15:43:10.000000 47.400000
2013-07-30 15:43:15.000000 58.600000
2013-07-30 15:43:20.000000 87.600000
2013-07-30 15:43:25.000000 96.200000
2013-07-30 15:43:30.000000 96.000000
2013-07-30 15:43:35.000000 134.200000
2013-07-30 15:43:40.000000 137.200000
2013-07-30 15:43:45.000000 137.400000
2013-07-30 15:43:50.000000 136.600000
2013-07-30 15:43:55.000000 139.800000
2013-07-30 15:44:00.000000 136.200000 <-- `rate` averages about here going forward
It doesn't matter which field I select; the same thing occurs:
# rabins -M hard time 5s -B 5s -S 127.0.0.1:561 -m saddr -s ltime load
- port 5432 and src host 192.168.10.22
LastTime Load
2013-07-30 15:50:15.000000 1461.19*
2013-07-30 15:50:20.000000 42524.7*
2013-07-30 15:50:25.000000 54329.5*
2013-07-30 15:50:30.000000 55244.8*
2013-07-30 15:50:35.000000 90164.8*
2013-07-30 15:50:40.000000 92539.1*
2013-07-30 15:50:45.000000 94827.1*
2013-07-30 15:50:50.000000 95292.7*
2013-07-30 15:50:55.000000 96286.3*
2013-07-30 15:51:00.000000 94857.6*
2013-07-30 15:51:05.000000 130699.*
2013-07-30 15:51:10.000000 149979.*
2013-07-30 15:51:15.000000 149320.*
[killed]
# rabins -M hard time 5s -B 5s -S 127.0.0.1:561 -m saddr -s
ltime load - port 5432 and src host 192.168.2.22
LastTime Load
2013-07-30 15:52:35.000000 33894.4*
2013-07-30 15:52:40.000000 3134.84*
2013-07-30 15:52:45.000000 39262.4*
2013-07-30 15:52:50.000000 40024.0*
2013-07-30 15:52:55.000000 41188.7*
2013-07-30 15:53:00.000000 40259.2*
2013-07-30 15:53:05.000000 75057.6*
2013-07-30 15:53:10.000000 97160.0*
2013-07-30 15:53:15.000000 106520.*
2013-07-30 15:53:20.000000 138504.*
2013-07-30 15:53:25.000000 153835.*
2013-07-30 15:53:30.000000 152892.*
2013-07-30 15:53:35.000000 154017.* <-- `load` averages here going forward
This happens whether or not I perform field aggregation (`-m saddr`).
Why is this happening?
This seems like it will really screw up calculating moving averages
(figuring out spikes, etc.) from the rabins() resultant data.
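Until I understand the cause, the only stopgap I can think of is to drop the edge bins of each run before averaging. Here is a minimal sketch of that idea. The name `trim_edges` and the `skip` count are hypothetical; `skip` would have to be picked by looking at the data (the ramps above last roughly a dozen 5s bins), and this throws away real data at every file boundary, so it is a workaround and not an explanation.

```python
def trim_edges(run, skip=12):
    """Return `run` without its first and last `skip` records.

    `run` is any sequence of per-bin records. If the run is too short
    to have anything left after trimming, an empty list is returned.
    """
    if skip <= 0:
        return list(run)
    return list(run[skip:-skip])


if __name__ == "__main__":
    # Rates from one run above: a ramp up, then the steady level.
    rates = [0.2, 0.6, 0.4, 0.4, 1.0, 6.2, 25.4, 32.4, 41.8, 47.6, 48.6]
    print(trim_edges(rates, skip=2))
```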
Thanks!
Matt