new graph of the week

Carter Bullard carter at qosient.com
Mon Oct 16 19:35:59 EDT 2006


Gentle people,
I put up an aggregation study, using ragraph(), to show that when you
take a bunch of flow data and aggregate it with varying time
resolutions, you seem to get the right output.

The essence of the graphs is to show that the metric distribution logic
doesn't cause artifacts such as:

    1.  aliasing drift, where rounding errors are propagated through
        the time series data, so that at the end you get weird peaks.

    2.  rounding errors, where you get peaking and/or data dropout
        with some frequency-specific conversions.

    3.  stack shifting, where breaking up data shifts chunks of the
        packet stats into the adjacent bin.  This can cause abrupt
        cyclic/periodic artifacts.

The data actually does have interesting behavior, as it's about 1750
flows between two machines with a lot of overlap, where one or two ssh
flows are ongoing and another cranks up.  So the peaks that emerge when
you go to 5-60 second aggregation are real cyclic peaks.  While this is
an aliasing artifact, it's not an error.  Most of the flows last about
3.6 seconds and move 13-15 packets and about 1-2k worth of data, and the
flows themselves arrive roughly every 3.4 seconds in one period and
about every 4.8 seconds in another, so it has some interesting behavior.
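
To see why cyclic peaks at coarser bins can be real rather than a
processing error, here is a minimal sketch.  It assumes strictly
periodic flow starts every 3.4 seconds (the real data is only roughly
periodic) and counts starts per 5 second bin.  The function name and
the millisecond units are my own choices for illustration, not anything
from argus.

```python
from collections import Counter

def arrivals_per_bin(interval_ms, bin_ms, span_ms):
    """Count flow starts per bin, assuming strictly periodic arrivals.

    Times are integer milliseconds so bin edges are exact.
    """
    counts = Counter(t // bin_ms for t in range(0, span_ms, interval_ms))
    return [counts.get(b, 0) for b in range(span_ms // bin_ms)]

# 3.4 s arrivals, 5 s bins, 85 s span (the point where the two realign)
counts = arrivals_per_bin(3400, 5000, 85000)
print(counts)
```

Under these assumptions the per-bin count mostly alternates between 2
and 1 starts, which is a cyclic pattern that comes purely from the
arrival period beating against the bin size.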

The metric distribution strategy is very simple.  The assumption is that
the data is uniformly distributed over the entire flow duration.  This
is of course incorrect, but usable.  We also propagate packets as
integers, so that we don't generate artificial smoothing by counting
fractions of packets.
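
As a rough illustration of the idea (not the actual argus code), the
sketch below spreads a flow's packet count uniformly over its duration
and hands out whole packets per bin.  It tracks the cumulative number
of packets owed at each bin edge, which is one way, assumed here, to
keep counts integral without letting rounding error accumulate.  The
function name and the millisecond units are hypothetical.

```python
def distribute_packets(start_ms, end_ms, packets, bin_ms):
    """Spread `packets` uniformly over [start_ms, end_ms) into time bins.

    Returns {bin_index: count}; counts are integers that sum to `packets`.
    """
    if end_ms <= start_ms:
        # zero-length flow: everything lands in the bin holding the start
        return {start_ms // bin_ms: packets}
    duration = end_ms - start_ms
    first = start_ms // bin_ms
    last = (end_ms - 1) // bin_ms
    out = {}
    assigned = 0
    for b in range(first, last + 1):
        bin_end = min((b + 1) * bin_ms, end_ms)
        # whole packets owed by the end of this bin under the uniform model
        owed = packets * (bin_end - start_ms) // duration
        out[b] = owed - assigned
        assigned = owed
    return out

# a 3.6 second flow carrying 14 packets, split into 1 second bins
print(distribute_packets(1200, 4800, 14, 1000))
```

Because each bin gets the difference between two cumulative totals, the
sum over all bins equals the original packet count at any bin size,
which is the property the graphs are meant to check.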

While we have the data to do a better job (the syn/synack/ack volley
should be packed into the first milliseconds, and we have the largest
idle chunk in both directions, so we know where to insert dead space),
this is a good model, and less complicated.  Anyway, I'll work on a
better distribution strategy as we go along.

I'm sure there are other artifacts that can occur that we're dealing
with; these are just a few that come to mind.  If there are any specific
issues that you would like tested, just holler, and if there are
graphing strategies that you would like to see, any suggestion is
welcome.

Carter



