[flow-tools] re: has anyone tried this before / know how possible it is?
Andrew Fort
afort@choqolat.org
Fri, 31 Jan 2003 20:13:46 +1100
Craig A. Finseth wrote, replying to Will Lotto:
> ...
> >This way either one of the collectors can fail, and the
> >second will still collect streams, then the merge process can
> >happen at any time.
>
> This would be a good thing.
> ...
>
> While your overall approach is reasonable, consider whether:
>
> 1) you are really improving the overall reliability, and
> 2) does it matter?
>
> For (1), the Cisco can, of course, send to two hosts as easily as one.
> However, you have added a new step: merging the two sets of flow files
> and eliminating the duplicates. How reliable is that process? How
> maintainable is it? After all, you've introduced a whole new set of
> ways for it to fail.
>
> For (2), determine the cost if you happen to miss a few flows, or even
> a day or two's worth in case a machine breaks. On the one hand, it
> could be part of a service guarantee, such that you have to rebate
> money (or equivalent) to a customer if it goes down. On the other, it
> could just be a "nice to look at, but if we miss a few minutes' data,
> it won't really matter."
Wise words.
If you're worried about losing flow data and aren't prepared for when it
happens, don't bill customers using it; missing a few flow exports is the
least of your problems :). Weigh the flexibility it offers you against
the complexity of collecting and aggregating that data, because as you
grow, your datasets get enormous; rough figures we see are about 6 Mbit/sec
of flow data in a POP carrying about 100 Mbit/sec of ingress traffic, given
fairly heterogeneous flows. When you're hit by a flow-happy attack (random
addresses, ports), expect a lot more. Aggregate your data on the collector
before exporting it to the billing system.
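
To make the "aggregate on the collector" point concrete, here is a rough
sketch in Python. It is not flow-tools code; the Flow record and its field
names (customer, peer_as, octets and so on) are made up for illustration,
and it assumes each flow has already been mapped to a customer somehow.
The idea is just to collapse many per-flow records into one byte total per
billing key before anything is shipped to the billing system.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Flow(NamedTuple):
    """Hypothetical, simplified flow record (not the flow-tools format)."""
    customer: str   # assumed: flow already mapped to a customer
    peer_as: int    # AS number of the far side of the flow
    src_addr: str
    dst_addr: str
    octets: int     # bytes carried by this flow


def aggregate_flows(flows: Iterable[Flow]) -> Dict[Tuple[str, int], int]:
    """Sum octets per (customer, peer_as), dropping per-flow detail."""
    totals: Dict[Tuple[str, int], int] = defaultdict(int)
    for flow in flows:
        totals[(flow.customer, flow.peer_as)] += flow.octets
    return dict(totals)
```

However many distinct addresses and ports an attack throws at you, the
output has at most one row per (customer, peer AS) pair, which is what
keeps the export to billing small.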
NetFlow data is great, but when it comes to billing, other options may be
simpler to implement and easier to understand, and your customers may find
their results easier to agree with. If your model doesn't require the
flexibility (for example, different rates for peer and provider traffic
based on peer AS), consider other options.
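
For what it's worth, the kind of flexibility I mean looks roughly like the
Python sketch below. Everything in it is an assumption for illustration:
the AS numbers are private-use placeholders, the rates are invented, and
the input is assumed to be byte totals per peer AS that were already
aggregated on the collector.

```python
from typing import Dict

# Hypothetical set of ASes treated as peers; anything else counts as
# provider (transit) traffic. These are private-use AS numbers.
PEER_ASES = {64512, 64513}

# Invented rates, in cents per gigabyte (10**9 bytes).
PEER_RATE_CENTS_PER_GB = 5
PROVIDER_RATE_CENTS_PER_GB = 20


def charge_cents(octets_by_peer_as: Dict[int, int]) -> int:
    """Price per-AS byte totals, rounding down to whole cents."""
    weighted = 0
    for peer_as, octets in octets_by_peer_as.items():
        if peer_as in PEER_ASES:
            rate = PEER_RATE_CENTS_PER_GB
        else:
            rate = PROVIDER_RATE_CENTS_PER_GB
        weighted += octets * rate
    return weighted // 10**9
```

If all you need is a single rate on total bytes, none of this per-AS
detail is required, which is the case where the simpler options win.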
-afort