Some questions on architecture basics for argus collection

Fri Aug 24 11:18:10 EDT 2012

A few questions about how to architect a relatively simple argus
deployment. Many of the answers will likely depend on the underlying
hardware and the traffic volumes being processed. Let's assume that
each 'path' being monitored is handling 1 Gb/s of traffic (for long
peaks) and that the hardware is reasonable. Where hardware or traffic
volume is the controlling factor, I would much appreciate it if you
could point that out.

On a server where multiple interfaces need to be monitored (let's say
2 pairs of tap connections = 4 physical interfaces), which of the
following would you recommend?

1) Configure one argus server for each pair, with the interface
specified like "ARGUS_INTERFACE=bond:eth1,eth2", and run a radium on
the SAME box to collect from the two servers
2) Same interface config as 1), but move the radium program off host
and have it connect to both servers?
3) Configure one argus server for both pairs, with two interface config
lines like "ARGUS_INTERFACE=bond:eth1,eth2" and
"ARGUS_INTERFACE=bond:eth3,eth4" in argus.conf and move the radium off
host?
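
To make option 1 concrete, this is roughly the layout I have in mind.
It is only a sketch and not a tested setup: the file names, the
monitor IDs, and the port numbers are my own assumptions, chosen so
that the two argus servers and radium do not collide on the same box.

```
# argus-pair1.conf (hypothetical file name): first tap pair
# Monitor ID is an assumed value, one per path
ARGUS_MONITOR_ID=10.0.0.1
ARGUS_INTERFACE=bond:eth1,eth2
ARGUS_ACCESS_PORT=561

# argus-pair2.conf (hypothetical file name): second tap pair
ARGUS_MONITOR_ID=10.0.0.2
ARGUS_INTERFACE=bond:eth3,eth4
# Assumed port; it only needs to differ from the first server's
ARGUS_ACCESS_PORT=562

# radium.conf: collect from both local argus servers
RADIUM_ARGUS_SERVER=localhost:561
RADIUM_ARGUS_SERVER=localhost:562
# Assumed port on which clients (ra, rasplit, etc.) would connect
RADIUM_ACCESS_PORT=563
```

For option 2 I would expect only radium.conf to change: it would live
on the other host and point at the sensor's address instead of
localhost.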

If one of the goals is to keep a copy of unfiltered data on the host,
and then have filtering done on clients connecting to radium:

1) If the local (on sensor) argus specifies an ARGUS_OUTPUT_FILE
option, is it possible to have it automatically split up files like
rasplit does, or should an instance of rasplit be run and have it
connect to the local server? Will this interfere with radium
collecting records as well (or stated another way, how many
connections from clients can an argus server handle)?

2) What are the performance implications of using
RADIUM_CORRELATE="yes"? For instance, let's say we're monitoring two
paths to the internet which are in an active-active state (and
promises of "no asymmetrical routing" have been made), and also
pulling in Cisco netflow from an internal router and writing out the
combined flows to file; if asymmetrical routing occurs would this
setting allow radium to deal gracefully with this issue from a flow
perspective (putting the correct pieces together into a single
bi-directional flow), and can records be audited for occurrences of
this if each path has its own monitor-id? What would happen when the
same flow is seen via Netflow as is seen on the argus instance in
terms of the data kept in the record? Would the richer argus data be
kept, or would the Netflow information be kept?

3) If you wish to have radium perform labeling via specifying
RADIUM_CLASSIFIER_FILE, are there performance hits at some point as
the size of the classifier grows, and how big would a classifier file
have to be before performance was impacted? Or would it be more
related to how complex the labeling requirements were?

4) How well does radium deal with disk IO wait? Which is to say, if
the plan is to have multiple client programs connected to a radium
instance locally, and writing out files locally as well, does radium
have any built in buffering strategies in case disk IO becomes a
bottleneck? I assume that the solution would be to move those clients
off to their own hosts, but I wanted to have an idea of where one
might lose data if this became an issue, i.e., would radium drop it?
Would the client (ra, rasplit, etc) drop it? Would client and server
hold everything in memory until memory was exhausted and crash the
box?

Thanks for your patience with these questions,

Jesse

-- 
Jesse Bowling