n00b Questions

Peter Van Epp vanepp at sfu.ca
Fri Aug 28 17:42:28 EDT 2009


On Fri, Aug 28, 2009 at 12:53:14AM -0600, John Kennedy wrote:
> While reading the argus website for System Auditing, it got me thinking;

	Always a good thing :-)

> With multiple ways to collect analyze and store Argus data, I am curious how
> some have tackled the collection, processing, management and storage of it?
> I am always curious when it comes to how others do it because like
> programming there is almost always more than one way to do it.  I would also
> like to find out if there are ways in which I could be more efficient.
> 
> I use argus strictly for Network Security Monitoring.  In an ArcSight
> webinar I attended the other day the presenter said "Your business paints a
> picture everyday... is anyone watching" For me, argus helps connect the dots
> in order to see the picture(s).  I could throw many more analogies here, but
> I think you get the point.
> 
> It has come time for me to refresh some of the hardware that argus is
> running on.  In order to effectively put together a proposal that will meet
> the needs of my monitoring efforts for the enterprise, I would like to
> understand a little about how those on this list are deploying argus.

	Not enough information here to really help much. Link speeds and 
utilization would be a good starting point, since they determine what hardware
you need. That said, I'll include some pointers that haven't yet made the web
pages. I'll use my former employer's installation as an example: two links,
one to commodity Internet capped at ~200 megabits per second, and a clear
channel gig to CA*net (the Canadian version of Internet2, but typical
utilization ~80 megabits unless atlas testing is occurring :-)). When I
retired we had 2 terabytes of argus data on the analysis machine (older data
migrated to tape), which left about 6 to 8 months of data online. Perl scripts
(getting old now and no longer maintained) process the data on an hourly basis
and then produce a summary of the last 24 hours early in the morning. Someone
(perhaps :-)) reads that and takes action as required. 
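	For concreteness, that hourly/daily rhythm could be driven from cron
along these lines. This is only a sketch: argusarchive ships with the argus
clients, but /usr/local/scripts/daily_summary.sh and the "noc" mail target are
hypothetical placeholders for whatever reporting you run yourself:

```
# crontab fragment (sketch): roll the archive hourly, summarize daily.
# daily_summary.sh is a hypothetical stand-in for your own reporting
# scripts over the last 24 hours of data.
0 * * * *  /usr/local/bin/argusarchive
15 6 * * * /usr/local/scripts/daily_summary.sh 2>&1 | mail -s "argus daily summary" noc
```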

> 
> For me processing the data is the hardest hurdle i have to overcome each
> day.  The server in which I run the reporting from is on a dual core
> processor with 2 gigs of ram and 500 Gigs of storage.  Is this typical?
> Retention is also an issue.  On my sensors I run argus and write the data to
> a file. 

	About here alarm bells go off :-). Depending on your link speeds and 
utilization some or all of this applies to you (i.e. you may be losing packets
without being aware of it):

        Argus will run on a single machine, archiving data to local
disk. For an evaluation of argus to see if it is useful to you,
this is likely the appropriate place to start, so pretty much any fairly
recent machine with an Intel NIC and a Unix of some flavor (whatever you
are comfortable with) will do. Argus is perfectly happy to combine the
output from two NICs if you are monitoring a full-duplex link with a tap.
A command like this one will do what you need (after installing argus and
its clients):

/usr/local/bin/argus -i nfe0 -i nfe1 -dmJRU 512 -w /var/log/argus/argus

(replacing nfe0 and nfe1 with the appropriate NIC ids for your system)

This will run argus writing the archived data to /var/log/argus/argus. The
argus man page explains what all the switches do (so modify them to taste
and privacy concerns). This assumes the monitored link is connected to
a network tap, with the receive monitor output of the tap connected to NIC
nfe0 and the transmit monitor output connected to NIC nfe1 (or vice versa;
order doesn't matter). For operation on a span port (where transmit and
receive are combined, so your monitored link utilization needs to be less
than 50%) remove the "-i nfe1".
        Since this example is a FreeBSD machine, as root you also need to do
this:

sysctl net.bpf.bufsize=524288

which boosts the bpf buffer from its default 4k value to half a meg or so to
prevent packet loss. At this point I don't know the equivalent command on
Linux (I use pf-ring on Linux so this isn't an issue, although many other
things are).
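        To make that buffer increase survive a reboot on FreeBSD, the same
setting can go in /etc/sysctl.conf (a sketch; the value is the one given
above):

```
# /etc/sysctl.conf fragment: apply the larger bpf buffer at every boot.
net.bpf.bufsize=524288
```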
        Note that with this setup somewhere in the 30 to 50 megabit link
utilization range you will begin losing packets. For an initial evaluation
that isn't likely to be important, but when you discover argus is invaluable
you will probably want to read further to reduce your packet loss.
        The first step up the performance ladder (where cost very likely
increases with performance) is to separate the sensor and archiving machines.
It is believed (but not yet verified) that disk I/O for archiving eats
bus / memory bandwidth and causes the packet loss, so the first solution is
to run separate sensor and archive machines. The sensor machine
(which needs to be the fastest and should preferably be little-endian)
has the NIC cards and runs argus configured to write data to a socket.
One or two NICs (depending on the link, as above) monitor the input link and
a second or third NIC connects the sensor machine to the archive machine.
In my experience there is around a 400 to 1 reduction from link utilization
to argus output data (this ratio may be smaller on 3.0, which collects more
data).
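        As a rough sanity check on that ratio (my assumed numbers, not a
measurement): at 400 to 1, even the fully capped 200 megabit commodity link
from the example above only produces about half a megabit of argus records:

```shell
# Back-of-envelope sizing: argus stream = link rate / reduction ratio.
# The 200 Mbit/s link and the ~400:1 ratio from the text are the inputs.
awk 'BEGIN { link = 200; ratio = 400; printf "argus stream: %.2f Mbit/s\n", link / ratio }'
```

which prints "argus stream: 0.50 Mbit/s" — easily carried by a cheap NIC to
the archive machine.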
        Continuing the example above, we add a third NIC, nfe2, and give it an
IP address (the two monitor NICs don't need an IP and are better off without
one), which is also the default route for the machine. Either a crossover
cable or a connection to a network switch connects it to the archive machine.
The configuration looks like this:

                                  | sensor machine |       | archive machine |
network tap -- rx monitor port --- nfe0
            -- tx monitor port --- nfe1     nfe2 --------- nfe0
                                       192.168.1.1         192.168.1.2

on the sensor machine argus runs as:

/usr/local/bin/argus -i nfe0 -i nfe1 -dmJRU 512 -B192.168.1.1 -P 561

which writes the argus data to a socket rather than disk. On the archive
machine (which as noted can be much less powerful than the sensor machine)
the ra argus client is used to receive and archive the argus data to disk
via a command like this (note this is ancient and now obsolete; one of the
new clients should be used instead, but the theory is the same: don't write
to disk on sensor machines):

/usr/local/bin/ra -S192.168.1.1:561 -w /var/log/argus/argus &

This writes the argus data to the file /var/log/argus/argus, where the
argusarchive shell script running from cron will move the data to the archive.
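	A minimal sketch of that hourly compress-and-move step (paths and
naming are assumed; a scratch directory stands in for /var/log/argus here so
it can be tried anywhere):

```shell
#!/bin/sh
# Hourly roll sketch: compress the closed-out argus file, then move it
# into the archive directory. ARGUS_DIR would be /var/log/argus in real use.
ARGUS_DIR=$(mktemp -d)
ARCHIVE="$ARGUS_DIR/archive"
STAMP=$(date +%Y.%m.%d.%H)
: > "$ARGUS_DIR/argus.$STAMP"      # stand-in for the file argus wrote
mkdir -p "$ARCHIVE"
gzip "$ARGUS_DIR/argus.$STAMP"     # leaves argus.$STAMP.gz behind
mv "$ARGUS_DIR/argus.$STAMP.gz" "$ARCHIVE/"
echo "archived argus.$STAMP.gz"
```

The real argusarchive script does more (dated directory trees, etc.), but
this is the shape of it.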
	The best way to assess this is to use something like tcpreplay or
another traffic generator to create a known amount and speed of traffic on a
test setup and see how much loss you get (and then the fun of finding where
the loss is occurring starts :-)). 
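	For example, after replaying a capture at a known rate (something like
"tcpreplay --intf1=eth0 --mbps=100 sample.pcap" on the generator host), the
loss percentage falls out of the sent/seen packet counts; the counts below
are made-up placeholders, not measurements:

```shell
# Loss percentage from generator vs. sensor packet counts (both assumed):
#   sent = packets in the replayed pcap (e.g. from capinfos)
#   seen = packets the sensor's argus/tcpdump counters report
sent=100000
seen=99400
awk -v s="$sent" -v r="$seen" 'BEGIN { printf "loss: %.1f%%\n", (s - r) * 100 / s }'
```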

>      Every hour I have a script that takes the file compresses it and
> copies it to an archive. Every 4 hours I rsync it to the server.  On the
> server I have some scripts that process the last four hours of files that
> were just Rsynced.  I realize that I could use radium() to save files to my
> server; however with only a 500 gig RAID it gets a little tight with 5
> sensors. I keep archives on the sensors themselves to aid in some retention.
> The sensors by-the-way have a 200 Gig RAID.  When I first was working with
> argus and finding equipment to use. I was sure that 500 gig would be
> plenty... It's 500 gig, for crying out loud.

	When I started using argus (15 or more years ago) a 2 gig SCSI disk 
was a couple of grand. These days terabyte drives are under $200 here 
(probably under $100, I haven't looked lately :-)). Disk is cheap. It isn't 
clear that RAID buys you a lot, as disk I/O (as long as it isn't on the sensor
machine) probably isn't the limiting factor unless your link speed/utilization
is high, but that said: test, test, test!
> 
> So, give a n00b some feedback.
> 
> Thanks
> 
> John

Peter Van Epp


