[ARGUS] argus configuration here ...
Peter Van Epp
vanepp at sfu.ca
Thu Oct 14 12:27:19 EDT 2004
Mike asked off list if I'd share my current config, and since I expect
others may be interested I'll do it on list. We have 3 major campuses
around town (and another 3 or 4 minor presences), pretty much fully
interconnected with dark fibre. We are also one of the 5 owners of the local
regional optical network (who technically owns the dark fibre) which in turn
is currently the GigaPop operator for the CA*net4 presence in BC. The Gigapop
resides in space leased at our downtown campus. We have a clear channel Gig
link into the Gigapop and thus CA*net4/I2 in the US, and a 100 meg connection
with a packeteer on it to control cost :-) to the transit exchange
at the same place for commodity traffic. As well we have a grid computing
research project that houses a 200 terabyte file store and some smaller
machines (a ~192-node Beowulf cluster, an IBM G4 blade server, soon a Cray) here, and
major computing nodes (a 1500-node Beowulf cluster, a large SGI, a large Alpha
I think) at other Universities both here in town and in Alberta (one province
over). They interconnect over a 1 gig lightpath via C4 (the C4 backbone
is OC192), on its way to 10 gig at least locally. That link is currently
"testing" one of my argus boxes to make sure my SysKonnect cards are OK (they
had borrowed my machine and had trouble with Linux and the SysKonnect cards,
they obviously need to upgrade to a real operating system :-)) and should be
the most interesting because it has been known to peak at ~950 megabits per
second (although it hasn't done so since argus has been on). As well one of
our remote campuses doesn't yet have fibre and so is limping along on a 10 meg
FDX circuit and has its own standalone argus watching it. So let's start
with hardware. I have two identical sensor boxes which consist of Tyan Thunder
motherboards with dual 1.4 gig Athlon processors and 500 megs of RAM. Each of
those has 2 SysKonnect sx fibre gig cards in the 64 bit PCI slots, a 3c905b in
a slot for connectivity and dual on board 3c905b used for monitoring the
commodity link. This particular motherboard was bought because a benchmark
from a US university achieved ~950 megabits per second (which we have been
able to duplicate regularly; I was able to get up to about 1.6 gig FDX during
tests, and I think the limitation is likely memory bandwidth but may be
interrupt load)
with this motherboard/card combo. It isn't cheap, but it is capable of wire
speed and I highly recommend them (DAG cards are probably even better, but
more expensive yet :-)).
The original intent was one box for C4 and one box for commodity, but
traffic has been light enough so far that a single sensor box does both
commodity and C4 at the moment. The links are both tapped (a netoptics fibre
tap for C4, and a Finisar (née Shomiti) 10/100 Century Tap for commodity, about
to be replaced by dual 4 port netoptics regen taps) and feed the SysKonnect
cards and the 2 on board NICs for FDX operation on the first sensor machine.
That machine runs argus_bpf (on FreeBSD 4.10-RELEASE) writing to a socket (no
local archiving to disk for performance reasons) and basically loafs :-).
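For anyone wanting to duplicate that piece, the sensor side amounts to running
argus as a daemon serving records on its access port rather than with a -w
file. A minimal sketch, not my actual startup (the binary path, interface
name and port are only placeholders, and the equivalent settings can
presumably live in argus.conf instead):

    #!/usr/bin/perl
    # Sketch: start argus on the sensor serving flow records over a TCP
    # socket instead of archiving to local disk.  Paths/names are examples.
    use strict;
    use warnings;

    my $argus = "/usr/local/sbin/argus_bpf";   # the bpf build on FreeBSD
    exec($argus,
        "-d",               # run as a daemon
        "-i", "sk0",        # SysKonnect interface fed by the tap
        "-P", "561",        # serve records to remote ra clients on this port
                            # (no -w file, so nothing is written locally)
    ) or die "could not exec $argus: $!\n";
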
Beside it is a 600 MHz P3 machine with 750 megs of RAM and a couple of 3c905B
cards. One card connects via a crossover cable to the sensor box 3c905b in the
expansion slot (so external network events don't affect argus data capture).
This machine has an 80 gig IDE disk and is the primary archive host. A copy of
ra (daemonized by a quick and dirty perl script) listens on the crossover
cable interface and argusarchive runs out of cron to cycle the logs every
hour. On both machines, perl scripts running out of cron check that the
appropriate argus tasks are around and will kill and restart them if they
look to have gone west in a variety of interesting ways (a sketch of that
sort of watchdog follows this paragraph). On the archive host
there are more perl scripts (that argusarchive triggers) which establish an
ssh connection to a third machine in my office with 400 gigs of IDE disk and
transfer (reasonably reliably, although it still needs some more work) the
archive files to a second archive for long term storage (backed up to our
robot tape library for archival storage). All the perl scripts that post
process the data for reports run on this third machine so screw ups or running
the machine out of memory don't affect the production sensor/archive pair. As
well because the post processing machine is a fibre link (and some 20 KM) from
the sensors, should the link go down, the archive machine beside the sensor
will happily puff its cheeks up with the data and then spit it all out
(assuming it hasn't run out of disk which is fairly unlikely) when the link
comes back up.
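For what it's worth, the check/restart logic is nothing fancy. Something
roughly like the following, run hourly from cron, will keep a daemonized ra
attached to the sensor (a sketch only, with made-up host, port and paths,
not the production script):

    #!/usr/bin/perl
    # Sketch of a cron watchdog for the archive host: if the ra that should
    # be attached to the sensor isn't in the process list, restart it.
    use strict;
    use warnings;
    use POSIX qw(setsid);

    my $ra      = "/usr/local/bin/ra";
    my $sensor  = "192.168.1.1:561";            # sensor end of the crossover cable
    my $outfile = "/argus/archive/argus.out";   # file argusarchive cycles hourly

    # Crude liveness test: look for an ra talking to this sensor.
    my $running = grep { /\bra\b.*\Q$sensor\E/ } `ps axww`;

    unless ($running) {
        defined(my $pid = fork) or die "fork failed: $!\n";
        exit 0 if $pid;         # parent exits so the cron job finishes
        setsid();               # new session so ra runs on its own
        exec($ra, "-S", $sensor, "-w", $outfile)
            or die "could not exec $ra: $!\n";
    }
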
As of last weekend my other Athlon box is kluged into the gig link for the
grid computing network (via our multiport monitoring fibre tap for the moment,
until I get a dedicated second tap or the regen taps come).
That sensor is identical (except without the 100 meg interfaces) to the one downtown
but its ra archive is running on my post processing box rather than a dedicated
archiving box because it is mostly experimental. I have hacked argusarchive
so that it takes an instance name and stores the data for all the various
instances in their own archives, and the post processing perl scripts all
deal with instance names (and use locking to serialize post processing to
avoid blowing the machines out of memory most of the time; see the sketch
after this paragraph). The initial
versions of all the scripts are available via anon ftp at ftp.sfu.ca in
/pub/unix/argus/argus.traffic.perl.tar.gz
Once I get more bugs and performance issues resolved I'll do a further release
(the interhost transfer code isn't there yet, for instance).
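The locking that serializes the post processing can be as simple as flock()
on a common lock file, along these lines (a minimal sketch with made-up
paths, and not necessarily how my scripts do it; the real ones are in the
tarball above):

    #!/usr/bin/perl
    # Sketch: serialize post-processing runs across instances (commodity,
    # C4, Westgrid) so they don't all run at once and blow out memory.
    use strict;
    use warnings;
    use Fcntl qw(:flock);

    my $instance = shift @ARGV or die "usage: $0 instance\n";
    my $lockfile = "/var/run/argus-postprocess.lock";   # shared by all instances

    open my $lock, '>', $lockfile or die "can't open $lockfile: $!\n";
    flock($lock, LOCK_EX) or die "can't lock $lockfile: $!\n";  # wait our turn

    # ... post-process /argus/archive/$instance here into the hourly
    #     summary files ...

    flock($lock, LOCK_UN);
    close $lock;
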
As noted earlier every hour when the archive cycles a perl script
is kicked off on the post processing host which acquires traffic and scan
information from the last hour's data and writes it to a spool directory.
Each morning at 6 AM another perl script kicks off and post-processes the
hourly summary files into a set of reports covering the entire day (currently
not including port scans which are still being worked on) for commodity, C4
and now Westgrid argus instances. It keeps the last 7 days of reports around
unless I choose to move them for some reason; they are entirely expendable since
they can be easily recreated from the argus archive data.
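The hourly acquisition step is conceptually just "run an argus client over
the file argusarchive just cycled and park the output for the 6 AM run". A
stripped-down sketch (paths are placeholders, and the real scripts gather
considerably more than racount output):

    #!/usr/bin/perl
    # Sketch: after argusarchive cycles the logs, summarize the newest
    # archive file for an instance into a spool area for the daily report.
    use strict;
    use warnings;

    my $instance = shift @ARGV or die "usage: $0 instance\n";
    my $archive  = "/argus/archive/$instance";
    my $spool    = "/argus/spool/$instance";

    # Newest hourly file for this instance.
    my ($latest) = sort { -M $a <=> -M $b } glob("$archive/*");
    die "no archive files under $archive\n" unless defined $latest;

    # racount gives flow/packet/byte totals for the hour.
    my $summary = `racount -r $latest`;

    open my $out, '>>', "$spool/hourly.txt"
        or die "can't write $spool/hourly.txt: $!\n";
    print $out "=== $latest ===\n$summary";
    close $out;
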
The Surrey campus which is currently on the end of an independent
10 meg service runs an all-in-one argus instance on a 1 gig P3 PC. Because the
link is only 10 meg FDX we haven't separated the sensor and archive host, so the
entire process including post processing runs on that one box (so far without apparent
problem). When the dark fibre appears, it will move behind our border router
and become part of the main argus instance and the local one will either go
away or be upgraded to a split sensor / archiver at gig (because the link will
go to gig) if they choose to and still want to see what's going on on their link
out.
Peter Van Epp / Operations and Technical Support
Simon Fraser University, Burnaby, B.C. Canada