best way to process data from multiple hosts?

Ken A ka at pacific.net
Mon Dec 8 11:18:31 EST 2008


Carter Bullard wrote:
> Hey Ken,
> Try using these two programs, rasplit() and/or rastream().
> Both programs split an incoming argus stream, and you can
> set it up to split based on the argus data source id, which should
> give you your separation.
> 
> Focusing on rasplit(), this is what I do on all my collection sinks:
> 
>    rasplit -S radii:561 -M time 5m -w /path/to/archive/\$srcid/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
> 
> This will take in a generic stream, and split the data into
> an "argus source id" rooted, time based file structure, where
> each file represents the data in a given 5 minute time span.
> As time goes on, rasplit() creates new files, so your archive
> grows as needed.
> 
> Rasplit() will break records across time boundaries, so that
> the stats are preserved correctly when graphing, processing,
> analyzing, whatever.
> 
> Because rasplit() can connect to up to 64 or so remote sources,
> and because they can all be argi, or radii, or a mix of the two, you
> don't have to worry much about the collection tree structure.
> 
> So I recommend, at first, one radium() to connect to all your sources,
> and one rasplit() to connect to your radium().  Where the radium
> resides is not important, as long as you have resources.   But having
> radium() and rasplit() on the same machine has its advantages, in
> terms of reliability, performance, etc....  When you have
> dozens of programs reading the data from radium() at once, then
> having them local becomes more important.
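A minimal sketch of the layout Carter describes, one radium() feeding one rasplit(). The RADIUM_ARGUS_SERVER directive is the one Ken mentions below; the hostnames are illustrative, and localhost:561 assumes radium is listening locally on the default argus port:

```shell
# radium.conf fragment -- one RADIUM_ARGUS_SERVER line per collector
# (sensor hostnames are illustrative):
#
#   RADIUM_ARGUS_SERVER=sensor1.example.com:561
#   RADIUM_ARGUS_SERVER=sensor2.example.com:561
#
# One rasplit then reads radium's merged stream and files records into a
# source-id-rooted, 5-minute-binned archive (command adapted from
# Carter's example above):
rasplit -S localhost:561 -M time 5m \
        -w /path/to/archive/\$srcid/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
```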

Thanks for the info. I'll give radium a try.

fwiw, I was experimenting with using "rasplit -d -S $source" to connect 
directly to the source (without radium). I encountered a problem where 
rasplit doesn't die without 'kill -9'. After a 'kill -9', ragraph can no 
longer read the rasplit generated log file beyond the time when rasplit 
was killed. It looks like a partial 'UNK' record corrupts the file.
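Until the shutdown behavior is sorted out, a generic stop pattern avoids reaching for 'kill -9' first. Whether rasplit actually exits cleanly on SIGTERM is an assumption here; the function itself is tool-agnostic:

```shell
# Try SIGTERM first, escalate to SIGKILL only after a grace period.
# (Assumption: the target -- e.g. rasplit -- flushes and exits on SIGTERM
# when it can; SIGKILL remains the last resort.)
stop_gracefully() {
    pid=$1
    grace=${2:-5}
    kill -TERM "$pid" 2>/dev/null || return 0    # already gone
    i=0
    while [ "$i" -lt "$grace" ]; do
        if ! kill -0 "$pid" 2>/dev/null; then
            return 0                             # exited on its own
        fi
        sleep 1
        i=$((i + 1))
    done
    kill -KILL "$pid" 2>/dev/null                # last resort
}
```

Usage would look like `stop_gracefully "$rasplit_pid" 10` (finding the pid is left to the operator).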

Thanks,
Ken


> 
> rastream() is just rasplit(), but it can also run a script against the
> archive files after some hold time period (-B option).  If you know that all the records
> for a given 5 minute time period have finally shown up, then you can
> process the argus data file, (i.e. aggregate it, generate alarms and
> alerts against it, compress it, index it, whatever), using the script
> provided on the command line.
> 
> Rastream() cannot be used very well with Netflow records, as they have
> a bad habit of not coming out of the router when you would like, so
> use rasplit() if you are also collecting Netflow records.
> 
> Carter
> 
> On Dec 7, 2008, at 10:07 PM, Ken Anderson wrote:
> 
>> Hello,
>> I'm new to argus. ragraph and racluster are very cool!
>>
>> Currently, I have argus running on 7 or 8 machines listening on port
>> 561. I would like to monitor these on a single machine and keep each
>> server's argus logs separate.
>>
>> Is radium capable of opening 1 log for each RADIUM_ARGUS_SERVER? That
>> would be nice, I think. I could run rasplit -d for each stream, or run
>> multiple instances of radium, I suppose.
>> Is there a better way I've overlooked?
>>
>> Thanks for any ideas,
>> Ken
>>
>>
>> -- 
>> Ken Anderson
>> Pacific.Net
>>
>>
> 
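A hedged sketch of the rastream() variant of the same pipeline. The -B hold time is from Carter's note above; the -f script flag, the other flags mirroring rasplit's, and the script receiving the finished filename as its first argument are all assumptions to verify against the rastream man page:

```shell
# Hypothetical rastream invocation -- hold each 5-minute file for 15s
# past its close, then hand it to a post-processing script:
#
#   rastream -S localhost:561 -M time 5m -B 15s \
#            -f /usr/local/bin/process_argus.sh \
#            -w /path/to/archive/\$srcid/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
#
# process_argus.sh (assumed to receive the finished file as $1):
#
#   #!/bin/sh
#   # aggregate, index, alarm, etc. here, then compress
#   gzip "$1"
```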


-- 
Ken Anderson
http://www.pacific.net/
(707) 468-1005


