racluster crashes, and memory utilization
Michael Hornung
hornung at cac.washington.edu
Mon Feb 26 16:53:12 EST 2007
That may well work ok for me, provided I can still process the end result
aggregate file using 'ra' on my 1GB-of-RAM box. =)
If not, I would appreciate feedback on how I could estimate the amount of
memory required for processing with 'racluster' and associated argus
tools, given a known number of source hosts and total bandwidth
consumption.
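For a rough sense of scale, a back-of-envelope sketch (my own assumption, not from the argus documentation: roughly 1 KB of state per aggregated flow, with one flow per distinct source address):

```shell
#!/bin/sh
# Rough memory estimate for aggregating by saddr with racluster.
# ASSUMPTION: ~1 KB of state per aggregated flow; the real footprint
# depends on which DSRs each record carries.
hosts=500000            # distinct source addresses (example value)
bytes_per_flow=1024     # assumed per-flow state, in bytes
echo "$(( hosts * bytes_per_flow / 1024 / 1024 )) MB"
```

With these example numbers the estimate comes out to a few hundred MB, which suggests why a 1GB box struggles once the address population grows.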
I will check for the new RC and try it out when available.
_____________________________________________________
Michael Hornung Computing & Communications
hornung at washington.edu University of Washington
On Mon, 26 Feb 2007 at 16:29, Carter Bullard wrote:
|So I have a workaround, and I'd like to get some feedback.
|I am going to propose a multi-stage process, but hopefully it
|won't be too complex to be useful.
|
|I have added the "-M rmon" option to ra(), which will enable:
|
| ra -r files -M rmon -w - | rasplit -M count 1M -w /tmp/tmpdir/\$snet
|
|This will generate a good number of files in the /tmp/tmpdir directory
|(use any name you like). Each file is named for the network address of
|the data it holds, and is no bigger than 1 million records. If the data
|files are large, additional files with ".aaa" extensions will be built.
|Then to aggregate:
|
| racluster -m saddr -M Replace -R /tmp/tmpdir
|
|This will aggregate each of the intermediate files and replace them.
|Use the -V flag to get racluster() to tell you what it did with each file.
|
|To realize a single file with the aggregated data:
|
| ra -R /tmp/tmpdir -w /tmp/output.file
|
|These will probably be sorted in string (lexicographic) rather than
|numeric order, but hopefully that will be OK?
|
|So, how does that sound as a workaround? You can't currently do this with
|the existing client distribution, so a new one should be up later tonight.
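Strung together, the three stages above might look like the sketch below ("files", /tmp/tmpdir, and /tmp/output.file are example names; the command -v guard simply skips the run when argus-clients isn't installed):

```shell
#!/bin/sh
# Sketch of the proposed three-stage workaround.
TMPDIR=/tmp/tmpdir
mkdir -p "$TMPDIR"

if command -v ra >/dev/null 2>&1; then
    # 1. Split rmon records into per-network files of at most 1M records.
    ra -r files -M rmon -w - | rasplit -M count 1M -w "$TMPDIR/\$snet"

    # 2. Aggregate each intermediate file in place; -V reports what was done.
    racluster -m saddr -M Replace -V -R "$TMPDIR"

    # 3. Merge the aggregated per-network files into a single output file.
    ra -R "$TMPDIR" -w /tmp/output.file
else
    echo "argus-clients not found; skipping"
fi
```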
|
|Carter
|
|
|
|On Feb 26, 2007, at 3:34 PM, Carter Bullard wrote:
|
|> Hey Michael,
|> racluster() can be a memory hog, and 1G isn't really enough to aggregate
|> all the IP addresses in a good-sized network. This may seem perplexing,
|> but when racluster() aggregates, each flow, which in this case is each IP
|> address, tracks an enormous amount of state, possibly too much for this
|> use. This problem is a thorn for a lot of people on the list.
|>
|> It will take me a while to restructure the methods racluster() uses so as
|> to minimize memory, but it is of course doable. If we have a clear set of
|> "what are we trying to do" goals for the output, that may help get an
|> implementation out the door quickly.
|>
|> The same thing applies to sorting, so this is not a bad topic for the
|> list.
|>
|> I'm working on a workaround for IP addresses right now, so let me give you
|> an update in a few days. The workaround is to use rasplit() to write the
|> "-M rmon" data out into a number of files labeled by network address.
|> This will give us a set of files you can aggregate, and then you can
|> merge the data back into a single aggregated file.
|>
|> Hopefully this will work, and I'll have it up on the server tonight.
|>
|> Carter
|>
|>
|> On Feb 26, 2007, at 2:10 PM, Michael Hornung wrote:
|>
|> > Hi, new member to the list here and a new argus user. I'm running an argus
|> > probe and a separate collector, retrieving info via a SASL connection
|> > between the two, and the collector is writing files to disk.
|> >
|> > My collector is OpenBSD 4.0 on a P4 2.8GHz with 1GB physical RAM and many
|> > times that in swap. I'm running RC39. When I try to combine several logs'
|> > worth of data (each log being archived when it reaches a given size) into
|> > one argus stream using 'racluster', I continually run out of memory when I
|> > do not expect to. See an example:
|> >
|> > % ls -l ../archive/20070226-[45]
|> > -rw-r--r-- 1 argus argus 287077860 Feb 26 09:50 ../archive/20070226-4
|> > -rw-r--r-- 1 argus argus 295809628 Feb 26 10:00 ../archive/20070226-5
|> >
|> > % racluster -M rmon -m saddr -r ../archive/20070226-[45] -w clustered
|> > racluster[11726]: 11:04:40.200048 ArgusMallocListRecord ArgusMalloc Cannot
|> > allocate memory
|> > racluster[11726]: 11:04:40.200563 ArgusNewSorter ArgusCalloc error Cannot
|> > allocate memory
|> > Segmentation fault (core dumped)
|> >
|> > (gdb) bt
|> > #0 0x1c04473b in ArgusNewSorter ()
|> > #1 0x1c00262f in RaParseComplete ()
|> > #2 0x1c003b92 in ArgusShutDown ()
|> > #3 0x1c028d69 in ArgusLog ()
|> > #4 0x1c028b9e in ArgusMallocListRecord ()
|> > #5 0x1c03afe2 in ArgusCopyRecordStruct ()
|> > #6 0x1c002d0a in RaProcessThisRecord ()
|> > #7 0x1c0029f5 in RaProcessRecord ()
|> > #8 0x1c01adb8 in ArgusHandleDatum ()
|> > #9 0x1c038713 in ArgusReadStreamSocket ()
|> > #10 0x1c038adb in ArgusReadFileStream ()
|> > #11 0x1c003829 in main ()
|> >
|> > Any ideas? Thanks.
|> >
|> > _____________________________________________________
|> > Michael Hornung Computing & Communications
|> > hornung at washington.edu University of Washington
|> >
|>
|> Carter Bullard
|> CEO/President
|> QoSient, LLC
|> 150 E. 57th Street Suite 12D
|> New York, New York 10022
|>
|> +1 212 588-9133 Phone
|> +1 212 588-9134 Fax
|>
|>
|
|Carter Bullard
|CEO/President
|QoSient, LLC
|150 E. 57th Street Suite 12D
|New York, New York 10022
|
|+1 212 588-9133 Phone
|+1 212 588-9134 Fax
|
|