racluster memory utilization

Jason dn1nj4 at gmail.com
Fri May 23 05:56:43 EDT 2014


Thanks Jesper.

Indeed, the 50GB is for an extended period of time.  I was hoping to keep
as much of the aggregation as possible within argus (reducing the
possibility of introducing additional aggregation errors).  However, it
seems clear that's not going to scale to 10x that amount of flow data
(which is what I'm ultimately looking to achieve) without throwing
significantly more hardware at the problem.  At this point I'm thinking my
best bet is mini-aggregations over shorter periods, and then a summary
aggregation over the mini-aggregations.
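
Roughly this, as an untested sketch (the paths and the 10-minute window
are placeholders, and I've elided the -m/-f aggregation options):

    # 1. Split the raw flow data into 10-minute segments.
    rasplit -M time 10m -r /data/argus/raw.arg \
            -w "/data/argus/seg.%Y.%m.%d.%H.%M.%S"

    # 2. Mini-aggregation: racluster each segment on its own, so peak
    #    memory is bounded by the busiest 10 minutes, not the whole run.
    mkdir -p /data/argus/agg
    for seg in /data/argus/seg.*; do
        racluster -r "$seg" -w "/data/argus/agg/$(basename "$seg")"
    done

    # 3. Summary aggregation over the already-reduced segment files.
    racluster -r /data/argus/agg/seg.* -w /data/argus/summary.agg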

And thanks for the point on the DDoS...  I hadn't really thought about a
strategy for dealing with that scenario.

Jason

> Date: Fri, 23 May 2014 08:56:53 +0200
> From: Jesper Skou Jensen <jesper.skou.jensen at uni-c.dk>
> Subject: Re: [ARGUS] racluster memory utilization
> To: Argus <argus-info at lists.andrew.cmu.edu>
>
> I'm guessing the 50GB file is from an extended period of time, right?
> You could also try splitting the input Argus log into x-minute
> segments, as large as your box can handle afterwards.
>
> I would split it into e.g. 10-minute segments, run racluster on those
> segments, and then run racluster on all of the raclustered segments at
> the end.  That ought to reduce the memory usage.
>
> Furthermore, if you have e.g. cron-based log analysis/correlation, you
> should run it under ulimit or timeout or similar.  That way a sudden,
> excessively large Argus log (e.g. from a DDoS attack) won't overwhelm
> your server and make everything crash.  Maybe catch the crashes and send
> an alert, or maybe even trigger a "less RAM/CPU consuming" script with
> e.g. a lot of rasplit gymnastics.
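>
> E.g. something like this wrapper around the cron job (an untested
> sketch; the memory cap, the one-hour timeout, and the alert address
> are placeholders):
>
>     #!/bin/sh
>     # Cap the address space (in kB) and the wall-clock time so a
>     # DDoS-sized log can't take the whole server down with it.
>     ulimit -v 8388608        # ~8GB virtual memory cap
>     if ! timeout 3600 racluster -r /data/argus/today.arg -w /data/argus/today.agg
>     then
>         # racluster failed or was killed -- send an alert.
>         echo "argus aggregation failed or was killed on $(hostname)" \
>             | mail -s "argus aggregation alert" alerts@example.com
>     fi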
>
> Just my 5c.
>
>
> Regards
> Jesper
>
>
> On 23-05-2014 07:17, Jason wrote:
> > Thanks Carter.
> >
> > I'm pretty familiar with the data.  It's about 6% DNS.  So I'll
> > continue tweaking the conf file times and see where that leads.
> >
> > That said, I'm still interested to know whether you have any
> > flow-to-memory statistical guidelines for the default settings.
> >
> > Jason
> >
> >
> >
> > On Fri, May 23, 2014 at 12:17 AM, Carter Bullard <carter at qosient.com> wrote:
> >
> >     Hey Jason,
> >     You really need to look at your data to see how it can best be
> >     aggregated.
> >     Do you have a lot of DNS and UDP ??  If not, then your approach
> >     won't help you much, as you default to standard aggregation for
> >     everything else and hold the transactions way too long.
> >
> >     If you do have a lot of DNS, then your approach still won't help
> >     you much, as you're not doing any real aggregation: you specify
> >     the standard 5-tuple aggregation model.
> >
> >     Does this:
> >
> >        racluster -r <filelist> -m saddr daddr proto dport \
> >                  -w /tmp/dns.nsport.out - udp and port domain
> >        racount -r /tmp/dns.nsport.out
> >
> >     Go faster than
> >
> >        racluster -r <filelist> -w /tmp/dns.out - udp and port domain
> >        racount -r /tmp/dns.out
> >
> >     ???  If so, then that's a good candidate for your complex
> >     racluster.conf.
> >
> >     Write the output to a file so you can inspect it for relevance,
> >     correctness, and all the *nesses that will tell you whether any
> >     form of client/server aggregation for DNS buys you anything.
> >
> >     Try tuning down your idle time to 10-30 seconds.
> >     Carter
> >
> >
> >
> >     On May 22, 2014, at 7:46 PM, Jason <dn1nj4 at gmail.com> wrote:
> >
> >>     (Changing the subject to be relevant to the current conversation)
> >>
> >>     Carter,
> >>
> >>     Based on your suggestions below, with 3.0.7.28, I conducted 2
> >>     tests against 50GB of flow files with the following:
> >>
> >>     racluster -r <filelist> -i -nn -c"," -m srcid saddr daddr proto
> >>     dport -Zb -s stime saddr daddr proto sport dport sbytes runtime
> >>     dbytes trans state
> >>
> >>     On a beefy server, I let this run for 70 minutes before killing
> >>     it.  In that time it consumed 33GB of RAM.
> >>
> >>     Next, I added the "-f racluster.conf" option with the following
> >>     configuration:
> >>
> >>     filter="udp and port domain" model="saddr daddr proto sport
> >>     dport" status=0 idle=10
> >>     filter="udp" model="saddr daddr proto sport dport" status=0 idle=60
> >>     filter="" model="saddr daddr proto sport dport" status=0 idle=600
> >>
> >>     I killed this version (which, based on previous list threads, I
> >>     was expecting to consume less memory) after 53 minutes, at which
> >>     point it was consuming 39GB of RAM (read: less time, more RAM).
> >>
> >>     So even with your suggested changes, the RAM utilization still
> >>     seems really high.  Are there changes I should make to the
> >>     racluster.conf file to reduce the memory footprint further?  Do
> >>     you have any statistics correlating volume of flow data to
> >>     memory utilization?
> >>
> >>     I know you mentioned rasqlinsert, but my performance testing
> >>     with another large batch of files indicated the processing
> >>     probably would not finish before the next batch needed to be
> >>     processed.  So I'm thinking that's not really a viable option.
> >>
> >>     Appreciate all the help.
> >>     Jason
> >>
> >>     On Thu, May 22, 2014 at 2:33 PM, Carter Bullard
> >>     <carter at qosient.com> wrote:
> >>
> >>         Hey Jason,
> >>         So you want to do service-based tracking on an IP-address
> >>         basis, but you want to track client- and server-oriented stats.
> >>
> >>         Once you say that you want to track directionality, then
> >>         the "-M rmon" option is not the correct tool.
> >>
> >>         Tracking single IP addresses and all the ports that they offer
> >>         is a great way to go, and "-M rmon" is a good way to do that.
> >>
> >>         What do you get with "racluster -m srcid smac saddr sport"?
> >>         You get the ethernet/IP address pairings and any port that is
> >>         used on that IP address.  This information can answer your
> >>         server questions.  If the port of interest is in the output,
> >>         it was used on that IP address; if there are lots of
> >>         connections with traffic, then you may be able to infer that
> >>         it is a server for that port, but it is not definitive.
> >>
> >>         You should use straight racluster() with a filter that
> >>         ensures that your port operations are valid.
> >>
> >>            racluster -m srcid saddr daddr proto dport -r file - \(syn
> >>         or synack\)
> >>
> >>         This will give you TCP flow records where the dport is the
> >>         service port.
> >>         You will end up with a list of records that are:
> >>
> >>            client -> server.serverPort metrics
> >>
> >>
> >>         You should get yourself a good racluster.conf file and do a
> >>         decent job of defining a cluster scheme that really works.
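> >>
> >>         For instance, along these lines (an illustrative sketch; the
> >>         filters and idle times are placeholders to tune for your mix):
> >>
> >>            filter="udp and port domain" model="saddr daddr proto dport" status=0 idle=10
> >>            filter="udp" model="saddr daddr proto dport" status=0 idle=30
> >>            filter="" model="saddr daddr proto dport" status=0 idle=60
> >>
> >>         Dropping sport from the model collapses the many ephemeral
> >>         client ports into one record per client/server/service triple,
> >>         which is where most of the memory reduction comes from.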
> >>
> >>         Carter
> >>
> >>
> >>         On May 22, 2014, at 12:58 PM, Jason <dn1nj4 at gmail.com> wrote:
> >>
> >>>         Let me clarify and provide a bit more context...  I expect
> >>>         the following flows:
> >>>
> >>>         1.2.3.4:23456 -> 5.6.7.8:34567
> >>>         1.2.3.4:45678 -> 6.7.8.9:34567
> >>>
> >>>         To result in the following output data:
> >>>
> >>>         1.2.3.4 23456 34567
> >>>         1.2.3.4 45678 34567
> >>>         5.6.7.8 34567 23456
> >>>         6.7.8.9 34567 45678
> >>>
> >>>         (in addition to various other stats aggregated with the
> >>>         saddr, sport, dport fields as the key)
> >>>
> >>>         I'm then taking the above data and doing simplistic port
> >>>         groupings, such as "34567 is (typically) part of the app1
> >>>         port group" (think 80, 8000, 8080 as typically "web").  Then
> >>>         I generate a report that says:
> >>>
> >>>         1.2.3.4, client to the app1 port group, X bytes from this
> >>>         client, Y bytes to this client, Z connections from this client
> >>>
> >>>         5.6.7.8, server for the app1 port group, X bytes from this
> >>>         server, Y bytes to this server, Z connections to this server
> >>>
> >>>         6.7.8.9, server for the app1 port group, X bytes from this
> >>>         server, Y bytes to this server, Z connections to this server
> >>>
> >>>         This is a gross oversimplification, but is there a better
> >>>         way to do the above?
> >>>
> >>>         Thanks!
> >>>         Jason
> >>
> >>
> >
> >
>