racluster memory utilization

Thu May 22 19:46:59 EDT 2014

(Changing the subject to be relevant to the current conversation)

Carter,

Based on your suggestions below, with 3.0.7.28, I conducted 2 tests against
50GB of flow files with the following:

racluster -r <filelist> -i -nn -c"," -m srcid saddr daddr proto dport -Zb \
  -s stime saddr daddr proto sport dport sbytes runtime dbytes trans state

On a beefy server, I let this run for 70 minutes before killing it.  In
that time it consumed 33GB of RAM.

Next, I added the "-f racluster.conf" option with the following
configuration:

filter="udp and port domain" model="saddr daddr proto sport dport" status=0 idle=10
filter="udp" model="saddr daddr proto sport dport" status=0 idle=60
filter="" model="saddr daddr proto sport dport" status=0 idle=600

I expected this version to consume less memory, based on previous list
threads, but I killed it after 53 minutes, by which point it was consuming
39GB of RAM (read: less time, more RAM).

So even with your suggested changes, the amount of RAM utilization still
seems really high.  Are there changes I should make to the racluster.conf
file to reduce the memory footprint further?  Do you have any kind of
statistics correlating volume of flow data to volume of memory utilization?
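
One rough way to reason about the footprint is to count how many distinct
aggregation keys a sample of the data produces, on the assumption that
racluster holds one aggregate per distinct key until it is flushed. The
sketch below is not part of argus. It assumes comma-separated records with
no header line, in the column order of the -s list above, and the sample
rows are made up. It compares the key from the -m option (minus srcid,
which is not in the -s list) against the model in the racluster.conf rules,
which adds sport. If the racluster.conf model is the one that takes effect
(an assumption), the larger key space may be one factor in the higher RAM use.

```python
import csv

# Assumed column order: the -s list from the racluster command above.
FIELDS = ["stime", "saddr", "daddr", "proto", "sport", "dport",
          "sbytes", "runtime", "dbytes", "trans", "state"]

# Made-up sample records, for illustration only.
SAMPLE = [
    "1400787000.0,1.2.3.4,5.6.7.8,tcp,23456,34567,100,1.0,200,1,CON",
    "1400787001.0,1.2.3.4,5.6.7.8,tcp,23457,34567,50,1.0,75,1,CON",
    "1400787002.0,1.2.3.4,6.7.8.9,tcp,45678,34567,10,1.0,20,1,CON",
]


def count_distinct_keys(lines, key_fields):
    """Return (record_count, distinct_key_count) for the given key fields."""
    reader = csv.DictReader(lines, fieldnames=FIELDS)
    keys = set()
    total = 0
    for row in reader:
        total += 1
        keys.add(tuple(row[name] for name in key_fields))
    return total, len(keys)


# Key from the -m option (without srcid).
without_sport = count_distinct_keys(SAMPLE, ["saddr", "daddr", "proto", "dport"])
# Key from the model in the racluster.conf rules.
with_sport = count_distinct_keys(
    SAMPLE, ["saddr", "daddr", "proto", "sport", "dport"])

print("records, keys without sport:", without_sport)
print("records, keys with sport:   ", with_sport)
```

Running this against a representative slice of real output would show how
much the choice of key fields changes the number of aggregates in play. It
says nothing about bytes per aggregate, which would have to be measured.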

I know you mentioned rasqlinsert, but my performance testing on another
large batch of files indicated that the processing probably would not
finish before the next batch needed to be processed.  So I don't think
that's really a viable option.

Appreciate all the help.
Jason

On Thu, May 22, 2014 at 2:33 PM, Carter Bullard <carter at qosient.com> wrote:

> Hey Jason,
> So you want to do service based tracking on an IP address basis,
> but you want to track client and server oriented stats.
>
> Once you say that you want to track directionality, then
> the "-M rmon" option is not the correct tool.
>
> Tracking single IP addresses and all the ports that they offer
> is a great way to go, and "-M rmon" is a good way to do that.
>
> What do you get with "racluster -m srcid smac saddr sport"?
> You get the ethernet and IP address pairings, and any port that
> is used on that IP address.  This information can answer your
> server questions.  If the port of interest is in the output,
> it was used on that IP address.  If there are lots of connections
> with traffic, then you may be able to infer that it is a server
> for that port, but it is not definitive.
>
> You should use straight racluster() with a filter that
> assures that your port operations are valid.
>
>    racluster -m srcid saddr daddr proto dport -r file - \(syn or synack\)
>
> This will give you TCP flow records where the dport is the service port.
> You will end up with a list of records that are:
>
>    client -> server.serverPort metrics
>
>
> You should get yourself a good racluster.conf file and do a decent job
> on defining a cluster scheme that really works.
>
> Carter
>
>
> On May 22, 2014, at 12:58 PM, Jason <dn1nj4 at gmail.com> wrote:
>
> Let me clarify and provide a bit more context...  I expect the following
> flows:
>
> 1.2.3.4:23456 -> 5.6.7.8:34567
> 1.2.3.4:45678 -> 6.7.8.9:34567
>
> To result in the following output data:
>
> 1.2.3.4 23456 34567
> 1.2.3.4 45678 34567
> 5.6.7.8 34567 23456
> 6.7.8.9 34567 45678
>
> ((in addition to various other stats aggregated with the saddr,sport,dport
> fields as the key))
>
> I'm then taking the above data and doing simplistic port groupings, such
> as "34567 is (typically) part of the app1 port group" (think 80, 8000, 8080
> as typically "web").  Then I generate a report that says:
>
> 1.2.3.4, client to the app1 port group, X bytes from this client, Y bytes
> to this client, Z connections from this client
>
> 5.6.7.8, server for the app1 port group, X bytes from this server, Y bytes
> to this server, Z connections to this server
>
> 6.7.8.9, server for the app1 port group, X bytes from this server, Y bytes
> to this server, Z connections to this server
>
> This is a gross oversimplification, but is there a better way to do the
> above?
>
> Thanks!
> Jason
>
>
>
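
The per-address report described in the quoted message above (client and
server rows keyed by port group) could be built from "client ->
server.serverPort" records roughly as follows. This is only a sketch under
several assumptions: the input is comma-separated with no header line, the
columns follow the -s list from the racluster command at the top of this
message, dport is the service port (which the syn/synack filter is meant to
ensure), and the trans field is used as a stand-in for a connection count.
The PORT_GROUPS table and the sample records are hypothetical.

```python
import csv
from collections import defaultdict

# Assumed column order: the -s list from the racluster command.
FIELDS = ["stime", "saddr", "daddr", "proto", "sport", "dport",
          "sbytes", "runtime", "dbytes", "trans", "state"]

# Hypothetical port groupings, for illustration only.
PORT_GROUPS = {"80": "web", "8000": "web", "8080": "web", "34567": "app1"}


def role_report(lines):
    """Sum bytes and connections per (address, role, port group)."""
    totals = defaultdict(
        lambda: {"bytes_from": 0, "bytes_to": 0, "connections": 0})
    for row in csv.DictReader(lines, fieldnames=FIELDS):
        group = PORT_GROUPS.get(row["dport"], "other")
        sbytes = int(row["sbytes"])
        dbytes = int(row["dbytes"])
        trans = int(row["trans"])

        # The source address is treated as the client for this service port.
        client = totals[(row["saddr"], "client", group)]
        client["bytes_from"] += sbytes
        client["bytes_to"] += dbytes
        client["connections"] += trans

        # The destination address is treated as the server.
        server = totals[(row["daddr"], "server", group)]
        server["bytes_from"] += dbytes
        server["bytes_to"] += sbytes
        server["connections"] += trans
    return dict(totals)


# Made-up records matching the two example flows in the quoted message.
SAMPLE = [
    "1400787000.0,1.2.3.4,5.6.7.8,tcp,23456,34567,100,1.0,200,1,CON",
    "1400787002.0,1.2.3.4,6.7.8.9,tcp,45678,34567,10,1.0,20,1,CON",
]

report = role_report(SAMPLE)
for (addr, role, group), stats in sorted(report.items()):
    print(addr, role, group, stats)
```

Because the roles come straight from saddr and daddr, this needs only one
pass over directional records and does not require duplicating each flow
per endpoint beforehand.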