Argus memory issues

Mon Aug 20 16:05:10 EDT 2007

Hey Peter,
Oh, let's adjust the timeout values, with everything else back the way you
would normally run it.  The goal here is to understand whether you're just
churning through a lot of very short-lived flows.  The current timeouts are
pretty long (30 secs), so if you're getting 200K flows per second of
low-volume flows, then this should bring you back into a healthy range.
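
As a rough back-of-the-envelope check on why the timeout matters: in steady
state, the number of flows sitting in the cache is roughly the arrival rate
times how long each flow is held.  The sketch below plugs in the 200K
flows/sec and 30 sec figures from above, plus a 5 sec hold time for
comparison.  The function name is made up for illustration, and the memory
used per flow record is not stated here, so this counts flows only.

```python
def cached_flows(flows_per_sec: int, hold_secs: int) -> int:
    """Steady-state flow count: arrival rate times hold time (Little's law)."""
    return flows_per_sec * hold_secs


# Figures from the message: 200K short-lived flows/sec, 30 sec timeout.
current = cached_flows(200_000, 30)
# Same arrival rate, but each flow held for only 5 secs (assumed comparison).
shorter = cached_flows(200_000, 5)

print(f"{current:,} flows cached at 30 s, {shorter:,} at 5 s")
```

Under those assumptions the cache holds about 6 million flows with the 30 sec
timeout versus about 1 million if flows are released after 5 secs.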

If this is useful, then the workaround is very easy, as I can
put in the logic to give flows with (pkts < 3) a zero timeout value, which
should get your memory back.  That is a much easier fix than enforcing
a small memory footprint.

So in this strategy, argus would hold any flow for the status interval
(hopefully that is a low number; 5 secs is good, as 90% of flows live
less than 2.5 seconds), and then for low-volume flows, we immediately
deallocate the flow cache.
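
A minimal sketch of that policy, written in Python rather than argus's own C
just to pin down the logic.  All of the names here (STATUS_INTERVAL,
IDLE_TIMEOUT, LOW_VOLUME_PKTS, idle_timeout, can_free) are hypothetical and
not taken from the argus source; the values are the ones mentioned above.

```python
STATUS_INTERVAL = 5   # secs every flow is held before it can be freed (assumed)
IDLE_TIMEOUT = 30     # secs; the current timeout for ordinary flows
LOW_VOLUME_PKTS = 3   # flows with fewer packets than this count as low volume


def idle_timeout(pkts: int) -> int:
    """Idle timeout for a flow: zero for low-volume flows, else the default."""
    return 0 if pkts < LOW_VOLUME_PKTS else IDLE_TIMEOUT


def can_free(pkts: int, age_secs: float, idle_secs: float) -> bool:
    """True once the flow has been held for the status interval and has been
    idle for at least its timeout."""
    return age_secs >= STATUS_INTERVAL and idle_secs >= idle_timeout(pkts)


# A 1-packet flow is freed as soon as the status interval has passed.
print(can_free(pkts=1, age_secs=5, idle_secs=5))
# A 10-packet flow with the same age still waits out the 30 sec timeout.
print(can_free(pkts=10, age_secs=5, idle_secs=5))
```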

If that is not good enough, we move to the next step, which is to have
different memory strategies for different flow types.  Currently we have
only one big flow cache no matter what happens.

Carter

On Aug 20, 2007, at 3:19 PM, Peter Van Epp wrote:

> On Mon, Aug 20, 2007 at 02:54:42PM -0400, Carter Bullard wrote:
>> Hey Peter,
>> Ok, I'm back.  So if it's timeout issues, that may help; let's modify some
>> timeout values to see if we get better results.  All the timeout constants
>> are in the file ./argus/ArgusModeler.h.  Why don't we lower the timeout
>> for the UDP traffic (it's generally classified as IP traffic).
>>
>> Set the ARGUS_IPTIMEOUT constant to 0.  That should definitely have
>> a significant effect.
>>
>> Let's do this with stock argus, but without .threads.
>>
>> Hope all is most excellent, and thanks for doing so much testing.
>>
>> Carter
>>
>
> 	No problem on the testing, it's in my best interest to get it running
> well :-).
> 	I just now got back from pointless meetings and restarted with
>
> ARGUS_FLOW_KEY="LAYER_3_MATRIX"
>
> 	The output now looks a little more reasonable (if one sided :-)):
>
> 07-08-20 12:03:50  e          ip        85.66.183.19           - >     142.58.214.209               1        0           145            0   UNK
> 07-08-20 12:03:50  e          ip      142.58.241.237           - >     209.73.191.242               1        0            60            0   UNK
> 07-08-20 12:03:50  e          ip         142.58.12.2           - >    122.152.181.170               5        0           300            0   UNK
> 07-08-20 12:03:50  e          ip        86.59.11.162           - >       142.58.111.1               3        0           192            0   UNK
> 07-08-20 12:03:50  e          ip       83.15.162.226           - >       199.60.7.184               1        0            72            0   UNK
> 07-08-20 12:03:50  e          ip       192.75.243.62           - >       64.92.199.73              14        0          1293            0   UNK
> 07-08-20 12:03:50  e          ip        142.58.111.1           - >       86.59.11.162               3        0           186            0   UNK
> 07-08-20 12:03:50  e          ip       65.94.166.238           - >      142.58.50.182               1        0            60            0   UNK
> 07-08-20 12:03:50  e          ip        142.58.103.1           - >     209.92.188.205               1        0           102            0   UNK
> 07-08-20 12:03:50  e          ip        199.60.7.184           - >      83.15.162.226               1        0            60            0   UNK
> 07-08-20 12:03:50  e          ip       61.199.200.86           - >      206.12.16.179               4        0           331            0   UNK
>
> 	It may not have been obvious from the relatively small sample, but
> all the output the first time was the identical flow; there was no other
> traffic visible. This time it looks more reasonable. Do you want me to
> let this run for a while and see what happens with memory, or switch to
> adjusting the timing values?
> 	Memory use over 5 minutes or so looks very reasonable so far:
>
> root     25451  7.3  0.2  15068 10416 ?        SLs  12:03   0:00 argus -d -P 560 -i eth0 -i eth1 -U 512 -m -F /scratch/argus.conf
> root     25453  0.0  0.0   3132   832 pts/0    S+   12:03   0:00 grep argus
> hcids:/scratch # !ps
> ps auxwwww | grep argus
> root     25451  7.1  0.7  36032 31280 ?        SLs  12:03   0:20 argus -d -P 560 -i eth0 -i eth1 -U 512 -m -F /scratch/argus.conf
> root     25455  0.0  0.0   3132   832 pts/0    S+   12:08   0:00 grep argus
>
> still growing relatively slowly:
>
> hcids:/scratch # ps auxwwww | grep argus
> root     25451  7.5  0.8  40144 35276 ?        SLs  12:03   1:04 argus -d -P 560 -i eth0 -i eth1 -U 512 -m -F /scratch/argus.conf
> root     25486  0.0  0.0   3132   832 pts/0    S+   12:17   0:00 grep argus
>
> Peter Van Epp / Operations and Technical Support
> Simon Fraser University, Burnaby, B.C. Canada
>
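
To put a number on "growing relatively slowly", the three ps samples quoted
above can be turned into a growth rate.  This is only a sketch: the sample
times are taken from the START column of each grep process, so they are good
to the minute at best, and the RSS column is assumed to be in KB (the usual
ps convention on Linux).  The helper names are made up for illustration.

```python
# (time of sample, argus RSS in KB) read off the ps output above.
samples = [("12:03", 10416), ("12:08", 31280), ("12:17", 35276)]


def minutes(hhmm: str) -> int:
    """Convert an HH:MM string to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def growth_rates(points):
    """RSS growth in KB per minute between each pair of consecutive samples."""
    rates = []
    for (t0, rss0), (t1, rss1) in zip(points, points[1:]):
        rates.append((rss1 - rss0) / (minutes(t1) - minutes(t0)))
    return rates


for rate in growth_rates(samples):
    print(f"{rate:.0f} KB/min")
```

With these samples the growth works out to roughly 4,170 KB/min over the
first five minutes and about 444 KB/min over the following nine, so the rate
drops off by close to a factor of ten after startup.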