Big O Impact of Filters

Carter Bullard carter at qosient.com
Mon May 19 11:27:07 EDT 2014


Hey Jason,
Sorry for the barrage of email.
Could you try this patch?  It seems that it may help a little here.

==== //depot/argus/clients/clients/racluster.c#87 - /Volumes/Users/carter/argus/clients/clients/racluster.c ====
308c308
<          if (ArgusSorter != NULL)
---
>          if (ArgusSorter != NULL) {
309a310,311
>             ArgusSorter = NULL;
>          }
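
The patch above clears the global ArgusSorter pointer inside the guarded block. The line between the two hunks is not shown in the diff, so the sketch below only assumes that it releases the sorter. All names here (struct sorter, global_sorter, sorter_create, sorter_release) are hypothetical stand-ins, not the argus code. It illustrates the general C idiom: set the pointer to NULL after freeing so that a second caller, such as another thread's shutdown path, sees NULL and skips the release.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a sorter object; NOT the argus structure. */
struct sorter {
    int *keys;
};

/* Hypothetical global, playing the role of a shared sorter pointer. */
static struct sorter *global_sorter = NULL;

/* Allocate the sorter; returns 0 on success, -1 on allocation failure. */
static int sorter_create(void)
{
    global_sorter = malloc(sizeof(*global_sorter));
    if (global_sorter == NULL)
        return -1;
    global_sorter->keys = malloc(16 * sizeof(int));
    if (global_sorter->keys == NULL) {
        free(global_sorter);
        global_sorter = NULL;
        return -1;
    }
    return 0;
}

/* Release the sorter at most once.
 * Returns 1 if something was released, 0 if it was already gone. */
static int sorter_release(void)
{
    if (global_sorter != NULL) {
        free(global_sorter->keys);
        free(global_sorter);
        global_sorter = NULL;   /* without this, a second call would free again */
        return 1;
    }
    return 0;
}
```

Calling sorter_release() twice in a row is then harmless: the first call frees the object and the second finds NULL. This guard alone is not thread-safe. If two threads can reach the release path at the same time, the check and the free would also need to be protected by a lock.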


Carter


On May 19, 2014, at 11:21 AM, Carter Bullard <carter at qosient.com> wrote:

> Hey Jason,
> If you run with debug level 1, you’ll see the files as they are being
> processed, and that can show you which file is the culprit.
> If it is the last one, which it looks like it is, it may be that one
> of the threads has shut down or deleted a construct that is still needed,
> like the memory manager.  This is a threads issue, so I’m going down
> that path to solve this problem.
> 
> If you see anything that suggests otherwise, like it's not the last file,
> send a note, if you have the time …
> 
> Thanks for all the help !!!
> Carter
> 
> On May 19, 2014, at 9:25 AM, Carter Bullard <carter at qosient.com> wrote:
> 
>> Hey Jason,
>> Is it always the same file?  Any chance it would fail on just that file?
>> If you run with "-M ind", does the problem go away?  This option forces aggregation to be limited to each individual file...
>> 
>> Carter
>> 
>> Carter Bullard, QoSient, LLC
>> 150 E. 57th Street Suite 12D
>> New York, New York 10022
>> +1 212 588-9133 Phone
>> +1 212 588-9134 Fax
>> 
>> On May 19, 2014, at 2:40 AM, Jason <dn1nj4 at gmail.com> wrote:
>> 
>>> #0  0x00007ffff7349e08 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>>> #1  0x00007ffff734b496 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
>>> #2  0x00007ffff734df95 in malloc () from /lib/x86_64-linux-gnu/libc.so.6
>>> #3  0x000000000044c276 in ArgusMalloc (bytes=24328) at ./argus_util.c:21779
>>> #4  0x000000000049388a in ArgusSortQueue (sorter=0x1c7feb40, queue=0xfdf250) at ./argus_client.c:15390
>>> #5  0x0000000000404820 in RaParseComplete (sig=0) at ./racluster.c:277
>>> #6  0x0000000000407cf4 in main (argc=66, argv=0x7fffffffd828) at ./argus_main.c:390
>>> 
>>> 
>>> On Fri, May 16, 2014 at 10:20 AM, Carter Bullard <carter at qosient.com> wrote:
>>> Hey Jason,
>>> Thanks for testing this.  Any chance you can run it under gdb to see where it's
>>> running into trouble?  To compile with symbols:
>>> 
>>>    % touch .devel
>>>    % ./configure
>>>    % make clean
>>>    % make
>>> 
>>> If it breaks with the same error, then type
>>> 
>>>    (gdb) where
>>> 
>>> That should be very helpful !!!!
>>> 
>>> Carter
>>> 
>>> On May 16, 2014, at 8:42 AM, Jason <dn1nj4 at gmail.com> wrote:
>>> 
>>>> I'm testing 3.0.7.27 now.  Duplicating the original test in this thread produced much more reasonable results.  When I run against a larger test data set, though (around 40 input files), I get the following error:
>>>> 
>>>> *** glibc detected *** racluster: corrupted double-linked list: 0x000000001e900470 ***
>>>> 
>>>> The error is the same each time I run the test.
>>>> 
>>>> 
>>>> 
>>>> On Thu, May 15, 2014 at 10:40 PM, Carter Bullard <carter at qosient.com> wrote:
>>>> Hey Jason,
>>>> So I uploaded argus-clients-3.0.7.27 that has a complete fix in
>>>> for the problem you reported.  FYI, the problem was that we were
>>>> calling the queue timeout management routines on every flow,
>>>> which, interestingly, really crushed the routine when the idle
>>>> timers and status timers were both turned on, but not in the
>>>> same order of magnitude, and a specific filter got a large
>>>> number of hits in a short period of time...
>>>> 
>>>> That of course is / was really stupid, not really a bug, but kind of a bug.
>>>> 
>>>> OK, the fix that is now in has independent logic for managing the idle
>>>> and status timeouts. Each filter entry gets a complete aggregation
>>>> engine and processing queue, so we can use an efficient idle timeout
>>>> processing strategy, but we need to process the status timeouts
>>>> independently, which we now do once every second.
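
A minimal sketch of the "once every second" idea described above. The names (struct engine, on_record, scan_status_timeouts) are hypothetical and not taken from the argus source, and the real scan of the processing queue is reduced to a counter. The point is only that the per-record path does a cheap time comparison and the expensive status scan runs at most once per second, however many records a filter matches.

```c
#include <time.h>

/* Hypothetical per-filter engine state; illustrative only. */
struct engine {
    time_t last_status_scan;   /* when status timeouts were last checked */
    unsigned long scans;       /* how many full status scans have run    */
};

/* Placeholder for walking the queue and emitting status records. */
static void scan_status_timeouts(struct engine *e, time_t now)
{
    e->last_status_scan = now;
    e->scans++;
}

/* Called for every flow record. The cost per record is one comparison;
 * the full scan is triggered only when at least one second has passed. */
static void on_record(struct engine *e, time_t now)
{
    if (difftime(now, e->last_status_scan) >= 1.0)
        scan_status_timeouts(e, now);
}
```

With this shape, a burst of a thousand records arriving within the same second triggers one scan instead of a thousand, which is the kind of per-flow overhead the message describes removing.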
>>>> 
>>>> Hopefully things are working better for you now.
>>>> 
>>>> Carter
>>>> 
>>>> On May 15, 2014, at 11:30 AM, Carter Bullard <carter at qosient.com> wrote:
>>>> 
>>>>> Hey Jason,
>>>>> Could you give this version of racluster() a run to see if it does
>>>>> what you want?  The principal difference is that the output of
>>>>> this new racluster() will have records a bit more out of order
>>>>> than the other version.
>>>>> 
>>>>> With streaming data, you may not get status reports in a timely manner
>>>>> (like within 0.25 seconds of the status timer expiration) but you will
>>>>> get correct status record reporting, driven by the idle timeout
>>>>> period.  I’ll improve this behavior later today.
>>>>> 
>>>>> Sorry for any inconvenience, and thanks for pushing on this !!!!
>>>>> 
>>>>> Carter
>>>>> 
>>>>> <racluster.c>
>>>>> 
>>>>> On May 15, 2014, at 10:20 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>> 
>>>>>> Hey Jason,
>>>>>> Found the problem, and it's a poor design assumption on my part.
>>>>>> It's a kind of thrash between the status timer and the idle timer.
>>>>>> This does not affect rabins() or radium(), just racluster().
>>>>>> 
>>>>>> Fixing it now.
>>>>>> 
>>>>>> Carter
>>>>>> 
>>>>>> On May 14, 2014, at 5:53 PM, Jason <dn1nj4 at gmail.com> wrote:
>>>>>> 
>>>>>>> Hi Carter,
>>>>>>> 
>>>>>>> So I asked a very similar question last year (http://comments.gmane.org/gmane.network.argus/9110), but I can't seem to find a response.  I apologize if I'm just missing something or have just forgotten.
>>>>>>> 
>>>>>>> I am trying once again to understand why there is such a significant impact on the length of time it takes to run racluster when leveraging filters.  Here is the racluster.conf file I am testing: 
>>>>>>> 
>>>>>>> filter="udp and port domain" model="saddr daddr proto sport dport" status=600 idle=10
>>>>>>> filter="udp" model="saddr daddr proto sport dport" status=600 idle=60
>>>>>>> filter="" model="saddr daddr proto sport dport" status=600 idle=600
>>>>>>> 
>>>>>>> And here are two runs against a single argus file.  The only difference is whether or not I'm using the racluster.conf:
>>>>>>> 
>>>>>>> $ time racluster -f racluster.conf -r infile.bin -w outfile.bin -M rmon -u -c "," -m saddr proto sport dport -L0 -Z s -s stime saddr proto sport dport sbytes runtime dbytes trans state - not arp 
>>>>>>>  
>>>>>>> real    2m42.935s 
>>>>>>> user    2m39.274s 
>>>>>>> sys     0m3.288s 
>>>>>>>  
>>>>>>> $ time racluster -r infile.bin -w outfile.bin -M rmon -u -c "," -m saddr proto sport dport -L0 -Z s -s stime saddr proto sport dport sbytes runtime dbytes trans state - not arp 
>>>>>>>  
>>>>>>> real    0m1.054s 
>>>>>>> user    0m0.944s 
>>>>>>> sys     0m0.108s
>>>>>>> 
>>>>>>> Why does the filtered option take exponentially longer?
>>>>>>> 
>>>>>>> Thanks!
>>>>>>> Jason
