Big O Impact of Filters

Jason dn1nj4 at gmail.com
Mon May 19 02:40:27 EDT 2014


#0  0x00007ffff7349e08 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff734b496 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff734df95 in malloc () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x000000000044c276 in ArgusMalloc (bytes=24328) at ./argus_util.c:21779
#4  0x000000000049388a in ArgusSortQueue (sorter=0x1c7feb40, queue=0xfdf250) at ./argus_client.c:15390
#5  0x0000000000404820 in RaParseComplete (sig=0) at ./racluster.c:277
#6  0x0000000000407cf4 in main (argc=66, argv=0x7fffffffd828) at ./argus_main.c:390


On Fri, May 16, 2014 at 10:20 AM, Carter Bullard <carter at qosient.com> wrote:

> Hey Jason,
> Thanks for testing this.  Any chance you can run it under gdb to see where
> it's running into trouble ???  To compile with symbols:
>
>    % touch .devel
>    % ./configure
>    % make clean
>    % make
>
> If it breaks with the same error, then type
>
>    (gdb) where
>
> That should be very helpful !!!!
>
> Carter
>
> On May 16, 2014, at 8:42 AM, Jason <dn1nj4 at gmail.com> wrote:
>
> I'm testing 3.0.7.27 now.  Duplicating the original test in this
> thread produced much more reasonable results.  When I run against a larger
> test data set though (around 40 input files), I am getting the following
> error:
>
> *** glibc detected *** racluster: corrupted double-linked list:
> 0x000000001e900470 ***
>
> The error is the same each time I run the test.
>
>
>
> On Thu, May 15, 2014 at 10:40 PM, Carter Bullard <carter at qosient.com> wrote:
>
>> Hey Jason,
>> So I uploaded argus-clients-3.0.7.27, which has a complete fix
>> for the problem you reported.  FYI, the problem was that we were
>> calling the queue timeout management routines on every flow.
>> Interestingly, that really crushed the routine when the idle
>> timers and status timers were both turned on, were not of the
>> same order of magnitude, and a specific filter got a large
>> number of hits in a short period of time...
>>
>> That of course is / was really stupid, not really a bug, but kind of a
>> bug.
>>
>> OK, the fix that is now in has independent logic for managing the idle
>> and status timeouts. Each filter entry gets a complete aggregation
>> engine and processing queue, so we can use an efficient idle timeout
>> processing strategy, but we need to process the status timeouts
>> independently, which we now do once every second.
>>
>> Hopefully things are working better for you now.
>>
>> Carter
>>
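A rough sketch of the difference described above, not the actual racluster
code. It assumes a simplified cache of flows, counts how many timer checks
each strategy performs, and uses hypothetical function names
(per_record_scan, split_timers) chosen only for illustration. Times are
integer milliseconds.

```python
from collections import OrderedDict

def per_record_scan(records, idle, status):
    """Old behaviour (assumed): check every cached flow on every record."""
    cache = {}                      # key -> [last_seen, last_report]
    checks = 0
    for t, key in records:
        for k in list(cache):
            checks += 1
            last_seen, last_report = cache[k]
            if t - last_seen >= idle:
                del cache[k]        # idle flow is flushed
            elif t - last_report >= status:
                cache[k][1] = t     # status record would be emitted here
        if key in cache:
            cache[key][0] = t
        else:
            cache[key] = [t, t]
    return checks

def split_timers(records, idle, status, status_interval=1000):
    """Fixed behaviour (assumed): idle and status handled independently."""
    cache = OrderedDict()           # ordered by last_seen, oldest first
    checks = 0
    next_status = None
    for t, key in records:
        # Idle timeouts: only the oldest entries can have expired.
        while cache:
            checks += 1
            k, (last_seen, _) = next(iter(cache.items()))
            if t - last_seen < idle:
                break
            del cache[k]
        # Status timeouts: one full pass, at most once per interval.
        if next_status is None:
            next_status = t + status_interval
        if t >= next_status:
            for entry in cache.values():
                checks += 1
                if t - entry[1] >= status:
                    entry[1] = t    # status record would be emitted here
            next_status = t + status_interval
        if key in cache:
            cache[key][0] = t
            cache.move_to_end(key)  # keep the queue ordered by last_seen
        else:
            cache[key] = [t, t]
    return checks

# 2000 distinct flows, one per millisecond, with idle=10s and status=600s.
records = [(i, f"flow{i}") for i in range(2000)]
naive = per_record_scan(records, idle=10_000, status=600_000)
split = split_timers(records, idle=10_000, status=600_000)
print(naive, split)
```

The per-record scan grows with the square of the number of cached flows,
while the split version does roughly one idle check per record plus one
pass over the cache per second.
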
>> On May 15, 2014, at 11:30 AM, Carter Bullard <carter at qosient.com> wrote:
>>
>> Hey Jason,
>> Could you give this version of racluster() a run to see if it does
>> what you want ???  The principal difference is that the output of
>> this new racluster() will have records a bit more out of order
>> than the other version.
>>
>> With streaming data, you may not get status reports in a timely
>> fashion (say, within 0.25 seconds of the status timer expiring), but
>> you will get correct status record reporting, driven by the idle
>> timeout period.  I’ll improve this behavior later today.
>>
>> Sorry for any inconvenience, and thanks for pushing on this !!!!
>>
>> Carter
>>
>> <racluster.c>
>>
>> On May 15, 2014, at 10:20 AM, Carter Bullard <carter at qosient.com> wrote:
>>
>> Hey Jason,
>> Found the problem, and it's a poor design assumption on my part.
>> It's a kind of thrash between the status timer and the idle timer.
>> This does not affect rabins() or radium(), just racluster().
>>
>> Fixing it now.
>>
>> Carter
>>
>> On May 14, 2014, at 5:53 PM, Jason <dn1nj4 at gmail.com> wrote:
>>
>> Hi Carter,
>>
>> So I asked a very similar question last year (
>> http://comments.gmane.org/gmane.network.argus/9110), but I can't seem to
>> find a response.  I apologize if I'm just missing something or have just
>> forgotten.
>>
>> I am trying once again to understand why there is such a significant
>> impact on the length of time it takes to run racluster when leveraging
>> filters.  Here is the racluster.conf file I am testing:
>>
>> filter="udp and port domain" model="saddr daddr proto sport dport" status=600 idle=10
>> filter="udp" model="saddr daddr proto sport dport" status=600 idle=60
>> filter="" model="saddr daddr proto sport dport" status=600 idle=600
>>
>> And here are two runs against a single argus file.  The only difference
>> is whether or not I'm using the racluster.conf:
>>
>> $ time racluster -f racluster.conf -r infile.bin -w outfile.bin -M rmon -u -c "," -m saddr proto sport dport -L0 -Z s -s stime saddr proto sport dport sbytes runtime dbytes trans state - not arp
>>
>> real    2m42.935s
>> user    2m39.274s
>> sys     0m3.288s
>>
>> $ time racluster -r infile.bin -w outfile.bin -M rmon -u -c "," -m saddr proto sport dport -L0 -Z s -s stime saddr proto sport dport sbytes runtime dbytes trans state - not arp
>>
>> real    0m1.054s
>> user    0m0.944s
>> sys     0m0.108s
>>
>> Why does the filtered run take over 150 times longer?
>>
>> Thanks!
>> Jason
>>
>>
>>
>>
>>
>
>