Argus 3.0.6 and dnaclusters

Craig Merchant cmerchant at responsys.com
Thu Dec 13 20:11:42 EST 2012


So, I compiled 3.0.7 and made the changes to ArgusSource.c.

If I run it on eth0, I get the following (and CPU is low):

[root@ids01-dc1 bin]# argus -d -i eth0
argus[16323]: 14 Dec 12 00:21:26.394339 started
argus[16323]: 14 Dec 12 00:21:26.400452 ArgusGetInterfaceStatus: interface eth0 is up

I run Chris' modified version of pfdnacluster_master:

pfdnacluster_master -d -c 10 -r 0 -n 18 -m 0 -A 1 -i dna0

Snort sees traffic and so does tcpdump, although once a process has connected to a dnacluster:X@Y interface, stopping or killing the process makes that interface unavailable and pfdnacluster_master needs to be restarted.

If I run argus on dnacluster:10@18, I don't see the "interface X is up" message:

[root@ids01-dc1 bin]# ./argus -d -i dnacluster:10@18
argus[8153]: 14 Dec 12 00:40:13.189119 started

The CPU runs at 100%.  ra -S 10.0.0.1:561 doesn't return any flows.

I ran Chris' script and the poll counts came back zero:

-r--r--r-- 1 root root 0 Dec 14 01:07 9752-none.305
-r--r--r-- 1 root root 0 Dec 14 01:07 9768-none.306
-r--r--r-- 1 root root 0 Dec 14 01:07 9784-none.307
-r--r--r-- 1 root root 0 Dec 14 01:07 9800-none.308
-r--r--r-- 1 root root 0 Dec 14 01:07 9816-none.309

[root@ids01-dc1 ~]# ./check_script /proc/net/pf_ring/9816-none.309
2012-12-14 01:08:00 - Polls: 0, Polls/s: 0
2012-12-14 01:08:10 - Polls: 0, Polls/s: 0
2012-12-14 01:08:20 - Polls: 0, Polls/s: 0

Any ideas why Argus doesn't seem to be able to bring up the dnacluster interface?

Thanks!

Craig

-----Original Message-----
From: Chris Wakelin [mailto:c.d.wakelin at reading.ac.uk] 
Sent: Thursday, December 13, 2012 3:45 PM
To: Carter Bullard
Cc: Craig Merchant; Argus (argus-info at lists.andrew.cmu.edu)
Subject: Re: [ARGUS] Argus 3.0.6 and dnaclusters

I think it *is* selectable. PF_RING keeps a num_poll_calls count per process, and it's managing 1.2 million per second. What exactly it's counting, I'm not sure!

I've got a little shell script to track the rate:

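> #!/bin/sh
> # Print the total PF_RING "Num Poll Calls" count and its rate every 10 seconds.
> # Arguments: one or more files under /proc/net/pf_ring/.
> POLLS=0
> while true; do
>   DATE=$(date '+%Y-%m-%d %H:%M:%S')
>   # Sum the counter over all given files; prints 0 if no line matches.
>   REPORT=$(cat "$@" | gawk -F":" '/^Num Poll Calls/{polls+=$2}END{printf "%d,\n", polls}')
>   NPOLLS=${REPORT%%,*}
>   echo "$DATE - Polls: $NPOLLS, Polls/s: $(( (NPOLLS - POLLS) / 10 ))"
>   POLLS=$NPOLLS
>   sleep 10
> done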

It is used with something like "./pf_ring_polls.sh /proc/net/pf_ring/27702-none.41"

Craig, it would be interesting to know what you see?

Best Wishes,
Chris

On 13/12/12 23:36, Carter Bullard wrote:
> Hey Chris,
> argus should be getting its packets using the routine
> ArgusGetPacket(), reading packets from a "notselectable" interface,
> which starts on line 3823 in ArgusSource.c.
>
> So, argus should try to read 4 packets, using pcap_next_ex() if it's
> there, or pcap_dispatch() if it's not, and if we don't get any packets
> (pkts == 0), then we're supposed to call nanosleep(), for 25 mSecs,
> on line 3820.
>
> Interesting that it never hits this call?
> 
> Carter
> 
> 
> On Dec 13, 2012, at 5:37 PM, Chris Wakelin <c.d.wakelin at reading.ac.uk> wrote:
> 
>> Yes, it's a bug in DNA. I can't remember seeing a commit that claimed
>> to fix it; the last I saw on the topic was, I think, the developer's
>> reply to
>>
>> http://listgateway.unipi.it/pipermail/ntop-misc/2012-September/003279.html
>>
>> (and IPv6 is fine now BTW :-) )
>>
>> As far as I remember, Bro IDS manages to use select() without
>> hitting the problem, perhaps by adding empty select() calls with a
>> timeout:
>>
>> From Bro's IOSource.cc:
>>
>>>        if ( all_idle )
>>>                {
>>>                // Interesting: when all sources are dry, simply sleeping a
>>>                // bit *without* watching for any fd becoming ready may
>>>                // decrease CPU load. I guess that's because it allows
>>>                // the kernel's packet buffers to fill. - Robin
>>>                timeout.tv_sec = 0;
>>>                timeout.tv_usec = 20; // SELECT_TIMEOUT;
>>>                select(0, 0, 0, 0, &timeout);
>>>                }
>>
>> I had a go at doing that in ARGUS but it made no difference (perhaps 
>> I put it in the wrong place!).
>>
>> I'm happy to try things out on the test server, now I've updated 
>> everything (I'm using tcpreplay of a 10GB pcap over a 1Gb link from 
>> another machine using Intel e1000e cards and the time-limited DNA 
>> demo licence, so I can only test for 5 mins at a time).
>>
>> Best Wishes,
>> Chris
>>
>> On 13/12/12 22:11, Carter Bullard wrote:
>>> If I remember, the 100% CPU was a bug in the DNA code itself?
>>> Was there a resolution to that?
>>> If you would be a guinea pig, we can play around with it?
>>>
>>> Carter
>>>
>>>
>>> On Dec 13, 2012, at 4:30 PM, Chris Wakelin <c.d.wakelin at reading.ac.uk> wrote:
>>>
>>>> I've just tried 3.0.7.2 with the latest PF_RING svn (post v5.5.1) and
>>>> DNA clusters on a test machine. It looks like we do still need the
>>>> name change (adding "dna" to the list of interfaces that includes
>>>> "dag" and "napa"), and it still uses 100% of CPU, but otherwise it
>>>> appears to work.
>>>>
>>>> Best Wishes,
>>>> Chris
>>
> 

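For reference, the read path Carter describes in the quoted message above (try to read up to 4 packets, then nanosleep() for 25 ms if none arrived) might look roughly like the sketch below. This is not the code from ArgusSource.c: poll_batch(), read_one_fn, fake_source and fake_read() are hypothetical names, and the callback stands in for whichever of pcap_next_ex() or pcap_dispatch() is in use.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L      /* for nanosleep() */
#endif
#include <stddef.h>
#include <time.h>

#define BATCH_SIZE 4                 /* packets to try per pass, per the description */
#define IDLE_SLEEP_NS 25000000L      /* 25 ms */

/* Hypothetical callback: returns 1 if it read a packet, 0 if none was ready. */
typedef int (*read_one_fn)(void *ctx);

/* One pass: read up to BATCH_SIZE packets; sleep only if the pass read nothing. */
int poll_batch(read_one_fn read_one, void *ctx)
{
    int pkts = 0;

    for (int i = 0; i < BATCH_SIZE; i++) {
        if (read_one(ctx) <= 0)
            break;                   /* nothing more available right now */
        pkts++;
    }
    if (pkts == 0) {
        struct timespec ts = { 0, IDLE_SLEEP_NS };
        nanosleep(&ts, NULL);        /* give the CPU back while the source is dry */
    }
    return pkts;
}

/* Stand-in packet source (an assumption, for trying the loop without libpcap). */
struct fake_source { int remaining; };

int fake_read(void *ctx)
{
    struct fake_source *src = ctx;

    if (src->remaining <= 0)
        return 0;
    src->remaining--;
    return 1;
}
```

With a fake_source holding six packets, three successive calls to poll_batch() return 4, 2 and 0, and only the third call sleeps. A loop of this shape yields the CPU only when a pass reads nothing, so never reaching the nanosleep() call would be consistent with the 100% CPU reported.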

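The name change mentioned at the bottom of the quoted thread (adding "dna" to the list that already includes "dag" and "napa") could be pictured as a name check like the one below. This is a sketch under assumptions: the thread does not show the actual code in ArgusSource.c, so the substring match, the function name uses_polled_read() and the array polled_names are all hypothetical, as is the idea that the list is what selects the polling read path.

```c
#include <stddef.h>
#include <string.h>

/* "dag" and "napa" are named in the thread; "dna" is the addition. */
static const char *const polled_names[] = { "dag", "napa", "dna" };

/* Returns 1 if the interface name contains one of the listed substrings,
 * 0 otherwise (including for a NULL name). */
int uses_polled_read(const char *ifname)
{
    size_t n = sizeof polled_names / sizeof polled_names[0];

    if (ifname == NULL)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (strstr(ifname, polled_names[i]) != NULL)
            return 1;
    }
    return 0;
}
```

Under this sketch, "dna0" and any name beginning with "dnacluster:" match, while "eth0" does not.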
-- 
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
Christopher Wakelin,                           c.d.wakelin at reading.ac.uk
IT Services Centre, The University of Reading,  Tel: +44 (0)118 378 8439
Whiteknights, Reading, RG6 2AF, UK              Fax: +44 (0)118 975 3094


