Argus 3.0.6 and dnaclusters

Thu Dec 13 18:45:12 EST 2012

I think it *is* selectable. PF_RING keeps a num_poll_calls count per
process, and it's managing 1.2 million per second. Exactly what it's
counting, I'm not sure!

I've got a little shell script to track the rate:

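> #!/bin/sh
> # Print the PF_RING poll-call total and the rate over each 10 s interval.
> # Arguments: one or more /proc/net/pf_ring/ per-process files.
> count_polls() {
>   gawk -F":" '/^Num Poll Calls/{polls+=$2}END{print polls+0}' "$@"
> }
> # Take a baseline first so the first rate printed is not the running total.
> POLLS=$(count_polls "$@")
> while true; do
>   sleep 10
>   DATE=$(date '+%Y-%m-%d %H:%M:%S')
>   NPOLLS=$(count_polls "$@")
>   echo "$DATE - Polls: $NPOLLS, Polls/s: $(( (NPOLLS - POLLS) / 10 ))"
>   POLLS=$NPOLLS
> done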

It's run with something like "./pf_ring_polls.sh
/proc/net/pf_ring/27702-none.41"

Craig, it would be interesting to know what you see.

Best Wishes,
Chris

On 13/12/12 23:36, Carter Bullard wrote:
> Hey Chris,
> argus should be getting its packets using the routine ArgusGetPacket(), reading
> packets from a "notselectable" interface, which starts on line 3823 in ArgusSource.c.
> 
> So, argus should try to read 4 packets, using pcap_next_ex() if it's there or
> pcap_dispatch() if it's not, and if we don't get any packets (pkts == 0), then
> we're supposed to call nanosleep(), for 25 ms, on line 3820.
> 
> Interesting that it never hits this call ?
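For anyone following along, here is a rough sketch of the read-then-back-off
pattern Carter describes. It is not the code from ArgusSource.c: poll_once(),
sleep_msec(), read_batch_fn and the fake_source stand-in are all names made up
for illustration, and the callback takes the place of the real
pcap_next_ex()/pcap_dispatch() calls. Only the batch size of 4, the
pkts == 0 test and the 25 ms nanosleep() come from the description above.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical callback: tries to read up to `max` packets and returns
 * how many it got (0 when the source is dry). */
typedef int (*read_batch_fn)(void *ctx, int max);

/* Sleep for `msec` milliseconds, resuming if a signal interrupts us. */
void sleep_msec(long msec)
{
    struct timespec ts;
    ts.tv_sec  = msec / 1000;
    ts.tv_nsec = (msec % 1000) * 1000000L;
    while (nanosleep(&ts, &ts) == -1 && errno == EINTR)
        ;  /* ts now holds the remaining time; go round again */
}

/* One pass of the loop: read a batch and back off if nothing arrived.
 * Returns the packet count; *slept is set to 1 if we backed off. */
int poll_once(read_batch_fn read_batch, void *ctx, int *slept)
{
    int pkts = read_batch(ctx, 4);  /* batch of 4, as described */
    *slept = 0;
    if (pkts == 0) {
        sleep_msec(25);             /* 25 ms back-off, as described */
        *slept = 1;
    }
    return pkts;
}

/* Stand-in packet source for trying the loop without a capture device
 * (purely illustrative): hands out `remaining` packets, then runs dry. */
struct fake_source { int remaining; };

int fake_read_batch(void *ctx, int max)
{
    struct fake_source *src = ctx;
    int n = src->remaining < max ? src->remaining : max;
    src->remaining -= n;
    return n;
}
```

If the source never reports an empty batch, the back-off branch is never
taken and the loop spins, which would be consistent with the 100% CPU we see.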
> 
> Carter
> 
> 
> On Dec 13, 2012, at 5:37 PM, Chris Wakelin <c.d.wakelin at reading.ac.uk> wrote:
> 
>> Yes, it's a bug in DNA. I can't remember seeing a commit that claimed to
>> fix it; the last thing I saw on the topic, I think, was the developer's reply to
>>
>> http://listgateway.unipi.it/pipermail/ntop-misc/2012-September/003279.html
>>
>> (and IPv6 is fine now BTW :-) )
>>
>> As far as I remember, Bro IDS somehow manages to use select() without
>> hitting the problem, perhaps by adding empty select() calls with a
>> timeout:
>>
>> From Bro's IOSource.cc:
>>
>>>        if ( all_idle )
>>>                {
>>>                // Interesting: when all sources are dry, simply sleeping a
>>>                // bit *without* watching for any fd becoming ready may
>>>                // decrease CPU load. I guess that's because it allows
>>>                // the kernel's packet buffers to fill. - Robin
>>>                timeout.tv_sec = 0;
>>>                timeout.tv_usec = 20; // SELECT_TIMEOUT;
>>>                select(0, 0, 0, 0, &timeout);
>>>                }
>>
>> I had a go at doing that in ARGUS but it made no difference (perhaps I
>> put it in the wrong place!).
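For reference, the idiom in that Bro snippet boiled down to a standalone
helper looks something like the sketch below. idle_select_sleep() and
timed_idle_sleep_usec() are names invented here, not anything in Bro or
Argus; the second one is only there so the actual sleep length can be
measured on a given kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

/* Sleep for `usec` microseconds by calling select() with no descriptors,
 * as the Bro snippet does. Returns select()'s result: 0 on timeout,
 * -1 on error (for example EINTR). */
int idle_select_sleep(long usec)
{
    struct timeval timeout;
    timeout.tv_sec  = usec / 1000000L;
    timeout.tv_usec = usec % 1000000L;
    return select(0, NULL, NULL, NULL, &timeout);
}

/* Wall-clock length of one idle sleep, in microseconds. */
long timed_idle_sleep_usec(long usec)
{
    struct timespec start, end;
    long long ns;

    clock_gettime(CLOCK_MONOTONIC, &start);
    idle_select_sleep(usec);
    clock_gettime(CLOCK_MONOTONIC, &end);

    ns = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
       + (end.tv_nsec - start.tv_nsec);
    return (long)(ns / 1000);
}
```

Calling timed_idle_sleep_usec(20) a few times shows how long a nominal
20 microsecond timeout really lasts; I would expect it to vary with the
kernel's timer granularity, so it is worth measuring rather than assuming.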
>>
>> I'm happy to try things out on the test server, now that I've updated
>> everything. I'm replaying a 10GB pcap with tcpreplay over a 1Gb link from
>> another machine, using Intel e1000e cards and the time-limited DNA demo
>> licence, so I can only test for 5 minutes at a time.
>>
>> Best Wishes,
>> Chris
>>
>> On 13/12/12 22:11, Carter Bullard wrote:
>>> If I remember correctly, the 100% CPU was a bug in the DNA code itself?
>>> Was there a resolution to that?
>>> If you would be a guinea pig, we can play around with it.
>>>
>>> Carter 
>>>
>>>
>>> On Dec 13, 2012, at 4:30 PM, Chris Wakelin <c.d.wakelin at reading.ac.uk> wrote:
>>>
>>>> I've just tried 3.0.7.2 with latest PF_RING svn (post v5.5.1) and DNA
>>>> clusters on a test machine. It looks like we do still need the name
>>>> change (added "dna" to the list of interfaces that includes "dag" and
>>>> "napa") and it still uses 100% of CPU, but otherwise appears to work.
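To illustrate what the name change amounts to, here is a minimal sketch of a
name-based check. The function name, the table and the substring match are
all assumptions made for illustration; the real Argus source may store and
compare these names differently. Only the three fragments "dag", "napa" and
"dna" come from the message above, and "dnacluster:10@0" is just an example
of the kind of device name I have in mind.

```c
#include <stddef.h>
#include <string.h>

/* Device-name fragments treated as "not selectable". "dag" and "napa"
 * are the existing entries mentioned above; "dna" is the addition.
 * The table layout and substring matching are assumptions. */
static const char *const nonselectable_names[] = { "dag", "napa", "dna" };

/* Returns 1 if `ifname` contains any listed fragment, otherwise 0. */
int is_nonselectable(const char *ifname)
{
    size_t n = sizeof(nonselectable_names) / sizeof(nonselectable_names[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if (strstr(ifname, nonselectable_names[i]) != NULL)
            return 1;
    }
    return 0;
}
```

With this sketch, is_nonselectable("dnacluster:10@0") and
is_nonselectable("dag0") return 1, while is_nonselectable("eth0") returns 0.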
>>>>
>>>> Best Wishes,
>>>> Chris

-- 
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
Christopher Wakelin,                           c.d.wakelin at reading.ac.uk
IT Services Centre, The University of Reading,  Tel: +44 (0)118 378 8439
Whiteknights, Reading, RG6 2AF, UK              Fax: +44 (0)118 975 3094