Multi-Instanced Argus

Craig Merchant cmerchant at responsys.com
Fri Mar 28 16:14:54 EDT 2014


That sounds possible... Carter is probably the better person to ask.  I haven't had to recompile Argus for almost a year, so all I can say is that it's working fine for me with an older version.

Carter...  what would it take to get DNA and libzero interfaces officially supported?

Thx.

Craig

-----Original Message-----
From: Reynolds, Jeffrey [mailto:JReynolds at utdallas.edu] 
Sent: Friday, March 28, 2014 12:43 PM
To: Carter Bullard; Craig Merchant
Cc: Argus
Subject: Re: [ARGUS] Multi-Instanced Argus

OK, I'm almost sure there are issues with Argus and the code I've modified.  To rehash, I grabbed argus-3.0.7.5 and changed the following line in argus/ArgusSource.c (line 4331):

- if ((strstr(device->name, "dag")) || (strstr(device->name, "napa"))) {

+ if (strstr(device->name, "dag") || strstr(device->name, "nap") ||
+     strstr(device->name, "dna") || (strstr(device->name, "eth") && strstr(device->name, "@"))) {

I've also tried:

+ if ((strstr(device->name, "dag")) || (strstr(device->name, "nap")) ||
+     (strstr(device->name, "dna")) || (strstr(device->name, "eth") && strstr(device->name, "@"))) {


I wasn't sure whether each strstr call had to be enclosed in its own set of parens.  Anyway, in both instances, when I try to run Argus I wind up with a 128-byte file.  For example:

$ argus -i dna0 -w /var/data/argus-out -s 1500
(wait about 20 seconds)
$ ls -l /var/data
-rw-r--r--. 1 argus argus 128 Mar 28 07:46 argus-out
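
On the parenthesization question: in C, wrapping each strstr() call in its own parentheses does not change how the condition evaluates, so the two variants behave the same. What the patch does change is which device names satisfy the condition. The sketch below is a Python stand-in for the original and patched conditions, with substring tests in place of strstr(). The function names and the `eth3@2` device name are made up for illustration, and the sketch says nothing about what Argus does with a device once the condition is true.

```python
def matches_original_check(name: str) -> bool:
    # Stand-in for: strstr(name, "dag") || strstr(name, "napa")
    return "dag" in name or "napa" in name


def matches_patched_check(name: str) -> bool:
    # Stand-in for the patched condition quoted in this message.
    return (
        "dag" in name
        or "nap" in name
        or "dna" in name
        or ("eth" in name and "@" in name)
    )


if __name__ == "__main__":
    # eth3@2 is a hypothetical name used only to exercise the "eth" + "@" branch.
    for device in ("dag0", "dna0", "dnacluster:10@28", "eth3@2", "em1"):
        print(device, matches_original_check(device), matches_patched_check(device))
```

Under this reading, `dna0` and the `dnacluster:...` queue names match only the patched condition, and `em1` matches neither.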

When I run with the vanilla drivers, and my interface is not "dna0" but "em1", then I get better results.

# rmmod ixgbe
# modprobe ixgbe #pulling from /lib/modules/`uname -r`

$ rm argus-out
rm: remove regular file `argus-out'? y
$ argus -i em1 -w /var/data/argus-out -s 1500
(wait about 20 seconds)
$ ls -l /var/data
-rw-r--r--. 1 argus argus 2392260 Mar 28 07:46 argus-out


The real kicker seems to be in /var/log/messages.  When running argus on
em1 with the original ixgbe driver, I get the following output in
/var/log/messages:


Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.865660 started
Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.882755 started
Mar 28 05:14:52 argus kernel: device em1 entered promiscuous mode
Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.932220 ArgusGetInterfaceStatus: interface em1 is up
Mar 28 05:15:18 argus argus[23142]: 28 Mar 14 05:15:18.812342 stopped


However, when running with the DNA driver, the output is as follows:

Mar 28 08:33:16 argus argus[23915]: 28 Mar 14 08:33:16.967530 started
Mar 28 08:33:16 argus argus[23915]: 28 Mar 14 08:33:16.985055 started
Mar 28 08:33:50 argus argus[23915]: 28 Mar 14 08:33:50.667199 stopped


Now, the interface is in promiscuous mode, and I can see the received-packet count rising considerably just by running ifconfig a few times.  I think that, for whatever reason, the function in Argus that outputs the "ArgusGetInterfaceStatus" line isn't correctly interpreting dna0 as an appropriate interface.
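
The difference between the two log excerpts can be checked mechanically on a longer /var/log/messages. The sketch below is only an illustration: the function name is hypothetical, the sample lines are copied from the excerpts in this message, and a missing status line shows only that Argus never logged it, not why.

```python
import re

# Matches the status line Argus logged for em1 in the excerpt above.
STATUS_RE = re.compile(r"ArgusGetInterfaceStatus: interface (\S+) is up")


def interfaces_reported_up(log_lines):
    """Return the set of interface names that appear in an 'is up' status line."""
    found = set()
    for line in log_lines:
        match = STATUS_RE.search(line)
        if match:
            found.add(match.group(1))
    return found


# Sample lines taken from the two runs quoted in this message.
EM1_RUN = [
    "Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.865660 started",
    "Mar 28 05:14:52 argus kernel: device em1 entered promiscuous mode",
    "Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.932220 "
    "ArgusGetInterfaceStatus: interface em1 is up",
]
DNA0_RUN = [
    "Mar 28 08:33:16 argus argus[23915]: 28 Mar 14 08:33:16.967530 started",
    "Mar 28 08:33:50 argus argus[23915]: 28 Mar 14 08:33:50.667199 stopped",
]

if __name__ == "__main__":
    print("em1 run:", interfaces_reported_up(EM1_RUN))
    print("dna0 run:", interfaces_reported_up(DNA0_RUN))
```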

Does any of this sound remotely possible?

-Jeff



On 3/27/14, 7:23 PM, "Craig Merchant" <cmerchant at responsys.com> wrote:

>Hey, Jeffrey...
>
>The configuration questions for the pf_ring and ixgbe drivers may be 
>better answered on the ntop forums...  But I'll do my best.  Here is 
>how I load the drivers:
>
>    insmod /lib/modules/2.6.32-220.el6.x86_64/updates/pf_ring.ko
>    /sbin/modprobe ixgbe MQ=0,0 RSS=1,1 num_rx_slots=32768
>
>    ifconfig dna0 up promisc
>    ethtool -K dna0 tso off
>    ethtool -K dna0 gro off
>    ethtool -K dna0 lro off
>    ethtool -K dna0 gso off
>    ethtool -G dna0 tx 32768
>    ethtool -G dna0 rx 32768
>
>One thing I'm not clear on from your config is why you are using 
>pfdnacluster_master at all...  That daemon is designed to split up 
>flows and/or make copies of them to distribute to other applications.  
>I don't think it's meant to aggregate two interfaces into one stream.  
>Normally it's run with a -n parameter to tell it how many queues you 
>want traffic divided up into.  We use:
>
>pfdnacluster_master -d -c 10 -n 28,1 -m 0 -i dna0
>
>In this case, -n says "divide up one copy of the traffic into 28 queues"
>and "create one copy of all the traffic on the last queue".  The apps 
>accessing the first 28 queues (Snort) would connect to dnacluster:10@0 -
>dnacluster:10@27.   Argus connects to dnacluster:10@28 and would see a
>copy of all of the traffic.
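
The queue numbering described above can be written out as a small sketch. It assumes only what Craig describes for `-c 10 -n 28,1`: 28 balanced queues numbered from 0, followed by one queue carrying a full copy, with names in the form `dnacluster:<cluster>@<queue>`. The function name is hypothetical and this is not a reimplementation of pfdnacluster_master.

```python
from typing import List, Tuple


def cluster_queue_names(cluster_id: int, balanced: int, copies: int) -> Tuple[List[str], List[str]]:
    """Return (balanced queue names, full-copy queue names) for one cluster.

    Assumes balanced queues are numbered first, starting at 0, and the
    full-copy queues follow directly after them, as in the description above.
    """
    balanced_names = [f"dnacluster:{cluster_id}@{q}" for q in range(balanced)]
    copy_names = [
        f"dnacluster:{cluster_id}@{q}" for q in range(balanced, balanced + copies)
    ]
    return balanced_names, copy_names


if __name__ == "__main__":
    snort_queues, argus_queues = cluster_queue_names(10, 28, 1)
    print("Snort:", snort_queues[0], "...", snort_queues[-1])
    print("Argus:", argus_queues[0])
```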
>
>If all you are looking to do is combine traffic from two interfaces 
>into one, why not just run argus with -i dna0,dna1?
>
>For testing, I would try the following to see where you might be having
>problems:
>
>	pfcount -i dna0
>	pfcount -i dna1
>	pfcount -i dna0,dna1
>	pfcount -i dnacluster:10
>	pfcount -i dnacluster:10@0
>
>Let me know if that helps...
>
>Craig
>
>
>
>
>-----Original Message-----
>From: Reynolds, Jeffrey [mailto:JReynolds at utdallas.edu]
>Sent: Thursday, March 27, 2014 3:18 PM
>To: Craig Merchant; Carter Bullard
>Cc: Argus
>Subject: Re: [ARGUS] Multi-Instanced Argus
>
>So I understand this is from a while ago, but here is what I have.
>Craig, maybe you can show me how I'm doing it wrong.
>
>I finally got PF_Ring and libzero licensed correctly so that 
>pfdnacluster isn't limited to 5 minutes of capture.  I downloaded the 
>Argus source, installed the dependencies, and compiled after making the 
>change you noted below.  However, I don't seem to be properly attaching 
>argus to my devices to allow it to capture.  I have a feeling it's 
>something to do with my PF_Ring or dna-ixgbe conf files.  We have two 
>interfaces to monitor, which I've previously combined into one by using 
>pfdnacluster_master.  However, it looks like I can't get Argus to hook 
>into that or a single dna interface.  Anyway, after make installing, I 
>run the following command with the following result:
>
>#pfdnacluster_master -i dna0,dna1 -c 10
>#argus -i dnacluster:10 -s 1500 -w /var/data/argus-out
>
>My /var/log/messages says that the specified interface doesn't exist, 
>which I kind of expected.
>So I tried this (without pfdnacluster running):
>
>#argus -i dna0 -s 1500 -w /var/data/argus-out
>
>This time argus appears to have started, but my output file is not 
>growing (it initially starts at 128 bytes and increases by that same 
>amount every 30 seconds or so).
>
>In case this happens to be the parameters I'm loading with my kernel 
>modules, here they are:
>
>pf_ring.ko transparent_mode=2
>(I've also tried 0, with similar results)
>
>ixgbe.ko RSS=1,1,1,1
>(I wasn't seeing all of the traffic from my interfaces with the default 
>config, and the ntop folks recommended this.  I need to dig further into the 
>docs to learn more about these parameters.)
>
>To answer your original question, I'm only monitoring about 2 Gbps, 
>significantly less than you are.  I'm not sure if what I've noticed 
>would be considered "gaps", but we do see exchanges where the server 
>appears to initiate conversations by sending a response to a client, 
>which the client doesn't appear to have requested.  I'm guessing the 
>missing request was most likely a packet that didn't get captured.
>
>Any configuration suggestions would be much appreciated.
>
>Thanks,
>
>Jeff
>
>
>From: Craig Merchant <cmerchant at responsys.com>
>Date: Wednesday, March 12, 2014 at 6:39 PM
>To: Carter Bullard <carter at qosient.com>, Jeff Reynolds <jjr140030 at utdallas.edu>
>Cc: Argus <argus-info at lists.andrew.cmu.edu>
>Subject: RE: [ARGUS] Multi-Instanced Argus
>
>We're running Argus and Snort on top of PF_RING's DNA/Libzero drivers.  We 
>decided to use Libzero because the standard DNA drivers limit the 
>number of memory "queues" containing network traffic to 16.  Each queue 
>can only be accessed by a single process and our sensors have 32 cores, 
>so we wouldn't be able to run the maximum number of Snort instances without it.
>
>We use the pfdnacluster_master app to spread flows across 28 queues for 
>snort and also maintain a copy of all flows in a queue for Argus.
>
>To get it to work, all I had to do was make a slight edit to 
>ArgusSource.c so that Argus would recognize DNA/Libzero queues as a 
>valid interface.
>
>Somewhere around line 4191 (for argus 3.0.7):
>
>
>-   if ((strstr(device->name, "dag")) || (strstr(device->name, "napa"))) {
>
>+ if (strstr(device->name, "dag") || strstr(device->name, "nap") || 
>+ strstr(device->name, "dna") || (strstr(device->name, "eth") && 
>+ strstr(device->name, "@"))) {
>
>Our data centers do around 4-8 Gbps 24/7.  From what I recall, there is 
>(or was) a bug in PF_RING that caused Argus to run at 100% all of the 
>time, but in my experience Argus wasn't having problems keeping up with 
>our volume of data.  We did see an unusually high number of flows that 
>Argus couldn't determine the direction of, but we weren't seeing gaps 
>in the packets or anything else to suggest that Argus couldn't handle 
>the volume.
>
>How much traffic are you sending at Argus?  Have you tried searching 
>your Argus records for flows that have gaps in them?  That would be a 
>pretty good indicator that Argus may have trouble keeping up.  Or that 
>your SPAN port can't handle the load...
>
>Thx.
>
>Craig
>
>From: argus-info-bounces+cmerchant=responsys.com at lists.andrew.cmu.edu
>[mailto:argus-info-bounces+cmerchant=responsys.com at lists.andrew.cmu.edu]
>On Behalf Of Carter Bullard
>Sent: Wednesday, March 12, 2014 1:57 PM
>To: Reynolds, Jeffrey
>Cc: Argus
>Subject: Re: [ARGUS] Multi-Instanced Argus
>
>Hey Jeffrey,
>Good so far.   This seems like the link for accelerating snort with
>PF_RING DNA ??
>http://www.ntop.org/pf_ring/accelerating-snort-with-pf_ring-dna/
>
>I'm interested in the symmetric RSS and if it works properly.
>Are you running the PF_RING DNA DAQ ????
>
>It would seem that we'll have to modify argus to use this facility ???
>
>Carter
>
>On Mar 12, 2014, at 3:26 PM, Reynolds, Jeffrey <JReynolds at utdallas.edu> wrote:
>
>
>First, before we dive into it too deep, how is the performance ??
>
>This actually seems like a great place to start.  Before getting too 
>heavy into PF_RING integration, maybe I should offer a bit of backstory.
>Our main goal is just to archive traffic.  We have a server running 
>CentOS 6 that receives traffic from two SPAN ports.  The only thing we 
>want to accomplish is to maintain a copy of that traffic for some 
>period of time.  Argus was used because it seemed to be the best tool 
>for the price, and it comes with a lot of great features that while we 
>may not use now, we may use later (again, for right now all we want is 
>a copy of the traffic to be able to perform forensics on).
>
>Now, I put up a single instance of Argus and pointed it at the 
>interface that was the master of our two bonded physical NICs (eth0 and 
>eth1 are bonded to bond0).  I let it run for an hour to get some 
>preliminary numbers.  I ran racount against my output file and got 
>the following stats:
>
>racount -t 2014y3m12d05h -r argus-out
>racount  records   total_pkts  src_pkts  dst_pkts  total_bytes   src_bytes     dst_bytes
>    sum  14236180  187526800   98831765  88695035  212079839908  102889789820  109190050088
>
>However, the switch sending that traffic reported that it 
>had sent a total of 421,978,297 packets to both interfaces, and a total 
>of 371,307,051,815 bytes for that time frame.  I could be interpreting 
>something incorrectly, so maybe the best first thing for me to confirm 
>is that we are in fact losing a lot of traffic.  But it seems that a 
>single argus instance can't keep up with the traffic.  I've seen this 
>happen with Snort, and our solution was to plug Snort into PF_RING to 
>allow the traffic to be intelligently forwarded via the Snort Data 
>Acquisition Library (DAQ).  From the perspective of someone who hasn't 
>had a lot of exposure to this level of hardware configuration, it was 
>relatively easy to plug the configuration parameters in at the Snort 
>command line to have them all point at the same traffic source so that 
>each individual process didn't run through the same traffic.  My hope 
>was that there might just be some parameters to set within the 
>argus.conf file which would tell each process to pull from a single 
>PF_RING source.  However, it looks like this might not be as easy as I had once thought.
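
To put a number on the suspected loss, the totals quoted above can be compared directly. This is only a rough check: it assumes the racount totals and the switch counters cover the same hour, and that the switch counters include only the mirrored traffic. The function and constant names are hypothetical.

```python
# Totals quoted in the message above.
ARGUS_PKTS = 187_526_800        # racount total_pkts
ARGUS_BYTES = 212_079_839_908   # racount total_bytes
SWITCH_PKTS = 421_978_297       # packets the switch reported sending
SWITCH_BYTES = 371_307_051_815  # bytes the switch reported sending


def captured_fraction(seen: int, sent: int) -> float:
    """Fraction of what the switch sent that shows up in the Argus records."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return seen / sent


if __name__ == "__main__":
    pkt = captured_fraction(ARGUS_PKTS, SWITCH_PKTS)
    byt = captured_fraction(ARGUS_BYTES, SWITCH_BYTES)
    print(f"packets accounted for: {pkt:.1%} (missing {1 - pkt:.1%})")
    print(f"bytes accounted for:   {byt:.1%} (missing {1 - byt:.1%})")
```

Under those assumptions, Argus accounts for roughly 44% of the packets and 57% of the bytes the switch reported, which is consistent with a large share of the traffic never reaching the records.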
>
>Am I on the right track or does this make even a little sense?
>
>Thanks,
>
>Jeff
>
>
>
>From: Carter Bullard <carter at qosient.com>
>Date: Wednesday, March 12, 2014 at 9:54 AM
>To: "Reynolds, Jeffrey" <JReynolds at utdallas.edu>
>Cc: Argus <argus-info at lists.andrew.cmu.edu>
>Subject: Re: [ARGUS] Multi-Instanced Argus
>
>Hey Jeffrey,
>I am very interested in this approach, but I have no experience with 
>this PF_RING feature, so I'll have to give you the "design response".
>Hopefully, we can get this to where it's doing exactly what anyone would 
>want it to do, and get us a really fast argus, on the cheap.
>
>First, before we dive into it too deep, how is the performance ??  
>Are you getting bi-directional flows out of this scheme ??  Are you 
>seeing all the traffic ???  If so, then congratulations !!!  If the 
>performance is good, you're seeing all the traffic, but you're only 
>getting uni-directional flows, then we may have some work to do, but 
>still congratulations !!!  If you're not getting all the traffic then 
>we have some real work to do, as one of the purposes of argus is to 
>monitor all the traffic.
>
>OK, so my understanding is that the PF_RING can do some packet routing 
>to a non-overlapping set of tap interfaces.  Routing is based on some 
>classification scheme, designed to make this usable. The purpose is to 
>provide coarse grain parallelism for packet processing.  The idea, as 
>much as I can tell, is to prevent multiple readers from having to read 
>from the same queue; eliminating locking issues, which kills 
>performance etc...
>
>So, I'm not sure what you mean by "pulling from the same queue".  If 
>you do have multiple argi reading the same packet, you will end up 
>counting a single packet multiple times.  Not a terrible thing, but not recommended.
>It's not that you're creating multiple observation domains using this 
>PF_RING technique. You're really splitting a single packet observation 
>domain into a multi-sensor facility ... eventually you will want to 
>combine the total argus output into a single output stream, that 
>represents the single packet observation domain.  At least that is my 
>thinking, and I would recommend that you use radium to connect to all 
>of your argus instances, rather than writing the argus output to a set 
>of files.  Radium will generate a single argus data output stream, 
>representing the argus data from the single observation domain.
>
>The design issue of using the PF_RING function is "how is PF_RING 
>classifying packets to do the routing?".
>We would like for it to send packets that belong to the same 
>bi-directional flow to the same virtual interface, so argus can do its 
>bi-directional thing.  PF_RING claims that you can provide your own 
>classifier logic, which we can do to make this happen.  We have a 
>pretty fast bidirectional hashing scheme which we can try out.
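
The requirement that both directions of a flow land on the same virtual interface can be illustrated with a direction-independent hash. The sketch below is not the Argus hashing scheme and does not use any PF_RING API; it is a generic illustration with hypothetical names, which orders the two endpoints before hashing so that A->B and B->A produce the same queue index.

```python
import zlib


def flow_queue(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
               proto: int, num_queues: int) -> int:
    """Pick a queue index that is the same for both directions of a flow."""
    if num_queues <= 0:
        raise ValueError("num_queues must be positive")
    # Order the endpoints so the key does not depend on packet direction.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{proto}|{low[0]}|{low[1]}|{high[0]}|{high[1]}".encode()
    return zlib.crc32(key) % num_queues


if __name__ == "__main__":
    forward = flow_queue("10.0.0.1", 51000, "192.0.2.8", 80, 6, 5)
    reverse = flow_queue("192.0.2.8", 80, "10.0.0.1", 51000, 6, 5)
    print("forward queue:", forward, "reverse queue:", reverse)
```

With a classifier of this shape, each reader sees both halves of every conversation it is assigned, which is what bi-directional flow tracking needs.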
>
>We have a number of people that are using netmap instead of PF_RING.  
>My understanding is that it also has this same type of feature.  If we 
>can get some people talking about that, that would help a bit.
>
>Carter
>
>
>
>On Mar 12, 2014, at 1:03 AM, Reynolds, Jeffrey <JReynolds at utdallas.edu> wrote:
>
>Howdy All,
>
>So after forever and a day, I've finally found time to start working on 
>my multi-instanced argus configuration. Here is my setup:
>
>-CentOS 6.5 x64
>-pfring driver compiled from source
>-pfring capable Intel NICs (currently using the ixgbe driver version 3.15.1-k)
> (these NICs are in a bonded configuration under a device named bond0)
>
>I've configured my startup script to start 5 instances of Argus, each 
>with their own /etc/argusX.conf file (argus1.conf, argus2.conf, etc).
>The start up script correctly assigns the proper pid file to each 
>instance, and everything starts and stops smoothly.  Each instance is 
>writing an output file to /var/argus in the format of argusX.out.  When 
>I first tried running my argus instances, I ran them with a version of 
>PF_RING I had installed from an RPM obtained from the ntop repo.  
>Things didn't seem to work correctly, so I tried again after I had 
>compiled from source.  After compiling from source, I got the following 
>output in /var/log/messages when I started argus:
>
>Mar 11 17:48:16 argus kernel: No module found in object
>Mar 11 17:49:16 argus kernel: [PF_RING] Welcome to PF_RING 5.6.3 ($Revision: 7358$)
>Mar 11 17:49:16 argus kernel: (C) 2004-14 ntop.org
>Mar 11 17:49:16 argus kernel: [PF_RING] registered /proc/net/pf_ring/
>Mar 11 17:49:16 argus kernel: NET: Registered protocol family 27
>Mar 11 17:49:16 argus kernel: [PF_RING] Min # ring slots 4096
>Mar 11 17:49:16 argus kernel: [PF_RING] Slot version     15
>Mar 11 17:49:16 argus kernel: [PF_RING] Capture TX       Yes [RX+TX]
>Mar 11 17:49:16 argus kernel: [PF_RING] Transparent Mode 0
>Mar 11 17:49:16 argus kernel: [PF_RING] IP Defragment    No
>Mar 11 17:49:16 argus kernel: [PF_RING] Initialized correctly
>Mar 11 17:49:35 argus kernel: Bluetooth: Core ver 2.15
>Mar 11 17:49:35 argus kernel: NET: Registered protocol family 31
>Mar 11 17:49:35 argus kernel: Bluetooth: HCI device and connection manager initialized
>Mar 11 17:49:35 argus kernel: Bluetooth: HCI socket layer initialized
>Mar 11 17:49:35 argus kernel: Netfilter messages via NETLINK v0.30.
>Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.643243 started
>Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.693930 started
>Mar 11 17:49:35 argus kernel: device bond0 entered promiscuous mode
>Mar 11 17:49:35 argus kernel: device em1 entered promiscuous mode
>Mar 11 17:49:35 argus kernel: device em2 entered promiscuous mode
>Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.721490 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.349202 started
>Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.364625 started
>Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.383623 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.045224 started
>Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.060689 started
>Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.079706 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.753278 started
>Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.768613 started
>Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.785691 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.449229 started
>Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.466365 started
>Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.485675 ArgusGetInterfaceStatus: interface bond0 is up
>
>Aside from the "No module found in object" error, everything seems like 
>it's working OK.  The only problem is that I don't seem to have my argus 
>instances configured to pull traffic from the same queue.  In other 
>words, I have five output files from five argus instances with like 
>traffic in all of them.  I haven't made any changes to my argus config 
>files, aside from telling them to write to different locations and the 
>name of the interface. I know I'm missing something but I'm not quite 
>sure what it is.  If someone might be able to tell me how to configure 
>these five instances to pull from the same PF_RING queue, I'd be mighty 
>obliged.  Let me know if I need to submit any additional information.
>
>Thanks,
>
>Jeff Reynolds
>
>




