Multi-Instanced Argus

Reynolds, Jeffrey JReynolds at utdallas.edu
Fri Mar 28 15:42:54 EDT 2014


OK, I'm almost sure there are issues with Argus and the code I've
modified.  To rehash, I grabbed argus-3.0.7.5 and changed the following
line in argus/ArgusSource.c:

Line 4331:

- if ((strstr(device->name, "dag")) || (strstr(device->name, "napa"))) {

+ if (strstr(device->name, "dag") || strstr(device->name, "nap") ||
+     strstr(device->name, "dna") || (strstr(device->name, "eth") &&
+     strstr(device->name, "@"))) {

I've also tried:

+ if ((strstr(device->name, "dag")) || (strstr(device->name, "nap")) ||
+     (strstr(device->name, "dna")) || (strstr(device->name, "eth") &&
+     strstr(device->name, "@"))) {


I wasn't sure whether each strstr call had to be enclosed in its own set
of parens.  Anyway, in both instances, I run Argus and wind up with a
128-byte file.  For example:

$ argus -i dna0 -w /var/data/argus-out -s 1500
(wait about 20 seconds)
$ ls -l /var/data
-rw-r--r--. 1 argus argus 128 Mar 28 07:46 argus-out
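
On the parenthesization question: in C a function call binds more tightly
than ||, and && binds more tightly than ||, so the two conditions above
should evaluate identically.  A minimal sketch for checking this outside of
Argus follows.  The function names are made up for illustration, and the
`name` parameter stands in for device->name; none of this is Argus code.

```c
#include <stdbool.h>
#include <string.h>

/* Variant A: no extra parentheses around the individual strstr() calls. */
bool matches_plain(const char *name)
{
    return strstr(name, "dag") || strstr(name, "nap") ||
           strstr(name, "dna") ||
           (strstr(name, "eth") && strstr(name, "@"));
}

/* Variant B: every strstr() call wrapped in its own parentheses. */
bool matches_parenthesized(const char *name)
{
    return (strstr(name, "dag")) || (strstr(name, "nap")) ||
           (strstr(name, "dna")) ||
           (strstr(name, "eth") && strstr(name, "@"));
}
```

Both variants accept names such as dna0 and eth0@1 and reject em1 and a
plain eth0, so the extra parentheses are unlikely to explain the 128-byte
file.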

When I run with the vanilla drivers, and my interface is not "dna0" but
"em1", then I get better results.

# rmmod ixgbe
# modprobe ixgbe #pulling from /lib/modules/`uname -r`

$ rm argus-out
rm: remove regular file `argus-out'? y
$ argus -i em1 -w /var/data/argus-out -s 1500
(wait about 20 seconds)
$ ls -l /var/data
-rw-r--r--. 1 argus argus 2392260 Mar 28 07:46 argus-out


The real kicker seems to be in /var/log/messages.  When running argus on
em1 with the original ixgbe driver, I get the following output in
/var/log/messages:


Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.865660 started
Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.882755 started
Mar 28 05:14:52 argus kernel: device em1 entered promiscuous mode
Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.932220
ArgusGetInterfaceStatus: interface em1 is up
Mar 28 05:15:18 argus argus[23142]: 28 Mar 14 05:15:18.812342 stopped


However, when running with the DNA driver, the output is as follows:

Mar 28 08:33:16 argus argus[23915]: 28 Mar 14 08:33:16.967530 started
Mar 28 08:33:16 argus argus[23915]: 28 Mar 14 08:33:16.985055 started
Mar 28 08:33:50 argus argus[23915]: 28 Mar 14 08:33:50.667199 stopped


The interface is in promiscuous mode now: I can see the received-packet
count rising considerably just by running ifconfig a few times.  I think
that, for whatever reason, the function in Argus that outputs the
"ArgusGetInterfaceStatus" line isn't correctly interpreting dna0 as an
appropriate interface.

Does any of this sound remotely possible?

-Jeff



On 3/27/14, 7:23 PM, "Craig Merchant" <cmerchant at responsys.com> wrote:

>Hey, Jeffrey...
>
>The configuration questions for the pf_ring and ixgbe drivers may be
>better answered on the ntop forums...  But I'll do my best.  Here is how
>I load the drivers:
>
>    insmod /lib/modules/2.6.32-220.el6.x86_64/updates/pf_ring.ko
>    /sbin/modprobe ixgbe MQ=0,0 RSS=1,1 num_rx_slots=32768
>
>    ifconfig dna0 up promisc
>    ethtool -K dna0 tso off
>    ethtool -K dna0 gro off
>    ethtool -K dna0 lro off
>    ethtool -K dna0 gso off
>    ethtool -G dna0 tx 32768
>    ethtool -G dna0 rx 32768
>
>One thing I'm not clear on from your config is why you are using
>pfdnacluster_master at all...  That daemon is designed to split up flows
>and/or make copies of them to distribute to other applications.  I don't
>think it's meant to aggregate two interfaces into one stream.  Normally
>it's run with a -n parameter to tell it how many queues you want traffic
>divided up into.  We use:
>
>pfdnacluster_master -d -c 10 -n 28,1 -m 0 -i dna0
>
>In this case, -n says "divide up one copy of the traffic into 28 queues"
>and "create one copy of all the traffic on the last queue".  The apps
>accessing the first 28 queues (Snort) would connect to dnacluster:10@0
>through dnacluster:10@27.  Argus connects to dnacluster:10@28 and would
>see a copy of all of the traffic.
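
The queue layout described above can be sketched in a few lines of C.  This
assumes, as the description suggests, that queues are numbered consecutively
in the order the groups are listed after -n, and that queue names take the
dnacluster:<cluster>@<queue> form.  The function names are hypothetical and
are not part of PF_RING.

```c
#include <stdio.h>

/* Given the per-group queue counts passed to -n (for example {28, 1}),
 * store the index of the first queue of each group in first[] and
 * return the total number of queues.  Assumes consecutive numbering. */
int assign_queue_ranges(const int *counts, int ngroups, int *first)
{
    int next = 0;
    for (int i = 0; i < ngroups; i++) {
        first[i] = next;
        next += counts[i];
    }
    return next;
}

/* Format an interface name such as "dnacluster:10@28" into buf.
 * Returns what snprintf returns (the length that was needed). */
int cluster_queue_name(char *buf, size_t len, int cluster_id, int queue)
{
    return snprintf(buf, len, "dnacluster:%d@%d", cluster_id, queue);
}
```

For -c 10 -n 28,1 this puts the Snort queues at indexes 0 through 27 and the
full-copy queue at index 28, which is the dnacluster:10@28 name Argus would
be pointed at.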
>
>If all you are looking to do is combine traffic from two interfaces into
>one, why not just run argus with -i dna0,dna1?
>
>For testing, I would try the following to see where you might be having
>problems:
>
>	pfcount -i dna0
>	pfcount -i dna1
>	pfcount -i dna0,dna1
>	pfcount -i dnacluster:10
>	pfcount -i dnacluster:10@0
>
>Let me know if that helps...
>
>Craig
>
>
>
>
>-----Original Message-----
>From: Reynolds, Jeffrey [mailto:JReynolds at utdallas.edu]
>Sent: Thursday, March 27, 2014 3:18 PM
>To: Craig Merchant; Carter Bullard
>Cc: Argus
>Subject: Re: [ARGUS] Multi-Instanced Argus
>
>So I understand this is from a while ago, but here is what I have.
>Craig, maybe you can show me how I'm doing it wrong.
>
>I finally got PF_Ring and libzero licensed correctly so that pfdnacluster
>isn't limited to 5 minutes of capture.  I downloaded the Argus source,
>installed the dependencies, and compiled after making the change you
>noted below.  However, I don't seem to be properly attaching argus to my
>devices to allow it to capture.  I have a feeling it's something to do
>with my PF_Ring or dna-ixgbe conf files.  We have two interfaces to
>monitor, which I've previously combined into one by using
>pfdnacluster_master.  However, it looks like I can't get Argus to hook
>into that or a single dna interface.  Anyway, after running make install,
>I run the following command with the following result:
>
>#pfdnacluster_master -i dna0,dna1 -c 10
>#argus -i dnacluster:10 -s 1500 -w /var/data/argus-out
>
>My /var/log/messages says that the specified interface doesn't exist,
>which I kind of expected.
>So I tried this (without pfdnacluster running):
>
>#argus -i dna0 -s 1500 -w /var/data/argus-out
>
>This time argus appears to have started, but my output file is not
>growing (it initially starts at 128 bytes and increases by that same
>amount every 30 seconds or so).
>
>In case this happens to be the parameters I'm loading with my kernel
>modules, here they are:
>
>pf_ring.ko transparent_mode=2 (I've also tried 0, with similar results)
>ixgbe.ko RSS=1,1,1,1 (I wasn't seeing all of the traffic from my
>interfaces with the default config, and the ntop folks recommended this;
>I need to dig further into the docs to learn more about these
>parameters).
>
>To answer your original question, I'm only monitoring about 2 Gbps,
>significantly less than you are.  I'm not sure if what I've noticed would
>be considered "gaps", but we do see exchanges where the server appears to
>initiate conversations by sending a response to a client, which the
>client doesn't appear to have requested.  I'm guessing the missing request
>was most likely a packet that didn't get captured.
>
>Any configuration suggestions would be much appreciated.
>
>Thanks,
>
>Jeff
>
>
>From: Craig Merchant <cmerchant at responsys.com>
>Date: Wednesday, March 12, 2014 at 6:39 PM
>To: Carter Bullard <carter at qosient.com>, Jeff Reynolds
><jjr140030 at utdallas.edu>
>Cc: Argus <argus-info at lists.andrew.cmu.edu>
>Subject: RE: [ARGUS] Multi-Instanced Argus
>
>We're running Argus and Snort on PF_RING's DNA/Libzero drivers.  We
>decided to use Libzero because the standard DNA drivers limit the number
>of memory "queues" containing network traffic to 16.  Each queue can only
>be accessed by a single process and our sensors have 32 cores, so we
>wouldn't be able to run the maximum number of Snort instances without it.
>
>We use the pfdnacluster_master app to spread flows across 28 queues for
>snort and also maintain a copy of all flows in a queue for Argus.
>
>To get it to work, all I had to do was make a slight edit to
>ArgusSource.c so that Argus would recognize DNA/Libzero queues as a valid
>interface.
>
>Somewhere around line 4191 (for argus 3.0.7):
>
>
>-   if ((strstr(device->name, "dag")) || (strstr(device->name, "napa"))) {
>
>+ if (strstr(device->name, "dag") || strstr(device->name, "nap") ||
>+ strstr(device->name, "dna") || (strstr(device->name, "eth") &&
>+ strstr(device->name, "@"))) {
>
>Our data centers do around 4-8 Gbps 24/7.  From what I recall, there is
>(or was) a bug in PF_RING that caused Argus to run at 100% all of the
>time, but in my experience Argus wasn't having problems keeping up with
>our volume of data.  We did see an unusually high number of flows that
>Argus couldn't determine the direction of, but we weren't seeing gaps in
>the packets or anything else to suggest that Argus couldn't handle the
>volume.
>
>How much traffic are you sending at Argus?  Have you tried searching your
>Argus records for flows that have gaps in them?  That would be a pretty
>good indicator that Argus may have trouble keeping up.  Or that your SPAN
>port can't handle the load...
>
>Thx.
>
>Craig
>
>From: argus-info-bounces+cmerchant=responsys.com at lists.andrew.cmu.edu
>On Behalf Of Carter Bullard
>Sent: Wednesday, March 12, 2014 1:57 PM
>To: Reynolds, Jeffrey
>Cc: Argus
>Subject: Re: [ARGUS] Multi-Instanced Argus
>
>Hey Jeffrey,
>Good so far.  This seems like the link for accelerating snort with
>PF_RING DNA ??
>http://www.ntop.org/pf_ring/accelerating-snort-with-pf_ring-dna/
>
>I'm interested in the symmetric RSS and if it works properly.
>Are you running the PF_RING DNA DAQ ????
>
>It would seem that we'll have to modify argus to use this facility ???
>
>Carter
>
>On Mar 12, 2014, at 3:26 PM, Reynolds, Jeffrey
><JReynolds at utdallas.edu> wrote:
>
>
>First, before we dive into it too deep, how is the performance ??
>
>This actually seems like a great place to start.  Before getting too
>heavy into PF_RING integration, maybe I should offer a bit of backstory.
>Our main goal is just to archive traffic.  We have a server running
>CentOS 6 that receives traffic from two SPAN ports.  The only thing we
>want to accomplish is to maintain a copy of that traffic for some period
>of time.  Argus was used because it seemed to be the best tool for the
>price, and it comes with a lot of great features that while we may not
>use now, we may use later (again, for right now all we want is a copy of
>the traffic to be able to perform forensics on).
>
>Now, I put up a single instance of Argus and pointed it at the interface
>that was the master of our two bonded physical NICs (eth0 and eth1 are
>bonded to bond0).  I let it run for an hour to get some preliminary
>numbers.  I ran racount against my output file and got the following
>stats:
>
>racount -t 2014y3m12d05h -r argus-out
>racount  records   total_pkts  src_pkts  dst_pkts  total_bytes   src_bytes     dst_bytes
>sum      14236180  187526800   98831765  88695035  212079839908  102889789820  109190050088
>
>However, the switch sending that traffic reported that it had
>sent a total of 421,978,297 packets to both interfaces, and a total of
>371,307,051,815 bytes for that time frame.  I could be interpreting
>something incorrectly, so maybe the best first thing for me to confirm is
>that we are in fact losing a lot of traffic.  But it seems that a single
>argus instance can't keep up with the traffic.  I've seen this happen
>with Snort, and our solution was to plug Snort into PF_RING to allow the
>traffic to be intelligently forwarded via the Snort Data Acquisition
>Library (DAQ).  From the perspective of someone who hasn't had a lot of
>exposure to this level of hardware configuration, it was relatively easy
>to plug the configuration parameters in at the Snort command line to have
>them all point at the same traffic source so that each individual process
>didn't run through the same traffic.  My hope was that there might just
>be some parameters to set within the argus.conf file which would tell
>each process to pull from a single PF_RING source.  However, it looks
>like this might not be as easy as I had once thought.
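
As a quick check on the numbers above, the share of the switch's traffic
that shows up in the racount totals can be computed directly.  This is only
a sketch: the function name is hypothetical, and it assumes the switch
counters and the racount totals cover the same time window.

```c
#include <stdint.h>

/* Percentage of `sent` that was actually `captured`.
 * Returns 0.0 when nothing was sent, to avoid dividing by zero. */
double capture_percent(uint64_t captured, uint64_t sent)
{
    if (sent == 0) {
        return 0.0;
    }
    return 100.0 * (double)captured / (double)sent;
}
```

With the figures quoted above, 187,526,800 of 421,978,297 packets is roughly
44%, and 212,079,839,908 of 371,307,051,815 bytes is roughly 57%, so on
these numbers more than half of the packets never reached the Argus records.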
>
>Am I on the right track or does this make even a little sense?
>
>Thanks,
>
>Jeff
>
>
>
>From: Carter Bullard <carter at qosient.com>
>Date: Wednesday, March 12, 2014 at 9:54 AM
>To: "Reynolds, Jeffrey" <JReynolds at utdallas.edu>
>Cc: Argus <argus-info at lists.andrew.cmu.edu>
>Subject: Re: [ARGUS] Multi-Instanced Argus
>
>Hey Jeffrey,
>I am very interested in this approach, but I have no experience with this
>PF_RING feature, so I'll have to give you the "design response".
>Hopefully, we can get this to where it's doing exactly what anyone would
>want it to do, and get us a really fast argus, on the cheap.
>
>First, before we dive into it too deep, how is the performance ??  Are
>you getting bi-directional flows out of this scheme ??  Are you seeing
>all the traffic ???  If so, then congratulations !!!  If the performance
>is good, you're seeing all the traffic, but you're only getting
>uni-directional flows, then we may have some work to do, but still
>congratulations !!!  If you're not getting all the traffic then we have
>some real work to do, as one of the purposes of argus is to monitor all
>the traffic.
>
>OK, so my understanding is that the PF_RING can do some packet routing to
>a non-overlapping set of tap interfaces.  Routing is based on some
>classification scheme, designed to make this usable. The purpose is to
>provide coarse grain parallelism for packet processing.  The idea, as
>much as I can tell, is to prevent multiple readers from having to read
>from the same queue; eliminating locking issues, which kills performance
>etc...
>
>So, I'm not sure what you mean by "pulling from the same queue".  If you
>do have multiple argi reading the same packet, you will end up counting a
>single packet multiple times.  Not a terrible thing, but not recommended.
> It's not that you're creating multiple observation domains using this
>PF_RING technique. You're really splitting a single packet observation
>domain into a multi-sensor facility ... eventually you will want to
>combine the total argus output into a single output stream, that
>represents the single packet observation domain.  At least that is my
>thinking, and I would recommend that you use radium to connect to all of
>your argus instances, rather than writing the argus output to a set of
>files.  Radium will generate a single argus data output stream,
>representing the argus data from the single observation domain.
>
>The design issue of using the PF_RING function is "how is PF_RING
>classifying packets to do the routing?".
>We would like for it to send packets that belong to the same
>bi-directional flow to the same virtual interface, so argus can do its
>bi-directional thing.  PF_RING claims that you can provide your own
>classifier logic, which we can do to make this happen.  We have a pretty
>fast bidirectional hashing scheme which we can try out.
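
To make the requirement concrete, here is one way a direction-independent
flow hash could look.  It is a sketch only: it is not Argus's hashing scheme
and not a PF_RING API, the struct and function names are invented for
illustration, and the mixing step borrows the 32-bit finalizer constants
from MurmurHash3.  The idea is to order the two endpoints before hashing so
that A->B and B->A produce the same value and land on the same queue.

```c
#include <stdint.h>

/* Hypothetical flow key: IPv4 addresses, ports, and IP protocol. */
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* 32-bit integer mixer (MurmurHash3 finalizer constants). */
uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hash that is identical for both directions of the same flow. */
uint32_t symmetric_flow_hash(const struct flow_key *k)
{
    uint32_t lo_ip = k->src_ip, hi_ip = k->dst_ip;
    uint16_t lo_port = k->src_port, hi_port = k->dst_port;

    /* Put the (ip, port) pairs in a canonical order, swapping both
     * fields together so each port stays with its address. */
    if (lo_ip > hi_ip || (lo_ip == hi_ip && lo_port > hi_port)) {
        uint32_t tip = lo_ip;     lo_ip = hi_ip;     hi_ip = tip;
        uint16_t tport = lo_port; lo_port = hi_port; hi_port = tport;
    }

    uint32_t h = mix32(lo_ip);
    h = mix32(h ^ hi_ip);
    h = mix32(h ^ (((uint32_t)lo_port << 16) | hi_port));
    h = mix32(h ^ k->proto);
    return h;
}

/* Map a flow onto one of nqueues queues (returns 0 if nqueues is 0). */
unsigned queue_for_flow(const struct flow_key *k, unsigned nqueues)
{
    return nqueues ? symmetric_flow_hash(k) % nqueues : 0;
}
```

With a classifier of this shape, a request and its response are delivered to
the same reader, which is what each argus instance needs in order to
assemble bi-directional flow records.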
>
>We have a number of people that are using netmap instead of PF_RING.  My
>understanding is that it also has this same type of feature.  If we can
>get some people talking about that, that would help a bit.
>
>Carter
>
>
>
>On Mar 12, 2014, at 1:03 AM, Reynolds, Jeffrey
><JReynolds at utdallas.edu> wrote:
>
>Howdy All,
>
>So after forever and a day, I've finally found time to start working on
>my multi-instanced argus configuration. Here is my setup:
>
>-CentOS 6.5 x64
>-pfring driver compiled from source
>-pfring capable Intel NICs (currently using the ixgbe driver version
>3.15.1-k) (these NICs are in a bonded configuration under a device named
>bond0)
>
>I've configured my startup script to start 5 instances of Argus, each
>with their own /etc/argusX.conf file (argus1.conf, argus2.conf, etc).
>The start up script correctly assigns the proper pid file to each
>instance, and everything starts and stops smoothly.  Each instance is
>writing an output file to /var/argus in the format of argusX.out.  When I
>first tried running my argus instances, I ran them with a version of
>PF_RING I had installed from an RPM obtained from the ntop repo.  Things
>didn't seem to work correctly, so I tried again after I had compiled from
>source.  After compiling from source, I got the following output in
>/var/log/messages when I started argus:
>
>Mar 11 17:48:16 argus kernel: No module found in object
>Mar 11 17:49:16 argus kernel: [PF_RING] Welcome to PF_RING 5.6.3 ($Revision: 7358$)
>Mar 11 17:49:16 argus kernel: (C) 2004-14 ntop.org
>Mar 11 17:49:16 argus kernel: [PF_RING] registered /proc/net/pf_ring/
>Mar 11 17:49:16 argus kernel: NET: Registered protocol family 27
>Mar 11 17:49:16 argus kernel: [PF_RING] Min # ring slots 4096
>Mar 11 17:49:16 argus kernel: [PF_RING] Slot version     15
>Mar 11 17:49:16 argus kernel: [PF_RING] Capture TX       Yes [RX+TX]
>Mar 11 17:49:16 argus kernel: [PF_RING] Transparent Mode 0
>Mar 11 17:49:16 argus kernel: [PF_RING] IP Defragment    No
>Mar 11 17:49:16 argus kernel: [PF_RING] Initialized correctly
>Mar 11 17:49:35 argus kernel: Bluetooth: Core ver 2.15
>Mar 11 17:49:35 argus kernel: NET: Registered protocol family 31
>Mar 11 17:49:35 argus kernel: Bluetooth: HCI device and connection manager initialized
>Mar 11 17:49:35 argus kernel: Bluetooth: HCI socket layer initialized
>Mar 11 17:49:35 argus kernel: Netfilter messages via NETLINK v0.30.
>Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.643243 started
>Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.693930 started
>Mar 11 17:49:35 argus kernel: device bond0 entered promiscuous mode
>Mar 11 17:49:35 argus kernel: device em1 entered promiscuous mode
>Mar 11 17:49:35 argus kernel: device em2 entered promiscuous mode
>Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.721490 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.349202 started
>Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.364625 started
>Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.383623 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.045224 started
>Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.060689 started
>Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.079706 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.753278 started
>Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.768613 started
>Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.785691 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.449229 started
>Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.466365 started
>Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.485675 ArgusGetInterfaceStatus: interface bond0 is up
>
>Aside from the "No module found in object" error, everything seems like
>it's working OK.  The only problem is that I don't seem to have my argus
>instances configured to pull traffic from the same queue.  In other
>words, I have five output files from five argus instances with like
>traffic in all of them.  I haven't made any changes to my argus config
>files, aside from telling them to write to different locations and the
>name of the interface. I know I'm missing something but I'm not quite
>sure what it is.  If someone might be able to tell me how to configure
>these five instances to pull from the same PF_RING queue, I'd be mighty
>obliged.  Let me know if I need to submit any additional information.
>
>Thanks,
>
>Jeff Reynolds
>
>



