Multi-Instanced Argus
Craig Merchant
cmerchant at responsys.com
Fri Mar 28 16:14:54 EDT 2014
That sounds possible... Carter is probably the better person to ask. I haven't had to recompile Argus for almost a year, so all I can say is that it's working fine for me with an older version.
Carter... what would it take to get DNA and libzero interfaces officially supported?
Thx.
Craig
-----Original Message-----
From: Reynolds, Jeffrey [mailto:JReynolds at utdallas.edu]
Sent: Friday, March 28, 2014 12:43 PM
To: Carter Bullard; Craig Merchant
Cc: Argus
Subject: Re: [ARGUS] Multi-Instanced Argus
Ok, I'm almost sure there are issues with Argus and the code I've modified. To rehash, I grabbed argus-3.0.7.5 and changed the following line (4331) in argus/ArgusSource.c:
- if ((strstr(device->name, "dag")) || (strstr(device->name, "napa"))) {
+ if (strstr(device->name, "dag") || strstr(device->name, "nap") ||
+     strstr(device->name, "dna") || (strstr(device->name, "eth") && strstr(device->name, "@"))) {
I've also tried:
+ if ((strstr(device->name, "dag")) || (strstr(device->name, "nap")) ||
+     (strstr(device->name, "dna")) || (strstr(device->name, "eth") && strstr(device->name, "@"))) {
I wasn't sure whether each strstr call had to be enclosed in its own set of parens. Anyway, in both instances, I try to run Argus and wind up with a 128-byte file. For example:
$ argus -i dna0 -w /var/data/argus-out -s 1500
(wait about 20 seconds)
$ ls -l /var/data
-rw-r--r--. 1 argus argus 128 Mar 28 07:46 argus-out
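On the parenthesization question: the two patched forms should be equivalent in C, since wrapping a strstr() call in its own parentheses does not change how || and && combine the results. The sketch below pulls the three name tests out into standalone functions so they can be compared side by side. The function names are hypothetical and are not part of Argus, and what the guarded branch in ArgusSource.c actually does is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: the unmodified argus-3.0.7.5 name test. */
static bool name_matches_original(const char *name) {
    assert(name != NULL);
    return (strstr(name, "dag")) || (strstr(name, "napa"));
}

/* Hypothetical helper: the patched test without the extra parentheses. */
static bool name_matches_patched(const char *name) {
    assert(name != NULL);
    return strstr(name, "dag") || strstr(name, "nap") ||
           strstr(name, "dna") ||
           (strstr(name, "eth") && strstr(name, "@"));
}

/* Hypothetical helper: the patched test with each strstr() call
 * wrapped in its own parentheses. */
static bool name_matches_patched_parens(const char *name) {
    assert(name != NULL);
    return (strstr(name, "dag")) || (strstr(name, "nap")) ||
           (strstr(name, "dna")) ||
           (strstr(name, "eth") && strstr(name, "@"));
}
```

With these helpers, "dna0" and "dnacluster:10@28" match both patched forms but not the original, while "em1" and "bond0" match none of them. The extra parentheses therefore would not account for the 128-byte output.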
When I run with the vanilla drivers, and my interface is not "dna0" but "em1", then I get better results.
# rmmod ixgbe
# modprobe ixgbe #pulling from /lib/modules/`uname -r`
$ rm argus-out
rm: remove regular file `argus-out'? y
$ argus -i em1 -w /var/data/argus-out -s 1500
(wait about 20 seconds)
$ ls -l /var/data
-rw-r--r--. 1 argus argus 2392260 Mar 28 07:46 argus-out
The real kicker seems to be in /var/log/messages. When running argus on
em1 with the original ixgbe driver, I get the following output in
/var/log/messages:
Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.865660 started
Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.882755 started
Mar 28 05:14:52 argus kernel: device em1 entered promiscuous mode
Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.932220 ArgusGetInterfaceStatus: interface em1 is up
Mar 28 05:15:18 argus argus[23142]: 28 Mar 14 05:15:18.812342 stopped
However, when running with the DNA driver, the output is as follows:
Mar 28 08:33:16 argus argus[23915]: 28 Mar 14 08:33:16.967530 started
Mar 28 08:33:16 argus argus[23915]: 28 Mar 14 08:33:16.985055 started
Mar 28 08:33:50 argus argus[23915]: 28 Mar 14 08:33:50.667199 stopped
Now, the interface is in promiscuous mode: I can see the received-packet count rising considerably just by running ifconfig a few times. I think that, for whatever reason, the function in Argus that outputs the "ArgusGetInterfaceStatus" line isn't correctly interpreting dna0 as an appropriate interface.
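One way to test that guess outside Argus is to ask the kernel directly whether it reports dna0 as up. The sketch below is a generic Linux check using the SIOCGIFFLAGS ioctl. It is only a diagnostic under that assumption, not Argus's ArgusGetInterfaceStatus code, and the helper name iface_flags is hypothetical.

```c
#define _DEFAULT_SOURCE /* exposes struct ifreq on glibc in strict modes */
#include <assert.h>
#include <net/if.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fetch the kernel's flags for an interface name.
 * Returns 0 and stores the flags in *flags, or -1 on any error
 * (name too long, no socket available, or unknown interface). */
static int iface_flags(const char *name, short *flags) {
    assert(name != NULL && flags != NULL);

    size_t len = strlen(name);
    if (len >= IFNAMSIZ)
        return -1;                      /* would not fit in ifr_name */

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    memcpy(ifr.ifr_name, name, len + 1); /* copy including the NUL */

    int rc = ioctl(fd, SIOCGIFFLAGS, &ifr);
    close(fd);
    if (rc < 0)
        return -1;

    *flags = ifr.ifr_flags;
    return 0;
}
```

Calling iface_flags("dna0", &flags) and iface_flags("em1", &flags), then testing flags & IFF_UP in each case, would show whether the DNA driver reports the interface differently. A -1 return means the kernel did not recognize the name at all.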
Does any of this sound remotely possible?
-Jeff
On 3/27/14, 7:23 PM, "Craig Merchant" <cmerchant at responsys.com> wrote:
>Hey, Jeffrey...
>
>The configuration questions for the pf_ring and ixgbe drivers may be
>better answered on the ntop forums... But I'll do my best. Here is
>how I load the drivers:
>
> insmod /lib/modules/2.6.32-220.el6.x86_64/updates/pf_ring.ko
> /sbin/modprobe ixgbe MQ=0,0 RSS=1,1 num_rx_slots=32768
>
> ifconfig dna0 up promisc
> ethtool -K dna0 tso off
> ethtool -K dna0 gro off
> ethtool -K dna0 lro off
> ethtool -K dna0 gso off
> ethtool -G dna0 tx 32768
> ethtool -G dna0 rx 32768
>
>One thing I'm not clear on from your config is why you are using
>pfdnacluster_master at all... That daemon is designed to split up
>flows and/or make copies of them to distribute to other applications.
>I don't think it's meant to aggregate two interfaces into one stream.
>Normally it's run with a -n parameter to tell it how many queues you
>want traffic divided up into. We use:
>
>pfdnacluster_master -d -c 10 -n 28,1 -m 0 -i dna0
>
>In this case, -n says "divide up one copy of the traffic into 28 queues"
>and "create one copy of all the traffic on the last queue". The apps
>accessing the first 28 queues (Snort) would connect to dnacluster:10@0
>through dnacluster:10@27. Argus connects to dnacluster:10@28 and would
>see a copy of all of the traffic.
>
>If all you are looking to do is combine traffic from two interfaces
>into one, why not just run argus with -i dna0,dna1?
>
>For testing, I would try the following to see where you might be having
>problems:
>
> pfcount -i dna0
> pfcount -i dna1
> pfcount -i dna0,dna1
> pfcount -i dnacluster:10
> pfcount -i dnacluster:10@0
>
>Let me know if that helps...
>
>Craig
>
>
>
>
>-----Original Message-----
>From: Reynolds, Jeffrey [mailto:JReynolds at utdallas.edu]
>Sent: Thursday, March 27, 2014 3:18 PM
>To: Craig Merchant; Carter Bullard
>Cc: Argus
>Subject: Re: [ARGUS] Multi-Instanced Argus
>
>So I understand this is from a while ago, but here is what I have.
>Craig, maybe you can show me how I'm doing it wrong.
>
>I finally got PF_Ring and libzero licensed correctly so that
>pfdnacluster isn't limited to 5 minutes of capture. I downloaded the
>Argus source, installed the dependencies, and compiled after making the
>change you noted below. However, I don't seem to be properly attaching
>argus to my devices to allow it to capture. I have a feeling it's
>something to do with my PF_Ring or dna-ixgbe conf files. We have two
>interfaces to monitor, which I've previously combined into one by using
>pfdnacluster_master. However, it looks like I can't get Argus to hook
>into that or a single dna interface. Anyway, after make installing, I
>run the following command with the following result:
>
>#pfdnacluster_master -i dna0,dna1 -c 10
>#argus -i dnacluster:10 -s 1500 -w /var/data/argus-out
>
>My /var/log/messages says that the specified interface doesn't exist,
>which I kind of expected.
>So I tried this (without pfdnacluster running):
>
>#argus -i dna0 -s 1500 -w /var/data/argus-out
>
>This time argus appears to have started, but my output file is not
>growing (it initially starts at 128 bytes and increases by that same
>amount every 30 seconds or so).
>
>In case this happens to be the parameters I'm loading with my kernel
>modules, here they are:
>
>pf_ring.ko transparent_mode=2
>(I've also tried 0, with similar results)
>ixgbe.ko RSS=1,1,1,1
>(I wasn't seeing all of the traffic from my interfaces with the default
>config; the ntop folks recommended this. I need to dig further into the
>docs to learn more about these parameters.)
>
>To answer your original question, I'm only monitoring about 2 Gbps,
>significantly less than you are. I'm not sure if what I've noticed
>would be considered "gaps", but we do see exchanges where the server
>appears to initiate conversations by sending a response to a client,
>which the client doesn't appear to have requested. I'm guessing the
>missing request was most likely a packet that didn't get captured.
>
>Any configuration suggestions would be much appreciated.
>
>Thanks,
>
>Jeff
>
>
>From: Craig Merchant <cmerchant at responsys.com>
>Date: Wednesday, March 12, 2014 at 6:39 PM
>To: Carter Bullard <carter at qosient.com>, Jeff Reynolds <jjr140030 at utdallas.edu>
>Cc: Argus <argus-info at lists.andrew.cmu.edu>
>Subject: RE: [ARGUS] Multi-Instanced Argus
>
>We're running Argus and Snort on PF_RING's DNA/Libzero drivers. We
>decided to use Libzero because the standard DNA drivers limit the
>number of memory "queues" containing network traffic to 16. Each queue
>can only be accessed by a single process and our sensors have 32 cores,
>so we wouldn't be able to run the maximum number of Snort instances without it.
>
>We use the pfdnacluster_master app to spread flows across 28 queues for
>snort and also maintain a copy of all flows in a queue for Argus.
>
>To get it to work, all I had to do was make a slight edit to
>ArgusSource.c so that Argus would recognize DNA/Libzero queues as a
>valid interface.
>
>Somewhere around line 4191 (for argus 3.0.7):
>
>
>- if ((strstr(device->name, "dag")) || (strstr(device->name, "napa"))) {
>
>+ if (strstr(device->name, "dag") || strstr(device->name, "nap") ||
>+ strstr(device->name, "dna") || (strstr(device->name, "eth") &&
>+ strstr(device->name, "@"))) {
>
>Our data centers do around 4-8 Gbps 24/7. From what I recall, there is
>(or was) a bug in PF_RING that caused Argus to run at 100% all of the
>time, but in my experience Argus wasn't having problems keeping up with
>our volume of data. We did see an unusually high number of flows that
>Argus couldn't determine the direction of, but we weren't seeing gaps
>in the packets or anything else to suggest that Argus couldn't handle
>the volume.
>
>How much traffic are you sending at Argus? Have you tried searching
>your Argus records for flows that have gaps in them? That would be a
>pretty good indicator that Argus may have trouble keeping up. Or that
>your SPAN port can't handle the load...
>
>Thx.
>
>Craig
>
>From: argus-info-bounces+cmerchant=responsys.com at lists.andrew.cmu.edu
>[mailto:argus-info-bounces+cmerchant=responsys.com at lists.andrew.cmu.edu]
>On Behalf Of Carter Bullard
>Sent: Wednesday, March 12, 2014 1:57 PM
>To: Reynolds, Jeffrey
>Cc: Argus
>Subject: Re: [ARGUS] Multi-Instanced Argus
>
>Hey Jeffrey,
>Good so far. This seems like the link for accelerating snort with
>PF_RING DNA ??
>http://www.ntop.org/pf_ring/accelerating-snort-with-pf_ring-dna/
>
>I'm interested in the symmetric RSS and if it works properly.
>Are you running the PF_RING DNA DAQ ????
>
>It would seem that we'll have to modify argus to use this facility ???
>
>Carter
>
>On Mar 12, 2014, at 3:26 PM, Reynolds, Jeffrey <JReynolds at utdallas.edu> wrote:
>
>
>First, before we dive into it too deep, how is the performance ??
>
>This actually seems like a great place to start. Before getting too
>heavy into PF_RING integration, maybe I should offer a bit of backstory.
>Our main goal is just to archive traffic. We have a server running
>CentOS 6 that receives traffic from two SPAN ports. The only thing we
>want to accomplish is to maintain a copy of that traffic for some
>period of time. Argus was used because it seemed to be the best tool
>for the price, and it comes with a lot of great features that while we
>may not use now, we may use later (again, for right now all we want is
>a copy of the traffic to be able to perform forensics on).
>
>Now, I put up a single instance of Argus and pointed it at the
>interface that was the master of our two bonded physical NICs (eth0 and
>eth1 are bonded to bond0). I let it run for an hour to get some
>preliminary numbers. I ran racount against my output file and got the
>following stats:
>
>racount -t 2014y3m12d05h -r argus-out
>racount   records   total_pkts  src_pkts  dst_pkts  total_bytes   src_bytes     dst_bytes
>    sum   14236180  187526800   98831765  88695035  212079839908  102889789820  109190050088
>
>However, the switch sending that traffic reported that it had sent a
>total of 421,978,297 packets to both interfaces, and a total of
>371,307,051,815 bytes for that time frame. I could be interpreting
>something incorrectly, so maybe the best first thing for me to confirm
>is that we are in fact losing a lot of traffic. But it seems that a
>single argus instance can't keep up with the traffic. I've seen this
>happen with Snort, and our solution was to plug Snort into PF_RING to
>allow the traffic to be intelligently forwarded via the Snort Data
>Acquisition Library (DAQ). From the perspective of someone who hasn't
>had a lot of exposure to this level of hardware configuration, it was
>relatively easy to plug the configuration parameters in at the Snort
>command line to have them all point at the same traffic source so that
>each individual process didn't run through the same traffic. My hope
>was that there might just be some parameters to set within the
>argus.conf file which would tell each process to pull from a single
>PF_RING source. However, it looks like this might not be as easy as I had once thought.
>
>Am I on the right track or does this make even a little sense?
>
>Thanks,
>
>Jeff
>
>
>
>From: Carter Bullard <carter at qosient.com>
>Date: Wednesday, March 12, 2014 at 9:54 AM
>To: "Reynolds, Jeffrey" <JReynolds at utdallas.edu>
>Cc: Argus <argus-info at lists.andrew.cmu.edu>
>Subject: Re: [ARGUS] Multi-Instanced Argus
>
>Hey Jeffrey,
>I am very interested in this approach, but I have no experience with
>this PF_RING feature, so I'll have to give you the "design response".
>Hopefully, we can get this to where it's doing exactly what anyone would
>want it to do, and get us a really fast argus, on the cheap.
>
>First, before we dive into it too deep, how is the performance ??
>Are you getting bi-directional flows out of this scheme ?? Are you
>seeing all the traffic ??? If so, then congratulations !!! If the
>performance is good, you're seeing all the traffic, but you're only
>getting uni-directional flows, then we may have some work to do, but
>still congratulations !!! If you're not getting all the traffic then
>we have some real work to do, as one of the purposes of argus is to
>monitor all the traffic.
>
>OK, so my understanding is that the PF_RING can do some packet routing
>to a non-overlapping set of tap interfaces. Routing is based on some
>classification scheme, designed to make this usable. The purpose is to
>provide coarse grain parallelism for packet processing. The idea, as
>much as I can tell, is to prevent multiple readers from having to read
>from the same queue, eliminating locking issues, which kill
>performance, etc.
>
>So, I'm not sure what you mean by "pulling from the same queue". If
>you do have multiple argi reading the same packet, you will end up
>counting a single packet multiple times. Not a terrible thing, but not
>recommended. It's not that you're creating multiple observation
>domains using this
>PF_RING technique. You're really splitting a single packet observation
>domain into a multi-sensor facility ... eventually you will want to
>combine the total argus output into a single output stream, that
>represents the single packet observation domain. At least that is my
>thinking, and I would recommend that you use radium to connect to all
>of your argus instances, rather than writing the argus output to a set
>of files. Radium will generate a single argus data output stream,
>representing the argus data from the single observation domain.
>
>The design issue of using the PF_RING function is "how is PF_RING
>classifying packets to do the routing?".
>We would like for it to send packets that belong to the same
>bi-directional flow to the same virtual interface, so argus can do its
>bi-directional thing. PF_RING claims that you can provide your own
>classifier logic, which we can do to make this happen. We have a
>pretty fast bidirectional hashing scheme which we can try out.
>
>We have a number of people that are using netmap instead of PF_RING.
>My understanding is that it also has this same type of feature. If we
>can get some people talking about that, that would help a bit.
>
>Carter
>
>
>
>On Mar 12, 2014, at 1:03 AM, Reynolds, Jeffrey <JReynolds at utdallas.edu> wrote:
>
>Howdy All,
>
>So after forever and a day, I've finally found time to start working on
>my multi-instanced argus configuration. Here is my setup:
>
>-CentOS 6.5 x64
>-pfring driver compiled from source
>-pfring capable Intel NICs (currently using the ixgbe driver version
>3.15.1-k) (these NICs are in a bonded configuration under a device
>named bond0)
>
>I've configured my startup script to start 5 instances of Argus, each
>with their own /etc/argusX.conf file (argus1.conf, argus2.conf, etc).
>The start up script correctly assigns the proper pid file to each
>instance, and everything starts and stops smoothly. Each instance is
>writing an output file to /var/argus in the format of argusX.out. When
>I first tried running my argus instances, I ran them with a version of
>PF_RING I had installed from an RPM obtained from the ntop repo.
>Things didn't seem to work correctly, so I tried again after I had
>compiled from source. After compiling from source, I got the following
>output in /var/log/messages when I started argus:
>
>Mar 11 17:48:16 argus kernel: No module found in object
>Mar 11 17:49:16 argus kernel: [PF_RING] Welcome to PF_RING 5.6.3 ($Revision: 7358$)
>Mar 11 17:49:16 argus kernel: (C) 2004-14 ntop.org
>Mar 11 17:49:16 argus kernel: [PF_RING] registered /proc/net/pf_ring/
>Mar 11 17:49:16 argus kernel: NET: Registered protocol family 27
>Mar 11 17:49:16 argus kernel: [PF_RING] Min # ring slots 4096
>Mar 11 17:49:16 argus kernel: [PF_RING] Slot version 15
>Mar 11 17:49:16 argus kernel: [PF_RING] Capture TX Yes [RX+TX]
>Mar 11 17:49:16 argus kernel: [PF_RING] Transparent Mode 0
>Mar 11 17:49:16 argus kernel: [PF_RING] IP Defragment No
>Mar 11 17:49:16 argus kernel: [PF_RING] Initialized correctly
>Mar 11 17:49:35 argus kernel: Bluetooth: Core ver 2.15
>Mar 11 17:49:35 argus kernel: NET: Registered protocol family 31
>Mar 11 17:49:35 argus kernel: Bluetooth: HCI device and connection manager initialized
>Mar 11 17:49:35 argus kernel: Bluetooth: HCI socket layer initialized
>Mar 11 17:49:35 argus kernel: Netfilter messages via NETLINK v0.30.
>Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.643243 started
>Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.693930 started
>Mar 11 17:49:35 argus kernel: device bond0 entered promiscuous mode
>Mar 11 17:49:35 argus kernel: device em1 entered promiscuous mode
>Mar 11 17:49:35 argus kernel: device em2 entered promiscuous mode
>Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.721490 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.349202 started
>Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.364625 started
>Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.383623 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.045224 started
>Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.060689 started
>Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.079706 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.753278 started
>Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.768613 started
>Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.785691 ArgusGetInterfaceStatus: interface bond0 is up
>Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.449229 started
>Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.466365 started
>Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.485675 ArgusGetInterfaceStatus: interface bond0 is up
>
>Aside from the "No module found in object" error, everything seems like
>it's working OK. The only problem is that I don't seem to have my argus
>instances configured to pull traffic from the same queue. In other
>words, I have five output files from five argus instances with like
>traffic in all of them. I haven't made any changes to my argus config
>files, aside from telling them to write to different locations and the
>name of the interface. I know I'm missing something but I'm not quite
>sure what it is. If someone might be able to tell me how to configure
>these five instances to pull from the same PF_RING queue, I'd be mighty
>obliged. Let me know if I need to submit any additional information.
>
>Thanks,
>
>Jeff Reynolds
>
>