Multi-Instanced Argus

Carter Bullard carter at qosient.com
Sat Mar 29 11:00:06 EDT 2014


Hey Jeffrey,
Sorry for the delayed response, and thanks, Craig, for taking the thread!  The 128-byte records are management records, which are basically keep-alive style status messages for downstream readers of the data.  They indicate that the sensor is alive.
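
A rough way to tell whether an output file holds anything beyond those management records is to watch its size: a file that only ever grows in whole 128-byte steps is consistent with keep-alives and nothing else. The sketch below is a heuristic under that assumption, not an Argus tool; the function name and the idea of sampling sizes are made up for illustration, and a file with real flow records could pass by coincidence.

```python
MGMT_RECORD_BYTES = 128  # management record size mentioned above

def looks_like_management_only(size_samples):
    """Heuristic: True if every sampled file size (bytes, oldest first)
    is a whole number of 128-byte records and the file never shrinks."""
    if not size_samples:
        return False
    # Any size that is not a multiple of 128 means other record types exist.
    if any(size % MGMT_RECORD_BYTES for size in size_samples):
        return False
    return all(later >= earlier
               for earlier, later in zip(size_samples, size_samples[1:]))
```

Sizes collected with something like os.path.getsize every 30 seconds would flag the dna0 case (128, 256, 384, ...) but not the 2392260-byte em1 file.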

But you definitely aren't getting any packets from the interfaces.  You shouldn't need to modify the source for this to work.  I'm pretty sure Craig doesn't modify his.  So with a standard release, run argus the way you think you should, adding the -D8 option, for 5-10 seconds or so, so we can see what is up, and send the output to the list.

We should see a statement that the interface is up.  We need to get that far before we'll try to read packets.
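
For the syslog side of this, a small sketch along these lines could pull out which argus processes logged "started" and whether each one also logged the ArgusGetInterfaceStatus line. The regular expressions are assumptions based only on the /var/log/messages excerpts quoted below, and the function name is hypothetical.

```python
import re

# Patterns modelled on the syslog excerpts in this thread (an assumption,
# not a documented Argus log format).
STARTED = re.compile(r"argus\[(\d+)\]:\s+\d+ \w+ \d+ [\d:.]+\s+started")
IFACE_UP = re.compile(
    r"argus\[(\d+)\]:\s+\d+ \w+ \d+ [\d:.]+\s+"
    r"ArgusGetInterfaceStatus: interface (\S+) is up"
)

def interfaces_reported_up(log_text):
    """Return {pid: [interfaces reported up]} for every argus pid that
    logged 'started'. An empty list means no 'is up' line was seen."""
    started_pids = set(STARTED.findall(log_text))
    up = {}
    for pid, iface in IFACE_UP.findall(log_text):
        up.setdefault(pid, []).append(iface)
    return {pid: up.get(pid, []) for pid in started_pids}
```

Run over the two excerpts below, the em1 run would map its pid to ["em1"], while the dna0 run would map its pid to an empty list.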

Carter


> On Mar 28, 2014, at 3:42 PM, "Reynolds, Jeffrey" <JReynolds at utdallas.edu> wrote:
> 
> Ok, I'm almost sure there are issues with Argus and the code I've
> modified.  To rehash, I grabbed argus-3.0.7.5 and changed the
> following line (4331) in argus/ArgusSource.c:
> 
> - if ((strstr(device->name, "dag")) || (strstr(device->name, "napa"))) {
> 
> + if (strstr(device->name, "dag") || strstr(device->name, "nap") ||
> strstr(device->name, "dna") || (strstr(device->name, "eth") &&
> strstr(device->name, "@"))) {
> 
> I've also tried:
> 
> + if ((strstr(device->name, "dag")) || (strstr(device->name, "nap")) ||
> (strstr(device->name, "dna")) || (strstr(device->name, "eth") &&
> strstr(device->name, "@"))) {
> 
> 
> as I wasn't sure whether the strstr calls had to be enclosed in their own
> set of parens.  Anyway, in both instances, I try to run Argus and wind up
> with a 128-byte file.  For example:
> 
> $ argus -i dna0 -w /var/data/argus-out -s 1500
> (wait about 20 seconds)
> $ ls -l /var/data
> -rw-r--r--. 1 argus argus 128 Mar 28 07:46 argus-out
> 
> When I run with the vanilla drivers, and my interface is not "dna0" but
> "em1", then I get better results.
> 
> # rmmod ixgbe
> # modprobe ixgbe #pulling from /lib/modules/`uname -r`
> 
> $ rm argus-out
> rm: remove regular file `argus-out'? y
> $ argus -i em1 -w /var/data/argus-out -s 1500
> (wait about 20 seconds)
> $ ls -l /var/data
> -rw-r--r--. 1 argus argus 2392260 Mar 28 07:46 argus-out
> 
> 
> The real kicker seems to be in /var/log/messages.  When running argus on
> em1 with the original ixgbe driver, I get the following output in
> /var/log/messages:
> 
> 
> Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.865660 started
> Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.882755 started
> Mar 28 05:14:52 argus kernel: device em1 entered promiscuous mode
> Mar 28 05:14:52 argus argus[23142]: 28 Mar 14 05:14:52.932220 ArgusGetInterfaceStatus: interface em1 is up
> Mar 28 05:15:18 argus argus[23142]: 28 Mar 14 05:15:18.812342 stopped
> 
> 
> However, when running with the DNA driver, the output is as follows:
> 
> Mar 28 08:33:16 argus argus[23915]: 28 Mar 14 08:33:16.967530 started
> Mar 28 08:33:16 argus argus[23915]: 28 Mar 14 08:33:16.985055 started
> Mar 28 08:33:50 argus argus[23915]: 28 Mar 14 08:33:50.667199 stopped
> 
> 
> Now, the interface is in promiscuous mode: I can see the received-packet
> count rising considerably just by running ifconfig a few times.  I think
> that, for whatever reason, the function in Argus that outputs the
> "ArgusGetInterfaceStatus" line isn't correctly interpreting dna0 as an
> appropriate interface.
> 
> Does any of this sound remotely possible?
> 
> -Jeff
> 
> 
> 
>> On 3/27/14, 7:23 PM, "Craig Merchant" <cmerchant at responsys.com> wrote:
>> 
>> Hey, Jeffrey...
>> 
>> The configuration questions for the pf_ring and ixgbe drivers may be
>> better answered on the ntop forums...  But I'll do my best.  Here is how
>> I load the drivers:
>> 
>>   insmod /lib/modules/2.6.32-220.el6.x86_64/updates/pf_ring.ko
>>   /sbin/modprobe ixgbe MQ=0,0 RSS=1,1 num_rx_slots=32768
>> 
>>   ifconfig dna0 up promisc
>>   ethtool -K dna0 tso off
>>   ethtool -K dna0 gro off
>>   ethtool -K dna0 lro off
>>   ethtool -K dna0 gso off
>>   ethtool -G dna0 tx 32768
>>   ethtool -G dna0 rx 32768
>> 
>> One thing I'm not clear on from your config is why you are using
>> pfdnacluster_master at all...  That daemon is designed to split up flows
>> and/or make copies of them to distribute to other applications.  I don't
>> think it's meant to aggregate two interfaces into one stream.  Normally
>> it's run with a -n parameter to tell it how many queues you want traffic
>> divided up into.  We use:
>> 
>> pfdnacluster_master -d -c 10 -n 28,1 -m 0 -i dna0
>> 
>> In this case, -n says "divide up one copy of the traffic into 28 queues"
>> and "create one copy of all the traffic on the last queue".  The apps
>> accessing the first 28 queues (Snort) would connect to dnacluster:10@0 -
>> dnacluster:10@27.   Argus connects to dnacluster:10@28 and would see a
>> copy of all of the traffic.
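
The queue numbering described here can be sketched as follows. It only reproduces the naming in this message (written with "@", which this archive may render as " at "); the helper name and its parameters are hypothetical and it does not talk to PF_RING at all.

```python
def cluster_queues(cluster_id, split_queues, copy_queues=1):
    """Return (split, copies): device names for the load-balanced queues
    and for the trailing queue(s) that each receive a full copy."""
    names = [f"dnacluster:{cluster_id}@{i}"
             for i in range(split_queues + copy_queues)]
    return names[:split_queues], names[split_queues:]

# Mirrors "-c 10 -n 28,1": 28 queues for Snort, one full copy for Argus.
snort_queues, argus_queues = cluster_queues(10, 28)
```

With those arguments, snort_queues runs from dnacluster:10@0 through dnacluster:10@27 and argus_queues holds the single name dnacluster:10@28.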
>> 
>> If all you are looking to do is combine traffic from two interfaces into
>> one, why not just run argus with -i dna0,dna1?
>> 
>> For testing, I would try the following to see where you might be having
>> problems:
>> 
>>    pfcount -i dna0
>>    pfcount -i dna1
>>    pfcount -i dna0,dna1
>>    pfcount -i dnacluster:10
>>    pfcount -i dnacluster:10@0
>> 
>> Let me know if that helps...
>> 
>> Craig
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> From: Reynolds, Jeffrey [mailto:JReynolds at utdallas.edu]
>> Sent: Thursday, March 27, 2014 3:18 PM
>> To: Craig Merchant; Carter Bullard
>> Cc: Argus
>> Subject: Re: [ARGUS] Multi-Instanced Argus
>> 
>> So I understand this is from a while ago, but here is what I have.
>> Craig, maybe you can show me how I'm doing it wrong.
>> 
>> I finally got PF_Ring and libzero licensed correctly so that pfdnacluster
>> isn't limited to 5 minutes of capture.  I downloaded the Argus source,
>> installed the dependencies, and compiled after making the change you
>> noted below.  However, I don't seem to be properly attaching argus to my
>> devices to allow it to capture.  I have a feeling it's something to do
>> with my PF_Ring or dna-ixgbe conf files.  We have two interfaces to
>> monitor, which I've previously combined into one by using
>> pfdnacluster_master.  However, it looks like I can't get Argus to hook
>> into that or a single dna interface.  Anyway, after running make install,
>> I ran the following commands, with the following result:
>> 
>> #pfdnacluster_master -i dna0,dna1 -c 10
>> #argus -i dnacluster:10 -s 1500 -w /var/data/argus-out
>> 
>> My /var/log/messages says that the specified interface doesn't exist,
>> which I kind of expected.
>> So I tried this (without pfdnacluster running):
>> 
>> #argus -i dna0 -s 1500 -w /var/data/argus-out
>> 
>> This time argus appears to have started, but my output file is not
>> growing (it initially starts at 128 bytes and increases by that same amount
>> every 30 seconds or so).
>> 
>> In case this happens to be the parameters I'm loading with my kernel
>> modules, here they are:
>> 
>> pf_ring.ko transparent_mode=2 (I've also tried 0, with similar results)
>> ixgbe.ko RSS=1,1,1,1 (I wasn't seeing all of the traffic from my
>> interfaces with the default config, and the ntop folks recommended this;
>> I need to dig further into the docs to learn more about these
>> parameters).
>> 
>> To answer your original question, I'm only monitoring about 2 Gbps,
>> significantly less than you are.  I'm not sure if what I've noticed would
>> be considered "gaps", but we do see exchanges where the server appears to
>> initiate conversations by sending a response to a client, which the
>> client doesn't appear to have requested.  I'm guessing the missing request
>> was most likely a packet that didn't get captured.
>> 
>> Any configuration suggestions would be much appreciated.
>> 
>> Thanks,
>> 
>> Jeff
>> 
>> 
>> From: Craig Merchant <cmerchant at responsys.com>
>> Date: Wednesday, March 12, 2014 at 6:39 PM
>> To: Carter Bullard <carter at qosient.com>, Jeff Reynolds
>> <jjr140030 at utdallas.edu>
>> Cc: Argus <argus-info at lists.andrew.cmu.edu>
>> Subject: RE: [ARGUS] Multi-Instanced Argus
>> 
>> We're running Argus and Snort on PF_RING's DNA/Libzero drivers.  We
>> decided to use Libzero because the standard DNA drivers limit the number
>> of memory "queues" containing network traffic to 16.  Each queue can only
>> be accessed by a single process and our sensors have 32 cores, so we
>> wouldn't be able to run the maximum number of Snort instances without it.
>> 
>> We use the pfdnacluster_master app to spread flows across 28 queues for
>> snort and also maintain a copy of all flows in a queue for Argus.
>> 
>> To get it to work, all I had to do was make a slight edit to
>> ArgusSource.c so that Argus would recognize DNA/Libzero queues as a valid
>> interface.
>> 
>> Somewhere around line 4191 (for argus 3.0.7):
>> 
>> 
>> -   if ((strstr(device->name, "dag")) || (strstr(device->name, "napa"))) {
>> 
>> + if (strstr(device->name, "dag") || strstr(device->name, "nap") ||
>> + strstr(device->name, "dna") || (strstr(device->name, "eth") &&
>> + strstr(device->name, "@"))) {
>> 
>> Our data centers do around 4-8 Gbps 24/7.  From what I recall, there is
>> (or was) a bug in PF_RING that caused Argus to run at 100% all of the
>> time, but in my experience Argus wasn't having problems keeping up with
>> our volume of data.  We did see an unusually high number of flows that
>> Argus couldn't determine the direction of, but we weren't seeing gaps in
>> the packets or anything else to suggest that Argus couldn't handle the
>> volume.
>> 
>> How much traffic are you sending at Argus?  Have you tried searching your
>> Argus records for flows that have gaps in them?  That would be a pretty
>> good indicator that Argus may have trouble keeping up.  Or that your SPAN
>> port can't handle the load...
>> 
>> Thx.
>> 
>> Craig
>> 
>> From: argus-info-bounces+cmerchant=responsys.com at lists.andrew.cmu.edu
>> On Behalf Of Carter Bullard
>> Sent: Wednesday, March 12, 2014 1:57 PM
>> To: Reynolds, Jeffrey
>> Cc: Argus
>> Subject: Re: [ARGUS] Multi-Instanced Argus
>> 
>> Hey Jeffrey,
>> Good so far.  Is this the link for accelerating Snort with
>> PF_RING DNA?
>> http://www.ntop.org/pf_ring/accelerating-snort-with-pf_ring-dna/
>> 
>> I'm interested in the symmetric RSS and whether it works properly.
>> Are you running the PF_RING DNA DAQ?
>> 
>> It would seem that we'll have to modify argus to use this facility?
>> 
>> Carter
>> 
>> On Mar 12, 2014, at 3:26 PM, Reynolds, Jeffrey
>> <JReynolds at utdallas.edu> wrote:
>> 
>> 
>> First, before we dive into it too deep, how is the performance?
>> 
>> This actually seems like a great place to start.  Before getting too
>> heavy into PF_RING integration, maybe I should offer a bit of backstory.
>> Our main goal is just to archive traffic.  We have a server running
>> CentOS 6 that receives traffic from two SPAN ports.  The only thing we
>> want to accomplish is to maintain a copy of that traffic for some period
>> of time.  Argus was used because it seemed to be the best tool for the
>> price, and it comes with a lot of great features that while we may not
>> use now, we may use later (again, for right now all we want is a copy of
>> the traffic to be able to perform forensics on).
>> 
>> Now, I put up a single instance of Argus and pointed it at the interface
>> that was the master of our two bonded physical NICs (eth0 and eth1 are
>> bonded to bond0).  I let it run for an hour to get some preliminary
>> numbers.  I ran racount against my output file and got the following
>> stats:
>> 
>> racount -t 2014y3m12d05h -r argus-out
>> racount   records    total_pkts   src_pkts   dst_pkts   total_bytes    src_bytes      dst_bytes
>>     sum   14236180   187526800    98831765   88695035   212079839908   102889789820   109190050088
>> 
>> However, the switch sending that traffic reported that it had
>> sent a total of 421,978,297 packets to both interfaces, and a total of
>> 371,307,051,815 bytes for that time frame.  I could be interpreting
>> something incorrectly, so maybe the best first thing for me to confirm is
>> that we are in fact losing a lot of traffic.  But it seems that a single
>> argus instance can't keep up with the traffic.  I've seen this happen
>> with Snort, and our solution was to plug Snort into PF_RING to allow the
>> traffic to be intelligently forwarded via the Snort Data Acquisition
>> Library (DAQ).  From the perspective of someone who hasn't had a lot of
>> exposure to this level of hardware configuration, it was relatively easy
>> to plug the configuration parameters in at the Snort command line to have
>> them all point at the same traffic source so that each individual process
>> didn't run through the same traffic.  My hope was that there might just
>> be some parameters to set within the argus.conf file which would tell
>> each process to pull from a single PF_RING source.  However, it looks
>> like this might not be as easy as I had once thought.
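
To put numbers on the suspected loss, the racount totals can be set against the switch counters quoted in this message. This is plain arithmetic on those figures; it assumes the switch and racount are counting the same traffic over the same hour, which has not been confirmed here.

```python
# Totals reported by racount for the hour (from the output above).
argus_pkts = 187_526_800
argus_bytes = 212_079_839_908

# Totals the switch reports having sent to both interfaces.
switch_pkts = 421_978_297
switch_bytes = 371_307_051_815

def captured_fraction(seen, sent):
    """Fraction of the switch-reported total that shows up in the Argus data."""
    return seen / sent

pkt_fraction = captured_fraction(argus_pkts, switch_pkts)
byte_fraction = captured_fraction(argus_bytes, switch_bytes)
print(f"packets: {pkt_fraction:.1%}  bytes: {byte_fraction:.1%}")
```

On these figures Argus accounts for a bit under half of the packets and a bit over half of the bytes, which fits the impression that a lot of traffic is being lost.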
>> 
>> Am I on the right track or does this make even a little sense?
>> 
>> Thanks,
>> 
>> Jeff
>> 
>> 
>> 
>> From: Carter Bullard <carter at qosient.com>
>> Date: Wednesday, March 12, 2014 at 9:54 AM
>> To: "Reynolds, Jeffrey" <JReynolds at utdallas.edu>
>> Cc: Argus <argus-info at lists.andrew.cmu.edu>
>> Subject: Re: [ARGUS] Multi-Instanced Argus
>> 
>> Hey Jeffrey,
>> I am very interested in this approach, but I have no experience with this
>> PF_RING feature, so I'll have to give you the "design response".
>> Hopefully, we can get this to where it's doing exactly what anyone would
>> want it to do, and get us a really fast argus, on the cheap.
>> 
>> First, before we dive into it too deep, how is the performance?  Are
>> you getting bi-directional flows out of this scheme?  Are you seeing
>> all the traffic?  If so, then congratulations!  If the performance
>> is good and you're seeing all the traffic, but you're only getting
>> uni-directional flows, then we may have some work to do, but still
>> congratulations!  If you're not getting all the traffic, then we have
>> some real work to do, as one of the purposes of argus is to monitor all
>> the traffic.
>> 
>> OK, so my understanding is that the PF_RING can do some packet routing to
>> a non-overlapping set of tap interfaces.  Routing is based on some
>> classification scheme, designed to make this usable. The purpose is to
>> provide coarse grain parallelism for packet processing.  The idea, as
>> far as I can tell, is to prevent multiple readers from having to read
>> from the same queue, eliminating the locking issues that kill
>> performance.
>> 
>> So, I'm not sure what you mean by "pulling from the same queue".  If you
>> do have multiple argi reading the same packet, you will end up counting a
>> single packet multiple times.  Not a terrible thing, but not recommended.
>> It's not that you're creating multiple observation domains using this
>> PF_RING technique. You're really splitting a single packet observation
>> domain into a multi-sensor facility ... eventually you will want to
>> combine the total argus output into a single output stream, that
>> represents the single packet observation domain.  At least that is my
>> thinking, and I would recommend that you use radium to connect to all of
>> your argus instances, rather than writing the argus output to a set of
>> files.  Radium will generate a single argus data output stream,
>> representing the argus data from the single observation domain.
>> 
>> The design issue of using the PF_RING function is "how is PF_RING
>> classifying packets to do the routing?".
>> We would like for it to send packets that belong to the same
>> bi-directional flow to the same virtual interface, so argus can do its
>> bi-directional thing.  PF_RING claims that you can provide your own
>> classifier logic, which we can do to make this happen.  We have a pretty
>> fast bidirectional hashing scheme which we can try out.
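
The property wanted here is that both directions of a conversation land on the same queue. One common way to get that is to order the two endpoints before hashing, so that swapping source and destination does not change the key. The sketch below illustrates only that idea; it is not the Argus hashing scheme mentioned above, nor PF_RING's classifier, and the function name, key layout, and use of CRC32 are all assumptions.

```python
import zlib

def queue_for_flow(src_ip, src_port, dst_ip, dst_port, proto, n_queues):
    """Map a packet's 5-tuple to a queue index that is the same for
    both directions of the flow."""
    # Sort the two (address, port) endpoints so A->B and B->A share a key.
    first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{proto}|{first[0]}:{first[1]}|{second[0]}:{second[1]}".encode()
    return zlib.crc32(key) % n_queues
```

A request from 10.0.0.1:51000 to 10.0.0.2:80 and the reply from 10.0.0.2:80 back to 10.0.0.1:51000 produce the same key, so a single argus instance would see both halves of the flow.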
>> 
>> We have a number of people that are using netmap instead of PF_RING.  My
>> understanding is that it also has this same type of feature.  If we can
>> get some people talking about that, that would help a bit.
>> 
>> Carter
>> 
>> 
>> 
>> On Mar 12, 2014, at 1:03 AM, Reynolds, Jeffrey
>> <JReynolds at utdallas.edu> wrote:
>> 
>> Howdy All,
>> 
>> So after forever and a day, I've finally found time to start working on
>> my multi-instanced argus configuration. Here is my setup:
>> 
>> -CentOS 6.5 x64
>> -pfring driver compiled from source
>> -pfring capable Intel NICs (currently using the ixgbe driver version
>> 3.15.1-k) (these NICs are in a bonded configuration under a device named
>> bond0)
>> 
>> I've configured my startup script to start 5 instances of Argus, each
>> with their own /etc/argusX.conf file (argus1.conf, argus2.conf, etc).
>> The start up script correctly assigns the proper pid file to each
>> instance, and everything starts and stops smoothly.  Each instance is
>> writing an output file to /var/argus in the format of argusX.out.  When I
>> first tried running my argus instances, I ran them with a version of
>> PF_RING I had installed from an RPM obtained from the ntop repo.  Things
>> didn't seem to work correctly, so I tried again after I had compiled from
>> source.  After compiling from source, I got the following output in
>> /var/log/messages when I started argus:
>> 
>> Mar 11 17:48:16 argus kernel: No module found in object
>> Mar 11 17:49:16 argus kernel: [PF_RING] Welcome to PF_RING 5.6.3 ($Revision: 7358$)
>> Mar 11 17:49:16 argus kernel: (C) 2004-14 ntop.org
>> Mar 11 17:49:16 argus kernel: [PF_RING] registered /proc/net/pf_ring/
>> Mar 11 17:49:16 argus kernel: NET: Registered protocol family 27
>> Mar 11 17:49:16 argus kernel: [PF_RING] Min # ring slots 4096
>> Mar 11 17:49:16 argus kernel: [PF_RING] Slot version     15
>> Mar 11 17:49:16 argus kernel: [PF_RING] Capture TX       Yes [RX+TX]
>> Mar 11 17:49:16 argus kernel: [PF_RING] Transparent Mode 0
>> Mar 11 17:49:16 argus kernel: [PF_RING] IP Defragment    No
>> Mar 11 17:49:16 argus kernel: [PF_RING] Initialized correctly
>> Mar 11 17:49:35 argus kernel: Bluetooth: Core ver 2.15
>> Mar 11 17:49:35 argus kernel: NET: Registered protocol family 31
>> Mar 11 17:49:35 argus kernel: Bluetooth: HCI device and connection manager initialized
>> Mar 11 17:49:35 argus kernel: Bluetooth: HCI socket layer initialized
>> Mar 11 17:49:35 argus kernel: Netfilter messages via NETLINK v0.30.
>> Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.643243 started
>> Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.693930 started
>> Mar 11 17:49:35 argus kernel: device bond0 entered promiscuous mode
>> Mar 11 17:49:35 argus kernel: device em1 entered promiscuous mode
>> Mar 11 17:49:35 argus kernel: device em2 entered promiscuous mode
>> Mar 11 17:49:35 argus argus[13918]: 11 Mar 14 17:49:35.721490 ArgusGetInterfaceStatus: interface bond0 is up
>> Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.349202 started
>> Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.364625 started
>> Mar 11 17:49:36 argus argus[13922]: 11 Mar 14 17:49:36.383623 ArgusGetInterfaceStatus: interface bond0 is up
>> Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.045224 started
>> Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.060689 started
>> Mar 11 17:49:37 argus argus[13926]: 11 Mar 14 17:49:37.079706 ArgusGetInterfaceStatus: interface bond0 is up
>> Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.753278 started
>> Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.768613 started
>> Mar 11 17:49:37 argus argus[13930]: 11 Mar 14 17:49:37.785691 ArgusGetInterfaceStatus: interface bond0 is up
>> Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.449229 started
>> Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.466365 started
>> Mar 11 17:49:38 argus argus[13934]: 11 Mar 14 17:49:38.485675 ArgusGetInterfaceStatus: interface bond0 is up
>> 
>> Aside from the "No module found in object" error, everything seems like
>> it's working OK.  The only problem is that I don't seem to have my argus
>> instances configured to pull traffic from the same queue.  In other
>> words, I have five output files from five argus instances with like
>> traffic in all of them.  I haven't made any changes to my argus config
>> files, aside from telling them to write to different locations and the
>> name of the interface. I know I'm missing something but I'm not quite
>> sure what it is.  If someone might be able to tell me how to configure
>> these five instances to pull from the same PF_RING queue, I'd be mighty
>> obliged.  Let me know if I need to submit any additional information.
>> 
>> Thanks,
>> 
>> Jeff Reynolds
> 
> 


