Couple things...

Carter Bullard carter at qosient.com
Sat Aug 10 10:59:27 EDT 2013


Hey Craig,
In the packet file you sent me, there is evidence of sizable packet loss from the wire to argus.  The biggest indicator is the occurrence of the " g " indicator in primitive argus data.  The " g "  indicates gaps, meaning that packets were missing: not retransmitted, not dropped, just missing.  Seems in your file about 20-30% of the TCP flows had gaps, if memory serves.  That's significant.  You can print the gap value to see how many gaps argus saw.  Should be a byte value.  If you see a bunch of " g " in your flgs field, you've got a packet capture problem.
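
A rough way to put a number on this, as a sketch only: export the flow records as comma-separated text and count the share of TCP records whose Flgs value contains a 'g'. This assumes the first line is a header row with Flgs and Proto columns; the sample rows and the name gap_fraction are made up for illustration and are not part of the argus tools.

```python
import csv
import io

def gap_fraction(csv_text, proto="tcp"):
    """Return the fraction of `proto` records whose Flgs field contains 'g'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    total = with_gaps = 0
    for row in reader:
        if row["Proto"].strip() != proto:
            continue
        total += 1
        if "g" in row["Flgs"]:
            with_gaps += 1
    return with_gaps / total if total else 0.0

# Made-up sample rows, not taken from the capture discussed in this thread.
sample = """StartTime,Flgs,Proto,SrcAddr,DstAddr
20:00:01.000000, e g      ,tcp,10.0.0.1,10.0.0.2
20:00:02.000000, e        ,tcp,10.0.0.1,10.0.0.3
20:00:03.000000, e        ,udp,10.0.0.4,10.0.0.5
20:00:04.000000, e g      ,tcp,10.0.0.6,10.0.0.7
20:00:05.000000, e        ,tcp,10.0.0.8,10.0.0.9
"""

print(f"{gap_fraction(sample):.0%} of TCP flows show gaps")
```

A share anywhere near the 20-30% mentioned above would point at the capture path rather than at argus itself.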

The packet acquisition system can fail at any point along its path.
It can be pretty complicated, like pf_ring screwing up, egress interface overruns, hardware and interrupt problems, etc.  But you aren't going that fast, so the gigamon should make a huge improvement over the 6500's.  Even 6590's have big port mirroring issues.  You don't notice it until you put a sensitive sensor on the end of the packet path.

Carter


On Aug 10, 2013, at 10:21 AM, Carter Bullard <carter at qosient.com> wrote:

> I'm not a fan of VLAN spanning, I like physical interface port mirroring or wire tapping, as that gives you the complete set of data to do all things that you need to do for ops, performance, and security.  The gigamon approach will do much better.  You want to see the encapsulations on the wire, you want to see the vlan tags, you need to see what is hitting that switch/router/host interface if you want to solve problems.
> 
> You don't want to dedup flows...That is a pretty big no-no from the argus perspective.  We are all about comprehensive monitoring.  If you toss things out, you're not comprehensive.  The reason you want to see everything, is so you can realize when things break/change, that is the operational side, but more important on the security side, you don't want to give an intruder a space to operate in.  If you throw stuff away, one can hide actions in the discarded space.  Especially non-IP traffic on the wire, if you are not monitoring Layer 2 non-IP flows, you're in big trouble of missing how you're getting trashed !!!
> 
> Carter
> 
> 
> On Aug 9, 2013, at 11:34 PM, Craig Merchant <cmerchant at responsys.com> wrote:
> 
>> Hey, Carter…  So, I sent your email to the NetOps team.  Apparently the duplicate packet problem is well known when you use VLANs as the source of traffic for a SPAN port:
>>  
>> http://blogs.cisco.com/security/span-packet-duplication-problem-and-solution/
>>  
>> I searched over the last 30 million of our flow data events (from racluster) looking for TCP flags with “M” in it.  Not a one.  So, I can’t explain why you saw that in the tcpdump file I sent you, but it doesn’t appear to be a pervasive problem.
>>  
>> We’re going to contact our reps at Gigamon and see whether their products’ flow dedup features might  help. 
>>  
>> If Argus is seeing duplicate flows, shouldn’t it see duplicate SYN and SYN/ACKs?  And if so, shouldn’t it be able to figure out the direction?  I’m still trying to figure out if the problem is a sampling issue with a really busy SPAN port or if it’s something else…
>>  
>> We’re going to try dropping half the VLANs to reduce the traffic volume and see if that has any impact.
>>  
>> Thanks. 
>>  
>> Craig
>>  
>> From: Carter Bullard [mailto:carter at qosient.com] 
>> Sent: Thursday, August 08, 2013 11:00 AM
>> To: Craig Merchant
>> Cc: Argus (argus-info at lists.andrew.cmu.edu)
>> Subject: Re: [ARGUS] Couple things...
>>  
>> Hey Craig,
>> I've been looking at one of your pcap files, and you've got a lot of weird
>> stuff going on in your network.  You, like a few on the list, have an
>> observation domain that sees many packets twice.  While some say
>> these are " duplicates ", they are distinct packets on the wire, and seeing
>> them twice is really an artifact of either how you're spanning the packets,
>> or how you've set up your network.
>>  
>> Possibly you are spanning multiple interfaces to the same argus , and
>> the packet traverses both of them ??   Or the packet actually traverses
>> the same physical link twice, but in different overlays or VPNs ???
>>  
>> Argus can be configured to make these multiple flows distinct, rather than
>> having the packets aggregated into a single  5-tuple flow record.  You can
>> do that by adding the mac addresses to the flow key, or the VLAN tags
>> to the flow key.
>>  
>> I can suggest that you add the LAYER_2 information to argus's flow keys,
>> to see if you don't get a bit better data.   In your argus.conf file:
>>  
>>    ARGUS_FLOW_KEY="CLASSIC_5_TUPLE+LAYER_2"
>>  
>> You will need to check the output, so that you can see what is going on.
>> Post processing of these flows, especially aggregation, will need to account
>> for the ethernet addresses (by adding the smac and dmac to the aggregation
>> keys), with calls such as:
>>  
>>    racluster -m smac dmac saddr daddr proto sport dport 
>>  
>> when you want to do default aggregation.
>>  
>> Here is a sample of one of your pings, between two of your hosts, with the old and new flow key definitions.
>> I've modified the network and ethernet addresses to protect the innocent.
>>  
>> Standard default 5-tuple flow key
>>  
>> ra -r argus.10*old.out -s stime dur flgs smac dmac proto saddr dir daddr spkts dpkts state - icmp 
>>       StartTime        Dur      Flgs             SrcMac             DstMac  Proto       SrcAddr   Dir     DstAddr  SrcPkts  DstPkts State 
>> 20:00:22.187940   0.005851  M         00:30:48:aa:bb:cc  00:1e:f7:xx:yy:zz   icmp   10.30.80.41   <->  10.20.2.26        1        2   ECO
>>  
>>  
>> Standard default 5-tuple flow key with layer 2 identifiers added
>>  
>> ra -r argus.10*new.out -s stime dur flgs smac dmac proto saddr dir daddr spkts dpkts state - icmp 
>>       StartTime        Dur      Flgs             SrcMac             DstMac  Proto       SrcAddr   Dir     DstAddr  SrcPkts  DstPkts State 
>> 20:00:22.187940   0.005851  e         00:30:48:aa:bb:cc  00:1e:f7:xx:yy:zz   icmp   10.30.80.41   <->  10.20.2.26        1        1   ECO
>> 20:00:22.193790   0.000000  e         00:1e:f7:xx:yy:zz  d4:8c:b5:cc:dd:ee   icmp   10.30.80.41   <-   10.20.2.26        0        1   ECR
>>  
>>  
>> As you can see argus, with just the standard 5-tuple flow key, thinks there are 3 packets in the
>> ping volley, one ping request and 2 ping replies.  With the LAYER_2 id's added to the flow key,
>> we see that one of the echo replies was also transmitted to another ethernet address ???
>> The 'M' in the flgs field of the 5-tuple flow record, indicates that there were 'M'ultiple mac addresses
>> seen for the bi-directional flow.  You don't see that in the new flow key strategy.
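
The effect of the two flow keys can be illustrated with a small sketch. It is a simplification and not argus's actual key structure: the packets and addresses are made up, ICMP has no ports so the IP-level key here is just the protocol plus the unordered address pair, and the function names are hypothetical.

```python
from collections import Counter

# Made-up packets: (src_ip, dst_ip, proto, src_mac, dst_mac)
packets = [
    ("10.0.0.1", "10.0.0.2", "icmp", "aa:aa:aa:00:00:01", "bb:bb:bb:00:00:02"),  # echo request
    ("10.0.0.2", "10.0.0.1", "icmp", "bb:bb:bb:00:00:02", "aa:aa:aa:00:00:01"),  # echo reply
    ("10.0.0.2", "10.0.0.1", "icmp", "bb:bb:bb:00:00:02", "cc:cc:cc:00:00:03"),  # same reply, other dst mac
]

def ip_key(pkt):
    """Bidirectional IP-level key: protocol plus the unordered address pair."""
    src, dst, proto, _smac, _dmac = pkt
    return (proto, frozenset((src, dst)))

def ip_l2_key(pkt):
    """Same key with the unordered MAC pair appended."""
    _src, _dst, _proto, smac, dmac = pkt
    return ip_key(pkt) + (frozenset((smac, dmac)),)

by_ip = Counter(ip_key(p) for p in packets)
by_ip_l2 = Counter(ip_l2_key(p) for p in packets)

print("IP-only key:", len(by_ip), "flow(s), packet counts", sorted(by_ip.values()))
print("IP + L2 key:", len(by_ip_l2), "flow(s), packet counts", sorted(by_ip_l2.values()))
```

With the IP-only key all three packets land in one record; with the MAC pair in the key the extra reply separates into its own record, which mirrors the old and new ping output shown above.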
>>  
>> I don't see a trend, but you do have a lot of asymmetry in how the packets are duplicated.
>> Take a look at this new flow data, print out the smac and dmac, and see if you can figure it out.
>>  
>> Hope all is most excellent,
>>  
>>  
>> Carter
>>  
>>  
>> On Aug 8, 2013, at 10:23 AM, Carter Bullard <carter at qosient.com> wrote:
>> 
>> 
>> Done some testing with your argus.conf file.
>>  
>> Can't find any argus faults using it with your packet files,
>> on my machines, but there is one issue with the configuration.
>>  
>> Your ARGUS_MONITOR_ID is inappropriate.  You only get 32-bits
>> for a source id, so the max string you can use is 4 characters
>> long.  We'll cut it to 4 chars, so I don't think that this
>> will cause problems, but it is incorrect.
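
The arithmetic behind the 4-character limit, as a quick sketch: 32 bits is 4 bytes, so an ASCII string can contribute at most 4 characters. Keeping the first four characters is an assumption made here for illustration; the message above only says the value gets cut to 4 chars.

```python
monitor_id = "ids01-dc1"            # value from the argus.conf quoted in this thread
raw = monitor_id.encode("ascii")

bits_needed = len(raw) * 8          # one byte per ASCII character
fits = bits_needed <= 32

# Assumption: truncation keeps the leading 4 bytes.
truncated = raw[:4].decode("ascii")

print(f"{monitor_id!r} needs {bits_needed} bits, fits in 32: {fits}")
print(f"4-character form: {truncated!r}")
```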
>>  
>> Carter 
>> 
>>  
>> On Aug 7, 2013, at 1:40 PM, Craig Merchant <cmerchant at responsys.com> wrote:
>> 
>> 
>> That worked!  Thanks, David.  Not sure what in my argus.conf could be causing the problem.  Here it is if you’re curious:
>>  
>> ARGUS_FLOW_TYPE="Bidirectional"
>> ARGUS_FLOW_KEY="CLASSIC_5_TUPLE"
>> ARGUS_DAEMON=no
>> ARGUS_MONITOR_ID="ids01-dc1"
>> ARGUS_ACCESS_PORT=561
>> ARGUS_BIND_IP="10.10.10.10"
>> ARGUS_INTERFACE=dnacluster:10@28
>> ARGUS_GO_PROMISCUOUS=no
>> ARGUS_SET_PID=yes
>> ARGUS_PID_PATH="/var/run"
>> ARGUS_FLOW_STATUS_INTERVAL=5
>> ARGUS_IP_TIMEOUT=900
>> ARGUS_TCP_TIMEOUT=1800
>> ARGUS_GENERATE_RESPONSE_TIME_DATA=yes
>> ARGUS_GENERATE_APPBYTE_METRIC=yes
>> ARGUS_GENERATE_TCP_PERF_METRIC=yes
>> ARGUS_GENERATE_BIDIRECTIONAL_TIMESTAMPS=yes
>> ARGUS_CAPTURE_DATA_LEN=10
>> ARGUS_SELF_SYNCHRONIZE=yes
>> ARGUS_KEYSTROKE="yes"
>>  
>> From: David Edelman [mailto:dedelman at iname.com] 
>> Sent: Tuesday, August 06, 2013 8:42 PM
>> To: Craig Merchant; Carter Bullard
>> Cc: Argus (argus-info at lists.andrew.cmu.edu)
>> Subject: Re: [ARGUS] Couple things...
>>  
>> Craig,
>>  
>> Just in case you are running into something odd in the argus.conf file, I suggest that you add -X as the very first argument to the invocation of argus. I suggest something very simple like:
>>  
>> # /usr/local/bin/argus -X -r somefile.pcap -w /tmp/somefile.argus 
>>  
>> If that works (and /tmp is almost always a good place to write the output because it avoids permission problems) then use racount on the /tmp/somefile.argus to make sure that everything is as expected and let us know what happened.
>>  
>> --Dave
>>  
>>  
>> From: Craig Merchant <cmerchant at responsys.com>
>> Date: Tuesday, August 6, 2013 11:28 PM
>> To: Carter Bullard <carter at qosient.com>
>> Cc: Argus <argus-info at lists.andrew.cmu.edu>
>> Subject: Re: [ARGUS] Couple things...
>>  
>> I don’t know what to tell you.  If you want me to run that trace tool and send you the output, let me know where to get it and I’ll figure it out.
>>  
>> Did you take a look at the pcap file to see if there were a lot of missing SYN/SYNACK packets? 
>>  
>> Thanks.
>> 
>> Craig
>>  
>> From: Carter Bullard [mailto:carter at qosient.com] 
>> Sent: Tuesday, August 06, 2013 10:02 AM
>> To: Craig Merchant
>> Cc: Argus (argus-info at lists.andrew.cmu.edu)
>> Subject: Re: [ARGUS] Couple things...
>>  
>> Hey Craig,
>> I'm not having any problems reading your tcpdump.pcap file
>> with my version of argus, so I can't reproduce a fault.
>>  
>> % thoth:Data carter$ argus -r tcpdump*pcap -w - | racount
>> racount   records     total_pkts     src_pkts       dst_pkts       total_bytes        src_bytes          dst_bytes
>>     sum   402665      9999999        5205934        4794065        4795152829         2664296730         2130856099 
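
A simple consistency check on a racount summary like the one above, sketched under the assumption that the output is a header line followed by one "sum" line with whitespace-separated columns: the source and destination counters should add up to the totals.

```python
# The two racount lines quoted above, pasted in as text.
racount_output = """\
racount   records     total_pkts     src_pkts       dst_pkts       total_bytes        src_bytes          dst_bytes
    sum   402665      9999999        5205934        4794065        4795152829         2664296730         2130856099
"""

header, values = (line.split() for line in racount_output.strip().splitlines())

# Skip the first column ("racount" / "sum") and map column name -> integer.
stats = dict(zip(header[1:], map(int, values[1:])))

pkts_ok = stats["src_pkts"] + stats["dst_pkts"] == stats["total_pkts"]
bytes_ok = stats["src_bytes"] + stats["dst_bytes"] == stats["total_bytes"]

print("packet totals consistent:", pkts_ok)
print("byte totals consistent:  ", bytes_ok)
```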
>>  
>> Is there a specific feature or command line option that generates
>> your problem?
>>  
>> Carter
>> 
>>  
>> On Aug 3, 2013, at 2:23 PM, Carter Bullard <carter at qosient.com> wrote:
>> 
>> 
>> 
>> 
>> OK, with the pcap we'll figure it out.
>>  
>> So the ssh keystroke algorithm is round trip sensitive, and it's tuned for the enterprise border viewing, but there are a lot of knobs that can be turned.  The real trick is having, again, a packet file of a session so we can see what the algorithm is doing.
>>  
>> Grab a few and we can go over it packet for packet.
>>  
>> Carter
>> 
>> Carter Bullard, QoSient, LLC
>> 150 E. 57th Street Suite 12D
>> New York, New York 10022
>> +1 212 588-9133 Phone
>> +1 212 588-9134 Fax
>> 
>> On Aug 2, 2013, at 3:06 PM, Craig Merchant <cmerchant at responsys.com> wrote:
>> 
>> I don’t know what to tell you, Carter.  The version of 3.0.7.4 that I’m running has the same MD5 sum as the latest in qosient.com/dev…
>>  
>> I’ve uploaded the pcap file I’m trying to convert to your FTP server. 
>>  
>> I’ve attached the debug file, but after further testing I think it’s an algorithm configuration issue.  I’ve tried testing normal and reverse keystroke detection between hosts that were in the same data center and dnstroke and snstroke always show up as “0,0” or “,,” (the latter happens more when there are directional issues).  But if I watch a host that I ssh into over the VPN from my home connection, Argus detects keystrokes. 
>>  
>> I’ve tried reading through the academic paper you guys published on the keystroke detection and it’s beyond me.  If it works for a slower network connection and not a faster network connection (or maybe I should say lower/higher latency connection), which configuration options should I experiment with to find the right balance?
>>  
>> Thanks.
>> 
>> Craig
>>  
>>  
>>  
>>  
>> From: Carter Bullard [mailto:carter at qosient.com] 
>> Sent: Friday, August 02, 2013 8:37 AM
>> To: Craig Merchant
>> Cc: Argus (argus-info at lists.andrew.cmu.edu)
>> Subject: Re: [ARGUS] Couple things...
>>  
>> Hey Craig,
>> Was in Calif all last week, and just now catching up.
>>  
>> I really think the argus crashing issue is fixed.  At least
>> it works with all data that has been uploaded.  But if you have
>> packet data that is blowing argus up, can you send ???
>>  
>> There is a possibility that you may not have the most recent
>> version of argus-3.0.7.4.  I sometimes put up new software
>> without changing the number, like if I make a mistake and
>> put up the wrong version.  So, there could be a race condition.
>> Check the md5 or date times, or just grab again, if there is
>> any doubt.
>>  
>> You have to turn on keystroke detection, so, don't comment out
>> the ARGUS_KEYSTROKE="yes" line.  The CONF line you can comment
>> out.
>>  
>> To troubleshoot the keystroke algorithm, with argus running, but
>> not as a daemon, you can send a USR1 signal to it,
>>  
>>    # kill -USR1 argus.pid
>>  
>> and it will print out stats that include the keystroke algorithm
>> configuration, if it's turned on. When you send a USR1 signal to
>> argus, you increment the Debug flag setting for all of argus, and
>> so you should start getting debug messages, if the debug facility
>> is compiled in. Send another USR1 and you'll increase the debug
>> information.  Most of the per packet keystroke debugging is at
>> debug level 5. 
>>  
>> Send a USR2 signal to argus ( # kill -USR2 argus.pid ) to turn
>> debug reporting off.
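
The same signalling can be scripted. The sketch below assumes argus has written its process ID to a file (the thread's argus.conf sets ARGUS_SET_PID=yes with ARGUS_PID_PATH="/var/run", but the file name is not given here, so any path passed in is the caller's assumption). The function names are hypothetical, and this only works on Unix-like systems.

```python
import os
import signal

def signal_argus(pid_file, sig):
    """Send `sig` to the process whose PID is stored in `pid_file`."""
    with open(pid_file) as fh:
        pid = int(fh.read().strip())
    os.kill(pid, sig)
    return pid

def raise_debug(pid_file):
    """One USR1 per call: each call bumps the debug level by one."""
    return signal_argus(pid_file, signal.SIGUSR1)

def debug_off(pid_file):
    """USR2 turns debug reporting off."""
    return signal_argus(pid_file, signal.SIGUSR2)
```

Calling raise_debug five times in a row would correspond to the debug level 5 mentioned above for per-packet keystroke messages, provided the debug facility is compiled in.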
>>  
>> Carter
>>  
>>  
>> On Aug 1, 2013, at 7:02 PM, Craig Merchant <cmerchant at responsys.com> wrote:
>> 
>> 
>> 
>> 
>> 
>> Hey, Carter…
>>  
>> I just wanted to check in and see if you need anything else from me on the labeling issue or argus crashing when trying to convert a pcap file.  Let me know…
>>  
>> I’m also having some issues with keystroke detection with the latest release.  The following command used to work in my testing:
>>  
>> /usr/local/bin/ra -S 10.10.10.10:561 -n -u -c "," -s "+0dnstroke,+1snstroke" - host 10.1.1.1 and host 10.1.1.2
>>  
>> I tried both a normal and reverse SSH session between the two hosts and neither one registered keyboard strokes of varying speeds and intensity.
>>  
>> All I’ve done is commented out the defaults in argus.conf:
>>  
>> ARGUS_KEYSTROKE="yes"
>> ARGUS_KEYSTROKE_CONF="GPC_MAX=4"
>>  
>> I performed pretty much the same testing a couple months ago and got plenty of flows where keystrokes were detected.  Please let me know what you’d recommend for troubleshooting that.
>>  
>> Thanks.
>> 
>> Craig
>>  
>> <debug.zip>
>>  
>>  
>>  