argus-3.x request (forwarded)

Carter Bullard carter at qosient.com
Mon Jan 25 12:54:48 EST 2010


Hey Martin,

The problem is that duplicates occur for all protocols, so when there are
duplicates you end up with counter problems for all the flows, etc...
If we're going to fix it, I think we should fix it well.

So, some kernels zero out the ip_id if there aren't fragments, so the ip_id
isn't a reliable pseudo-counter.

The duplicate, as I suggested, isn't always an exact copy (say, if you
port mirror the egress port of the router rather than the ingress port you're
connected to).  The TTLs will have changed, the L2 addresses will be
completely different, etc.
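
To make that concrete, here is a rough sketch (not argus code) of a comparison
key that survives a routed hop.  It assumes untagged Ethernet II framing and
IPv4, and the function name duplicate_key is made up for illustration.  It
drops the L2 header and blanks the TTL and the IP header checksum before
hashing; the ip_id is left in, since a mirrored copy keeps it even when the
kernel has zeroed it.

```python
import hashlib

ETH_HEADER_LEN = 14  # assumption: untagged Ethernet II, no VLAN tag


def duplicate_key(frame: bytes) -> bytes:
    """Digest of an IPv4 packet that ignores L2 headers, TTL and IP checksum."""
    ip = bytearray(frame[ETH_HEADER_LEN:])
    if len(ip) < 20 or ip[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    ip[8] = 0                  # TTL is decremented at every routed hop
    ip[10:12] = b"\x00\x00"    # header checksum is rewritten along with the TTL
    return hashlib.md5(bytes(ip)).digest()
```

Two frames that differ only in MAC addresses, TTL and header checksum map to
the same key, while a packet with a different payload or header field does not.
Short frames padded by the NIC would need trimming to the IP total length
first, which this sketch does not do.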

The real solution is to collect the port mirror on another down interface, rather
than the interface you are using to support real traffic.  I know, not always
possible.

So, with regard to your curious question, you are starting to go down the
IPFIX template path, which I think is a bad thing to do from the perspective
of the sensor.  To reduce the volume of traffic you want to do some
form of data reduction, using index values rather than MAC addresses,
or IP addresses, or 5-tuples, or ....  This generates a lot of complexity for
the sensor, just for the sake of the transport load.  That slows the sensor
down and jeopardizes its ability to keep up with line rate, which is the
opposite of the argus strategy.  If you want to compress the data, I say do
it outside the sensor.  No problem.
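
As an illustration of doing the reduction outside the sensor, here is a
minimal sketch of the lookup-table idea applied to records after they have
been collected.  The record layout (dicts with "smac", "dmac" and "bytes"
keys) and the names IndexTable and compress_records are hypothetical, not an
argus format or API.

```python
class IndexTable:
    """Maps each distinct value to a small integer, in first-seen order."""

    def __init__(self):
        self._index = {}
        self.values = []

    def intern(self, value):
        idx = self._index.get(value)
        if idx is None:
            idx = len(self.values)
            self._index[value] = idx
            self.values.append(value)
        return idx


def compress_records(records):
    """Replace the MAC addresses in each record with table indexes."""
    macs = IndexTable()
    compact = [
        (macs.intern(r["smac"]), macs.intern(r["dmac"]), r["bytes"])
        for r in records
    ]
    return macs.values, compact
```

On a link with only two MAC addresses the table holds two entries and every
record carries two small integers instead, and the sensor never had to
maintain the table.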


Carter

On Jan 25, 2010, at 12:10 PM, Martin Xxxxxx wrote:

>> Hey Martin,
>> You should be using argus-3.0!!!  Or at least be playing with it ;o)
> 
> Yes I know. :-)
> We're upgrading to 64bit "soon" and then we'll upgrade all argus daemons and
> ra*-utilities.
> 
>> Could you resend this response to the argus mailing list, or can I?
> 
> You go ahead and resend the response (or an altered version)!
> (please remove my email address)
> 
>>  1.  Duplicate packets
>>          So this is an interesting problem because there is very little
>>          information in the packet to help you differentiate between
>>          duplicates and real retransmissions.  They aren't actually always
>>          identical packets, as the L2 information or TTLs may be off, as
>>          the mirror may not be on the first hop link/interface.
>>
>>          There are a few ideas.  The best would be to reject packets with
>>          different L2/tunnel id's but identical IP (ignoring the TTL) and
>>          transport identifiers (especially any sequence numbers) that
>>          arrive within one RTT of each other for the flow.  The RTT can be
>>          determined, when possible, and we can come up with reasonable
>>          default values (< 100uSec).
>>
>>          Currently, argus only has cached information from the last
>>          packet seen in either direction.  So if the packet train is
>>          something like this:
>>
>>              1, 1, 2, 2, 3, 3, 4, 4
>>
>>          we can figure it out.  But if it's something like this:
>>
>>              1, 2, 3, 1, 2, 3, 4, 5, 6, 4, 5, 6
>>
>>          then a simple strategy is not going to do it for us.
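
The difference between the two packet trains can be sketched as follows.  This
is only an illustration, not how argus caches packets: packets are stood in
for by their sequence labels, and the function name drop_duplicates and the
window parameter are made up for the example.

```python
from collections import deque


def drop_duplicates(packets, window):
    """Keep a packet unless it matches one of the last `window` kept packets."""
    recent = deque(maxlen=window)
    kept = []
    for pkt in packets:
        if pkt in recent:
            continue  # seen very recently: treat as a mirror duplicate
        recent.append(pkt)
        kept.append(pkt)
    return kept
```

With window=1 (a last-packet cache) the first train reduces to 1, 2, 3, 4 but
the second train passes through untouched; it takes a window of at least 3 to
clean up the second one.  A larger window also raises the odds of discarding
a genuine retransmission, which is why the time bound matters.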
> 
> 99.999% of the time it is case 1, where the duplicates come in order.
> 
> Isn't it easy to spot the difference between a retransmission and a
> duplicate copy?
> A retransmission has to be a TCP/IP packet (so argus need not look at any
> other protocol that might normally send identical packets over and over
> again like some LLC frames).
> A retransmission is a completely new packet with a new IP ID but with the
> same TCP seq/ack+data, while a duplicate is an exact copy of the previous
> packet.
> 
> 
>>          Programs like editcap() attempt to remove duplicates by keeping
>>          an MD5 cksum of the last 4 packets in a cache and rejecting
>>          matches.  This is doable, but it also is simple and would have
>>          some issues.
> 
> Yes, instead of holding a complete copy of the last packet and comparing
> 100% of it with the new one, it might be faster to compare specific fields
> or a computed checksum for the entire packet.
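
A small sketch of the checksum-cache idea combined with a time bound follows.
It is not editcap's or argus's implementation; the class name RecentDigests
and its parameters are hypothetical, the 100 microsecond default is just the
figure floated earlier in this thread, and timestamps are assumed to be
non-decreasing seconds.

```python
import hashlib
from collections import deque


class RecentDigests:
    """Remembers MD5 digests of the last few packets for a short time."""

    def __init__(self, max_age=100e-6, max_entries=4):
        self.max_age = max_age                   # seconds a digest stays valid
        self._seen = deque(maxlen=max_entries)   # (timestamp, digest) pairs

    def is_duplicate(self, timestamp, packet):
        digest = hashlib.md5(packet).digest()
        # Expire entries older than max_age (timestamps assumed non-decreasing).
        while self._seen and timestamp - self._seen[0][0] > self.max_age:
            self._seen.popleft()
        dup = any(d == digest for _, d in self._seen)
        if not dup:
            self._seen.append((timestamp, digest))
        return dup
```

A second copy arriving 50 microseconds after the first is flagged, while the
same bytes arriving a millisecond later are treated as a new packet, so a real
retransmission is not swallowed.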
> 
>>          The trick will be to inspect a bunch of packet files that
>>          capture this situation and check to see how best to identify the
>>          duplicates.  I can start working on this problem now, if we have
>>          the files.
> 
> I'll see if I can generate a couple of pcap-files for you tomorrow.
> 
> 
>>  2.  DNS transaction capture
>>          You can do this today with argus.  A user data buffer capture of
>>          256 will capture all the data needed to do what you want with
>>          DNS, and the program radump() will print out all the DNS
>>          information you need to do the tracking.  The problem is that
>>          with this strategy you capture 256 bytes of every flow, and that
>>          may be an issue for some sites.
> 
> Ah, yes that would fill the HDDs quite quickly.
> 
> ...or wait, did you say flow?  Will only 256 bytes (from the first few
> packets) be logged? Doh! I assumed it acted like 'tcpdump -s 256'.
> 
>>          Item #3 of work items for 2010 mentions control plane flow
>>          monitoring.  DNS is THE internet Call Control protocol.  (See
>>          slides 35 and 36 of the FloCon argus 1/2 day tutorial,
>>          Introduction to Argus.)
>>
>>          So, we'll have specific support for DNS tracking in
>>          argus-3.0.4.  What this really means is that we will capture all
>>          the payload data in the control plane flows, DNS included (also
>>          DHCP, ARP, STP, RIP, OSPF, ISIS, BGP, SIP, RSVP), so you don't
>>          have to grab 256 bytes of every transaction just to get what you
>>          want from the control plane flows.
> 
> Interesting!
> Curious question:
> How do you store repeated information? Example:
> If you have a link network with only an ISP router and a customer firewall,
> the sniffer will only see two physical MAC addresses. Then there's no point
> in saving the src and dst MAC for every single flow; rather, have a
> lookup table of MAC addresses to which every flow can have a pointer.
> I believe all protocols you list above carry highly repetitive information.
> 
>> With regard to your perfect world, I agree, and the approach is that those
>> jobs (correlation between flow information elements) are the jobs of argus
>> clients and information systems.  If argus is doing a good job, then the
>> data is captured, but external programs are needed to track this
>> information.  I do this with DHCP data, but where the IP address to user
>> mappings come from is usually an information system outside the
>> observation domain of argus.
> 
> Yepp, I do the correlation externally too.
> 
> 
> Thanks for a great tool and for your quick response!
> 
> /Martin
> 
> 
>> On Jan 25, 2010, at 9:01 AM, Martin Xxxxxx wrote:
>> 
>>> 
>>>> Subject: [ARGUS] flocon 2010 presentations on the web
>>>> From: Carter Bullard <carter at qosient.com>
>>>> To: Argus <argus-info at lists.andrew.cmu.edu>
>>>> Date: Fri, 22 Jan 2010 14:00:43 -0500
>>>> 
>>>> Gentle people,
>>>> I've updated the argus home page and I've put a list of what I was
>>>> going to do for version 3.0.4.  If you have any ideas, I'd love to
>>>> include them!!!
>>> 
>>> Hi Carter!
>>> 
>>> Two things I've been missing in my argus data:
>>> 
>>> 1.
>>> You already have:
>>>            s      -  Src TCP packet retransmissions
>>>            d      -  Dst TCP packet retransmissions
>>>            *      -  Both Src and Dst TCP retransmissions
>>> 
>>> I would like argus to distinguish between retransmissions and duplicate
>>> copies of a frame.
>>>
>>> Why, you ask?
>>> Well, because it is very common that customers set up faulty SPAN
>>> mirroring, so the sensor (i.e. argus) receives two identical copies of a
>>> frame.
>>> (In HP ProCurve switches, it is even "common" to have one copy of packets
>>> in one direction but two copies in the other...)
>>>
>>> The problem is how the switches deal with "in", "out", "both" mirroring
>>> and VLAN mirroring (as opposed to port mirroring).
>>>
>>>
>>> Right now the unwanted extra copies register as "retransmissions" even
>>> though no TCP retransmission has occurred.
>>>
>>> I would like Argus to be able to distinguish between the two scenarios
>>> so it doesn't give false retransmission statistics and to help me spot
>>> customers with a faulty SPAN setup.
>>> 
>>> 
>>> 
>>> 2.
>>> I would like argus to store all DNS requests and/or responses
>>> (configurable).
>>> This way I would have a database of requested hostnames which can be
>>> used to:
>>> * match lookups against a database of known bad hostnames/strings
>>> * afterwards be able to figure out the actual hostname of a web server
>>>   without the payload from the GET request header (the "Host:" line).
>>>
>>>
>>> (I currently use Argus 2.x, so if any of the above is already invented,
>>> I'm sorry to have wasted your time with this email :-)   )
>>> 
>>> 
>>> /Martin
>>> 
>>> 
>>> 
>>> PS.
>>> In a perfect world, I would like argus to be able to keep state of the
>>> "identity" behind IPs, i.e. argus should know how to decode specific
>>> protocols and look for data that might identify this IP (apart from the
>>> current IP and MAC).
>>> Example:
>>> From Windows NetBIOS packets you can get the hostname and MAC address of
>>> an IP (you get the MAC even if the sensor is not located on the same
>>> segment as the client).
>>> From NetBIOS/SMB packets you can get usernames; this is usually nice
>>> information to have when trying to determine "who did the p2p file
>>> transfer" or whatever.
>>> From DNS responses you can get hostnames for IPs.
>>> From DHCP/bootp you can get hostnames for IPs.
>>>
>>> Apart from the vast work of developing all the protocol decoding needed,
>>> you would also need a smart way to store changes in identification and,
>>> even harder, methods to query this information based on time.
>>> 
>> 
>> 
> 
