Argus Clients ICMP Packet Printer
Carter Bullard
carter at qosient.com
Fri Feb 10 12:08:45 EST 2012
Hmmmmm, that shouldn't have changed, but you never know.
I'll put it back.
Carter
On Feb 9, 2012, at 2:40 PM, Dave Edelman wrote:
> Carter,
>
> I just installed the 3.0.5.31 clients and there seems to be a difference in the way the ICMP type and reason codes are represented. Originally they were shown as four hex characters; now they seem to be the decimal equivalent. Is there a switch that I missed (or one that I should have missed)?
>
> --Dave
>
> From: argus-info-bounces+dedelman=iname.com at lists.andrew.cmu.edu [mailto:argus-info-bounces+dedelman=iname.com at lists.andrew.cmu.edu] On Behalf Of Carter Bullard
> Sent: Tuesday, February 07, 2012 1:27 PM
> To: Marco
> Cc: argus-info at lists.andrew.cmu.edu
> Subject: Re: [ARGUS] Huge argus files and racluster
>
> Hey Marco,
> Argus is very good at not over- or under-counting packets, so don't worry about
> the aggregation model and how it affects accuracy; that has been worked over
> very well.
>
> Since you are interested in making sense of it all, you should run racount.1 first.
>
> racount -r files -M proto addr
>
> You should be doing some very large aggregations, such as:
>
> racluster -m matrix/16 -r files -s stime dur saddr daddr pkts bytes - ip
>
> This will show you which CIDR /16 networks are talking to whom.
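The key-masking idea behind `-m matrix/16` can be sketched in Python. This is only a conceptual, simplified model of what that aggregation reports (it treats the pair directionally, while argus's matrix object is more subtle), and the flow tuples are made-up sample data, not argus output:

```python
import ipaddress
from collections import defaultdict

def matrix16(flows):
    """Sum packet/byte counts per /16 network pair: a simplified,
    directional model of `racluster -m matrix/16`, not argus code."""
    totals = defaultdict(lambda: {"pkts": 0, "bytes": 0})
    for saddr, daddr, pkts, nbytes in flows:
        # mask each address down to its /16 network
        s = str(ipaddress.ip_network(saddr + "/16", strict=False).network_address)
        d = str(ipaddress.ip_network(daddr + "/16", strict=False).network_address)
        totals[(s, d)]["pkts"] += pkts
        totals[(s, d)]["bytes"] += nbytes
    return dict(totals)

# made-up sample flows: (saddr, daddr, pkts, bytes)
sample = [
    ("10.1.2.3",   "192.168.5.9", 10, 1200),
    ("10.1.9.8",   "192.168.7.1",  5,  600),
    ("172.16.0.1", "10.1.2.3",     2,  100),
]
result = matrix16(sample)
```

The first two flows collapse into one 10.1.0.0 -> 192.168.0.0 entry, which is exactly the "which networks are talking to whom" view.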
>
> If you want to know the list of IP addresses that are active:
>
> racluster -M rmon -m saddr -r files -w addrs.out - ip
>
> Then you can aggregate for the networks, or the countries or whatever:
> racluster -r addrs.out -m saddr/24 -s stime dur saddr spkts dpkts sbytes dbytes - ip
>
> If you want to aggregate based on the country code, you need to use ralabel to set the
> country codes. Check out 'man ralabel' and 'man 5 ralabel' to see how to do that, and you can
> do that with the IP address file you created above:
>
> ralabel -f ralabel.country.code.conf -r addrs.out -w - | racluster -m sco -w - | \
> rasort -m sco -v -s stime dur sco spkts dpkts sbytes dbytes
>
>
> There are the perl scripts:
>
> rahosts -r files
> raports -r files
>
> These are pretty informative, and will serve you well.
> That should get you started.
>
> Carter
>
>
> On Feb 7, 2012, at 11:38 AM, Marco wrote:
>
>
> Thanks for the detailed answer. I suppose a bit more of background on
> what I'm trying to do is in order here. Basically, I've been handed
> that 50GB pcap monster and been told to "make sense of it".
> Essentially, it contains all the traffic to and from the Internet seen
> on a particular LAN.
> "making sense of it" basically means, in simple terms, finding out:
>
> - global bandwidth usage (incoming, outgoing)
> - bandwidth usage by protocol (http, smtp, dns, etc.), again incoming
> and outgoing
> - traffic between specific source/destination hosts (possibly
> including detailed protocol usage within that specific traffic)
>
> Ideally, I'd like to graph some or all of that information, but for
> now I'm ok with running some command line query using racluster/rasort
> to get textual tabular output.
>
> So, based on what I read, the first thing I was doing was trying to
> summarize the pcap data into an argus file to use as a starting point,
> and that file should ideally include exactly one entry per flow (where
> flow==saddr daddr proto sport dport), because otherwise (if I
> understand correctly) packets, bytes, etc. belonging to a specific
> flow would be counted multiple times, which is not what I want (it's
> entirely possible that I'm misunderstanding how argus works though).
> Note that I'm mostly interested in aggregated numbers here rather than
> detailed flow analysis. For example: I'd like to get all flows where
> the protocol is TCP and dport is 80, then obtain aggregated sbytes and
> dbytes for all those flows. Same for other well-known destination
> ports.
>
> As it's probably clear by now, I'm a novice to argus, so any help
> would be appreciated (including pointers to examples or other material
> to study). Thanks for your help.
>
> 2012/2/7 Carter Bullard <carter at qosient.com>:
>
> Hey Marco,
> Regardless of what time range you work with, there will always be
> a flow that extends beyond that range. You have to figure out what
> you are trying to say with the data to decide if you need to count
> every connection only once.
>
> If 5-, 10-, or 15-minute files aren't attractive, racluster.1 provides you
> configuration options so you can efficiently track long term flows, but
> it is based on finding an effective idle timeout that will make persistent
> tracking work for your memory limits. See racluster.5. Most flows are
> finished in less than a second, and so keeping all of those flows in memory
> is a waste. Figuring out a good idle timeout strategy, however, is an art.
>
> By default, racluster's idle timeout is "infinite" and so it holds each flow in
> memory until the end of processing. If you decide that 600 seconds
> of idle time is sufficient to decide that the flow is done (120 works for
> most, except Windows boxes, which can send TCP Resets for
> connections that have been closed for well over 300 seconds), then
> a simple racluster.conf file of:
>
> racluster.conf
> filter="" model="saddr daddr proto sport dport" status=0 idle=600
>
> may keep you from running out of memory. If a flow hasn't seen any
> activity in 600 seconds, racluster.1 will report the flow and release
> its memory.
>
> racluster -f racluster.conf -r your.files -w single.output.file
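The flush-on-idle behaviour described above can be sketched in Python. This is a conceptual model of a flow cache with an idle timeout, not racluster's actual implementation; the `(timestamp, flowkey)` packet tuples and the single packet counter are simplifications:

```python
IDLE = 600  # seconds; plays the same role as idle=600 in the conf line above

def cluster(packets, idle=IDLE):
    """Aggregate (timestamp, flowkey) packets into flow records,
    reporting and releasing any flow idle longer than `idle` seconds."""
    cache = {}  # flowkey -> [first_seen, last_seen, pkt_count]
    done = []
    for ts, key in sorted(packets):
        # report and release any flow that has gone idle
        for k in [k for k, v in cache.items() if ts - v[1] > idle]:
            done.append((k, cache.pop(k)))
        if key in cache:
            cache[key][1] = ts
            cache[key][2] += 1
        else:
            cache[key] = [ts, ts, 1]
    done.extend(cache.items())  # end of processing: flush what's left
    return done

# two bursts on the same key, 699 seconds apart, come out as two flows
flows = cluster([(0, "A"), (1, "A"), (700, "A")])
```

The point of the sketch is the memory behaviour: the cache only ever holds flows seen within the idle window, while everything older has already been reported, which is why a finite idle timeout keeps you from running out of memory.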
>
> Improving on the aggregation model would include protocol and port
> specific idle time strategies, such as:
>
> racluster.better.conf
> filter="udp and port domain" model="saddr daddr proto sport dport" status=0 idle=10
> filter="udp" model="saddr daddr proto sport dport" status=0 idle=60
> filter="" model="saddr daddr proto sport dport" status=0 idle=600
>
> The output data stream of this type of processing will be semi-sorted
> in last-time-seen order rather than start-time order, so that may be a
> consideration for you. Sorting is currently a memory hog, so don't
> expect to sort these records after you generate the single output file
> without some strategy, like using rasplit.1.
>
> Using state, such as TCP closing state to declare that a flow is done, is
> an attractive approach, but it has huge problems, and I don't recommend it.
>
> rasqlinsert.1 is the tool of choice if you really would like to have 1 flow
> record per flow, and you're running out of resources.
>
> Using argus-clients-3.0.5.31 from the developers thread of code,
> use rasqlinsert.1 with the caching option.
>
> rasqlinsert -M cache -r your.files -w mysql://user@localhost/db/raOutfile
>
> This causes rasqlinsert.1 to use a database table as its flow cache.
> It's pretty efficient, so it's not going to do a database transaction per
> record when there is aggregation, so you do get some wins.
> When it's finished processing, create your single file with:
>
> rasql -r mysql://user@localhost/db/raOutfile -w single.output.file
>
>
> There are problems with any approach that aggregates over long periods of
> time, because systems reuse the 5-tuple flow attributes that make
> up a flow key much faster than you would think. This results in many situations
> where multiple independent sessions will be reported as a single very
> long lived flow. This is particularly evident with DNS, where if you aggregate
> over months, you find that you get fewer and fewer DNS transactions (they
> tend to approach somewhere around 32K) between host and server, and
> instead of lasting around 0.025 seconds, they seem to last for months.
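One way to picture the 5-tuple reuse problem is that a single aggregated "flow" is really several independent sessions separated by long idle gaps. A small illustrative Python sketch that splits one reused key's packet timestamps back into sessions (the 120-second threshold is an arbitrary example, echoing the idle value mentioned earlier, not an argus default):

```python
def split_sessions(times, gap=120):
    """Split the packet timestamps of one reused 5-tuple into separate
    sessions wherever the inter-packet gap exceeds `gap` seconds."""
    sessions = []
    cur = [times[0]]
    for t in times[1:]:
        if t - cur[-1] > gap:  # gap too long: a new session reused the key
            sessions.append(cur)
            cur = []
        cur.append(t)
    sessions.append(cur)
    return sessions

# packets at t=0,1 and t=500,501 on the same key: two distinct sessions
print(split_sessions([0, 1, 500, 501]))  # -> [[0, 1], [500, 501]]
```

With an infinite idle timeout these four packets would be reported as one flow lasting 501 seconds, which is exactly how a 0.025-second DNS transaction ends up appearing to last for months.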
>
> I like 5 minute files, and if I need to understand what is going on just at
> the edge of two 5 minute boundaries, I read them both, and focus on the edge
> time boundary. Anything longer than that is another type of time domain,
> and there are lots of processing strategies for developing data at that scale,
> that may be useful.
>
> Carter
>
>
> On Feb 7, 2012, at 9:45 AM, Marco wrote:
>
>
> Thanks. But what about long-lived flows that last more than 5 minutes?
> Will they be merged or will they appear once per 5-minute file in the
> result? The whole point of clustering is having a single entry for
> each of them, AFAIK.
>