Argus Clients ICMP Packet Printer
Carter Bullard
carter at qosient.com
Fri Feb 10 12:08:45 EST 2012
Hmmmmm, that shouldn't have changed, but you never know.
I'll put it back.
Carter
On Feb 9, 2012, at 2:40 PM, Dave Edelman wrote:
> Carter,
>
> I just installed the 3.0.5.31 clients and there seems to be a difference in the way the ICMP type and reason codes are represented. Originally they were shown as four hex characters; now they seem to be the decimal equivalent. Is there a switch that I missed (or one that I should have missed)?
>
> --Dave
>
> From: argus-info-bounces+dedelman=iname.com at lists.andrew.cmu.edu [mailto:argus-info-bounces+dedelman=iname.com at lists.andrew.cmu.edu] On Behalf Of Carter Bullard
> Sent: Tuesday, February 07, 2012 1:27 PM
> To: Marco
> Cc: argus-info at lists.andrew.cmu.edu
> Subject: Re: [ARGUS] Huge argus files and racluster
>
> Hey Marco,
> Argus is very good at not over- or under-counting packets, so don't worry about
> the aggregation model and how it affects accuracy; that has been worked over
> very well.
>
> Since you are interested in making sense of it all, you should run racount.1 first.
>
> racount -r files -M proto addr
>
> You should be doing some very large aggregations, such as:
>
> racluster -m matrix/16 -r files -s stime dur saddr daddr pkts bytes - ip
>
> This will show you which CIDR /16 networks are talking to whom.
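The key-masking idea behind `-m matrix/16` can be sketched in Python. This is only a conceptual, simplified model of what that aggregation reports (it treats the pair directionally, while argus's matrix object is more subtle), and the flow tuples are made-up sample data, not argus output:

```python
import ipaddress
from collections import defaultdict

def matrix16(flows):
    """Sum packet/byte counts per /16 network pair: a simplified,
    directional model of `racluster -m matrix/16`, not argus code."""
    totals = defaultdict(lambda: {"pkts": 0, "bytes": 0})
    for saddr, daddr, pkts, nbytes in flows:
        # mask each address down to its /16 network
        s = str(ipaddress.ip_network(saddr + "/16", strict=False).network_address)
        d = str(ipaddress.ip_network(daddr + "/16", strict=False).network_address)
        totals[(s, d)]["pkts"] += pkts
        totals[(s, d)]["bytes"] += nbytes
    return dict(totals)

# made-up sample flows: (saddr, daddr, pkts, bytes)
sample = [
    ("10.1.2.3",   "192.168.5.9", 10, 1200),
    ("10.1.9.8",   "192.168.7.1",  5,  600),
    ("172.16.0.1", "10.1.2.3",     2,  100),
]
result = matrix16(sample)
```

The first two flows collapse into one 10.1.0.0 -> 192.168.0.0 entry, which is exactly the "which networks are talking to whom" view.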
>
> If you want to know the list of IP addresses that are active:
>
> racluster -M rmon -m saddr -r files -w addrs.out - ip
>
> Then you can aggregate for the networks, or the countries or whatever:
> racluster -r addrs.out -m saddr/24 -s stime dur saddr spkts dpkts sbytes dbytes - ip
>
> If you want to aggregate based on the country code, you need to use ralabel to set the
> country codes. Check out 'man ralabel' and 'man 5 ralabel' to see how to do that, and you can
> do that with the IP address file you created above:
>
> ralabel -f ralabel.country.code.conf -r addrs.out -w - | racluster -m sco -w - | \
> rasort -m sco -v -s stime dur sco spkts dpkts sbytes dbytes
>
>
> There are the perl scripts:
>
> rahosts -r files
> raports -r files
>
> These are pretty informative, and will serve you well.
> That should get you started.
>
> Carter
>
>
> On Feb 7, 2012, at 11:38 AM, Marco wrote:
>
>
> Thanks for the detailed answer. I suppose a bit more of background on
> what I'm trying to do is in order here. Basically, I've been handed
> that 50GB pcap monster and been told to "make sense of it".
> Essentially, it contains all the traffic to and from the Internet seen
> on a particular LAN.
> "making sense of it" basically means, in simple terms, finding out:
>
> - global bandwidth usage (incoming, outgoing)
> - bandwidth usage by protocol (http, smtp, dns, etc.), again incoming
> and outgoing
> - traffic between specific source/destination hosts (possibly
> including detailed protocol usage within that specific traffic)
>
> Ideally, I'd like to graph some or all of that information, but for
> now I'm ok with running some command line query using racluster/rasort
> to get textual tabular output.
>
> So, based on what I read, the first thing I was doing was trying to
> summarize the pcap data into an argus file to use as a starting point,
> and that file should ideally include exactly one entry per flow (where
> flow==saddr daddr proto sport dport), because otherwise (if I
> understand correctly) packets, bytes, etc. belonging to a specific
> flow would be counted multiple times, which is not what I want (it's
> entirely possible that I'm misunderstanding how argus works though).
> Note that I'm mostly interested in aggregated numbers here rather than
> detailed flow analysis. For example: I'd like to get all flows where
> the protocol is TCP and dport is 80, then obtain aggregated sbytes and
> dbytes for all those flows. Same for other well-known destination
> ports.
>
> As it's probably clear by now, I'm a novice to argus, so any help
> would be appreciated (including pointers to examples or other material
> to study). Thanks for your help.
>
> 2012/2/7 Carter Bullard <carter at qosient.com>:
>
> Hey Marco,
> Regardless of what time range you work with, there will always be
> a flow that extends beyond that range. You have to figure out what
> you are trying to say with the data to decide if you need to count
> every connection only once.
>
> If 5-, 10-, or 15-minute files aren't attractive, racluster.1 provides you
> configuration options so you can efficiently track long term flows, but
> it is based on finding an effective idle timeout that will make persistent
> tracking work for your memory limits. See racluster.5. Most flows are
> finished in less than a second, and so keeping all of those flows in memory
> is a waste. Figuring out a good idle timeout strategy, however, is an art.
>
> By default, racluster's idle timeout is "infinite" and so it holds each flow in
> memory until the end of processing. If you decide that 600 seconds
> of idle time is sufficient to decide that the flow is done (120 works for
> most, except Windows boxes, which can send TCP Resets for
> connections that have been closed for well over 300 seconds), then
> a simple racluster.conf file of:
>
> racluster.conf
> filter="" model="saddr daddr proto sport dport" status=0 idle=600
>
> may keep you from running out of memory. If a flow hasn't seen any
> activity in 600 seconds, racluster.1 will report the flow and release
> its memory.
>
> racluster -f racluster.conf -r your.files -w single.output.file
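The flush-on-idle behaviour described above can be sketched in Python. This is a conceptual model of a flow cache with an idle timeout, not racluster's actual implementation; the `(timestamp, flowkey)` packet tuples and the single packet counter are simplifications:

```python
IDLE = 600  # seconds; plays the same role as idle=600 in the conf line above

def cluster(packets, idle=IDLE):
    """Aggregate (timestamp, flowkey) packets into flow records,
    reporting and releasing any flow idle longer than `idle` seconds."""
    cache = {}  # flowkey -> [first_seen, last_seen, pkt_count]
    done = []
    for ts, key in sorted(packets):
        # report and release any flow that has gone idle
        for k in [k for k, v in cache.items() if ts - v[1] > idle]:
            done.append((k, cache.pop(k)))
        if key in cache:
            cache[key][1] = ts
            cache[key][2] += 1
        else:
            cache[key] = [ts, ts, 1]
    done.extend(cache.items())  # end of processing: flush what's left
    return done

# two bursts on the same key, 699 seconds apart, come out as two flows
flows = cluster([(0, "A"), (1, "A"), (700, "A")])
```

The point of the sketch is the memory behaviour: the cache only ever holds flows seen within the idle window, while everything older has already been reported, which is why a finite idle timeout keeps you from running out of memory.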
>
> Improving on the aggregation model would include protocol and port
> specific idle time strategies, such as:
>
> racluster.better.conf
> filter="udp and port domain" model="saddr daddr proto sport dport" status=0 idle=10
> filter="udp" model="saddr daddr proto sport dport" status=0 idle=60
> filter="" model="saddr daddr proto sport dport" status=0 idle=600
>
> The output data stream of this type of processing will be semi-sorted
> in last-time-seen order rather than start-time order, so that may be a
> consideration for you. Sorting is currently a memory hog, so don't
> expect to sort these records after you generate the single output file
> without some strategy, like using rasplit.1.
>
> Using state, such as TCP closing state to declare that a flow is done, is
> an attractive approach, but it has huge problems, and I don't recommend it.
>
> rasqlinsert.1 is the tool of choice if you really would like to have 1 flow
> record per flow, and you're running out of resources.
>
> Using argus-clients-3.0.5.31 from the developers thread of code,
> use rasqlinsert.1 with the caching option.
>
> rasqlinsert -M cache -r your.files -w mysql://user@localhost/db/raOutfile
>
> This causes rasqlinsert.1 to use a database table as its flow cache.
> It's pretty efficient, so it's not going to do a database transaction per
> record when there is aggregation, so you do get some wins.
> When it's finished processing, create your single file with:
>
> rasql -r mysql://user@localhost/db/raOutfile -w single.output.file
>
>
> There are problems with any approach that aggregates over long periods of
> time, because systems reuse the 5-tuple flow attributes that make
> up a flow key much faster than you would think. This results in many situations
> where multiple independent sessions will be reported as a single very
> long lived flow. This is particularly evident with DNS, where if you aggregate
> over months, you find that you get fewer and fewer DNS transactions (they
> tend to approach somewhere around 32K) between host and server, and
> instead of lasting around 0.025 seconds, they seem to last for months.
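One way to picture the 5-tuple reuse problem is that a single aggregated "flow" is really several independent sessions separated by long idle gaps. A small illustrative Python sketch that splits one reused key's packet timestamps back into sessions (the 120-second threshold is an arbitrary example, echoing the idle value mentioned earlier, not an argus default):

```python
def split_sessions(times, gap=120):
    """Split the packet timestamps of one reused 5-tuple into separate
    sessions wherever the inter-packet gap exceeds `gap` seconds."""
    sessions = []
    cur = [times[0]]
    for t in times[1:]:
        if t - cur[-1] > gap:  # gap too long: a new session reused the key
            sessions.append(cur)
            cur = []
        cur.append(t)
    sessions.append(cur)
    return sessions

# packets at t=0,1 and t=500,501 on the same key: two distinct sessions
print(split_sessions([0, 1, 500, 501]))  # -> [[0, 1], [500, 501]]
```

With an infinite idle timeout these four packets would be reported as one flow lasting 501 seconds, which is exactly how a 0.025-second DNS transaction ends up appearing to last for months.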
>
> I like 5 minute files, and if I need to understand what is going on just at
> the edge of two 5 minute boundaries, I read them both, and focus on the edge
> time boundary. Anything longer than that is another type of time domain,
> and there are lots of processing strategies for developing data at that scale,
> that may be useful.
>
> Carter
>
>
> On Feb 7, 2012, at 9:45 AM, Marco wrote:
>
>
> Thanks. But what about long-lived flows that last more than 5 minutes?
> Will they be merged or will they appear once per 5-minute file in the
> result? The whole point of clustering is having a single entry for
> each of them, AFAIK.
>