Huge argus files and racluster

Marco listaddr at gmail.com
Wed Feb 8 07:04:40 EST 2012


No, I'm using 3.0.4.1. I see the latest one doesn't segfault, thanks.

On 8 February 2012 12:57, Carter Bullard <carter at qosient.com> wrote:
> Are you using the latest client programs?
>
> http://qosient.com/argus/dev/argus-clients-latest.tar.gz
>
> Carter
>
> On Feb 8, 2012, at 5:57 AM, Marco <listaddr at gmail.com> wrote:
>
>> Thanks, that was very useful. Now I'm running into another couple of issues.
>>
>> The first one is that rabins (invoked from ragraph) segfaults if the
>> filter I specify does not match any flow. I can provide sample data if
>> needed. In this specific case, the command line I'm using is
>>
>> $ ragraph sbytes dbytes  -M 10s -n -r sample.argus -m proto  -w
>> mygraph.png -title "abc" - dst host 10.192.1.138
>> sh: line 1: 19240 Segmentation fault      /usr/bin/rabins -M hard zero
>> -p6 -GL0 -s ltime sbytes dbytes -M 10s -n -r sample.argus -m proto -
>> dst host 10.192.1.138 > /tmp/filepZbhuC
>> usage: /usr/bin/ragraph metric (srcid | proto [daddr] | dport) [-title
>> "title"] [ra-options]
>> /usr/bin/ragraph: unable to create `/tmp/filepZbhuC.rrd': start time:
>> unparsable time:
>>
>> The file sample.argus happens to not have any flow where the
>> destination IP is 10.192.1.138.
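>>
>> (As a workaround until that's fixed, I can check whether the filter
>> matches anything at all before graphing, e.g. with racount on the
>> same file and filter:
>>
>>    racount -r sample.argus - dst host 10.192.1.138
>>
>> and only invoke ragraph when the flow count is non-zero.)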
>>
>> The second one is more of a philosophical issue, and now I'm wondering
>> whether argus is really the tool I need for the task. Basically, since
>> I need to determine incoming/outgoing bandwidth usage, I'm using
>> "ragraph sbytes dbytes" to produce the graphics. If every flow is
>> initiated from the LAN being monitored (say, 192.168.44.0/24), then
>> "sbytes" effectively indicates the amount of physically outgoing data,
>> and "dbytes" effectively indicates the amount of physically incoming
>> data, so the resulting graph is an accurate representation of in/out
>> bandwidth usage over time. But if there are externally-initiated flows
>> (as in my case) along with internally-initiated ones, then "sbytes"
>> will aggregate a mixture of data leaving and entering the network,
>> depending on who initiated the flow being considered. The same happens
>> for "dbytes". What this means is that a "ragraph sbytes dbytes" isn't
>> really representing bandwidth usage in the two directions (the
>> aggregate value "bytes" is still correct, but doesn't give any detail
>> about what's outgoing and what's incoming).
>> So obviously this isn't argus' fault, but I'm wondering whether
>> there's a way to do what I'm looking for (with argus or another tool).
>> It's also possible that I'm missing something obvious.
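>>
>> (The closest thing I can think of with plain ra* filters is two
>> passes that split the flows by initiator, so each pass has a
>> consistent direction; 192.168.44.0/24 is the example LAN from above:
>>
>>    # internally-initiated flows: sbytes is outgoing, dbytes incoming
>>    ragraph sbytes dbytes -M 10s -n -r sample.argus -w out-lan.png - src net 192.168.44.0/24
>>    # externally-initiated flows: sbytes is incoming, dbytes outgoing
>>    ragraph sbytes dbytes -M 10s -n -r sample.argus -w out-ext.png - dst net 192.168.44.0/24
>>
>> and then combining the matching directions from the two graphs. Not
>> sure whether there's a more direct way.)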
>>
>> Thanks again for the quick replies and your patience.
>>
>>
>> On 7 February 2012 19:26, Carter Bullard <carter at qosient.com> wrote:
>>> Hey Marco,
>>> Argus is very good at not over- or under-counting packets, so don't
>>> worry about the aggregation model and how it affects accuracy; that
>>> has been worked over very well.
>>>
>>> Since you are interested in making sense of it all, you should run
>>> racount.1 first:
>>>
>>>    racount -r files -M proto addr
>>>
>>> You should be doing some very large aggregations, such as:
>>>
>>>    racluster -m matrix/16 -r files -s stime dur saddr daddr pkts bytes - ip
>>>
>>> This will show you which CIDR /16 networks are talking to whom.
>>>
>>> If you want to know the list of IP addresses that are active:
>>>
>>>    racluster -M rmon -m saddr -r files -w addrs.out - ip
>>>
>>> Then you can aggregate for the networks, or the countries or whatever:
>>>    racluster -r addrs.out -m saddr/24 -s stime dur saddr spkts
>>> dpkts sbytes dbytes - ip
>>>
>>> If you want to aggregate based on the country code, you need to use
>>> ralabel to set the country codes.  Check out 'man ralabel' and
>>> 'man 5 ralabel' to see how to do that, and you can do it with the IP
>>> address file you created above:
>>>
>>>    ralabel -f ralabel.country.code.conf -r addrs.out -w - | \
>>>       racluster -m sco -w - | \
>>>       rasort -m sco -v -s stime dur sco spkts dpkts sbytes dbytes
>>>
>>>
>>> There are the perl scripts:
>>>
>>>    rahosts -r files
>>>    raports -r files
>>>
>>> These are pretty informative, and will serve you well.
>>> That should get you started.
>>>
>>> Carter
>>>
>>>
>>> On Feb 7, 2012, at 11:38 AM, Marco wrote:
>>>
>>> Thanks for the detailed answer. I suppose a bit more background on
>>> what I'm trying to do is in order here. Basically, I've been handed
>>> that 50GB pcap monster and been told to "make sense of it".
>>> Essentially, it contains all the traffic to and from the Internet seen
>>> on a particular LAN.
>>> "Making sense of it" basically means, in simple terms, finding out:
>>>
>>> - global bandwidth usage (incoming, outgoing)
>>> - bandwidth usage by protocol (http, smtp, dns, etc.), again incoming
>>> and outgoing
>>> - traffic between specific source/destination hosts (possibly
>>> including detailed protocol usage within that specific traffic)
>>>
>>> Ideally, I'd like to graph some or all of that information, but for
>>> now I'm OK with running command-line queries using racluster/rasort
>>> to get textual tabular output.
>>>
>>> So, based on what I read, the first thing I tried was to summarize
>>> the pcap data into an argus file to use as a starting point. That
>>> file should ideally include exactly one entry per flow (where
>>> flow == saddr daddr proto sport dport), because otherwise (if I
>>> understand correctly) packets, bytes, etc. belonging to a specific
>>> flow would be counted multiple times, which is not what I want (it's
>>> entirely possible that I'm misunderstanding how argus works, though).
>>> Note that I'm mostly interested in aggregated numbers here rather than
>>> detailed flow analysis. For example: I'd like to get all flows where
>>> the protocol is TCP and dport is 80, then obtain aggregated sbytes and
>>> dbytes for all those flows. Same for other well-known destination
>>> ports.
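>>>
>>> (In racluster terms I imagine that would be something like the
>>> following, where summary.argus stands for the one-record-per-flow
>>> file I'm trying to produce:
>>>
>>>    racluster -r summary.argus -m proto dport -s proto dport spkts dpkts sbytes dbytes - tcp and dst port 80
>>>
>>> and dropping the filter would give the same breakdown for all
>>> destination ports at once. Is that right?)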
>>>
>>> As is probably clear by now, I'm a novice to argus, so any help
>>> would be appreciated (including pointers to examples or other material
>>> to study). Thanks for your help.
>>>
>>> 2012/2/7 Carter Bullard <carter at qosient.com>:
>>>
>>> Hey Marco,
>>>
>>> Regardless of what time range you work with, there will always be
>>> a flow that extends beyond that range.  You have to figure out what
>>> you are trying to say with the data to decide if you need to count
>>> every connection only once.
>>>
>>> If 5 or 10 or 15 minute files aren't attractive, racluster.1 provides
>>> configuration options so you can efficiently track long-term flows,
>>> but it is based on finding an effective idle timeout that will make
>>> persistent tracking work within your memory limits.  See racluster.5.
>>> Most flows are finished in less than a second, so keeping all of
>>> those flows in memory is a waste.  Figuring out a good idle timeout
>>> strategy, however, is an art.
>>>
>>> By default, racluster's idle timeout is "infinite", and so it holds
>>> each flow in memory until the end of processing.  If you decide that
>>> 600 seconds of idle time is sufficient to declare that a flow is done
>>> (120 works for most, except Windows boxes, which can send TCP Resets
>>> for connections that have been closed for well over 300 seconds),
>>> then a simple racluster.conf file of:
>>>
>>> racluster.conf
>>>    filter="" model="saddr daddr proto sport dport" status=0 idle=600
>>>
>>> may keep you from running out of memory.  If a flow hasn't seen any
>>> activity in 600 seconds, racluster.1 will report the flow and release
>>> its memory.
>>>
>>>    racluster -f racluster.conf -r your.files -w single.output.file
>>>
>>> Improving on the aggregation model would include protocol- and
>>> port-specific idle time strategies, such as:
>>>
>>> racluster.better.conf
>>>    filter="udp and port domain" model="saddr daddr proto sport dport" status=0 idle=10
>>>    filter="udp" model="saddr daddr proto sport dport" status=0 idle=60
>>>    filter="" model="saddr daddr proto sport dport" status=0 idle=600
>>>
>>> The output data stream of this type of processing will be semi-sorted
>>> in last-time-seen order, rather than start-time order, so that may be
>>> a consideration for you.  Sorting currently is a memory hog, so don't
>>> expect to sort these records after you generate the single output
>>> file without some strategy, like using rasplit.1.
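>>>
>>> (If you do end up needing sorted output, one sketch of that strategy,
>>> splitting the single file back into 5 minute bins with rasplit.1 and
>>> sorting each small piece on its own; the -w template and directory
>>> are just examples:
>>>
>>>    rasplit -M time 5m -r single.output.file -w out/argus.%Y.%m.%d.%H.%M.%S
>>>    for f in out/argus.*; do rasort -m stime -r $f -w $f.sorted; done
>>>
>>> Each bin stays small, so rasort's memory use stays bounded.)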
>>>
>>> Using state, such as TCP closing state, to declare that a flow is
>>> done is an attractive approach, but it has huge problems, and I don't
>>> recommend it.
>>>
>>> rasqlinsert.1 is the tool of choice if you really would like to have
>>> one flow record per flow and you're running out of resources.  Using
>>> argus-clients-3.0.5.31 from the development branch of code, use
>>> rasqlinsert.1 with the caching option:
>>>
>>>   rasqlinsert -M cache -r your.files -w mysql://user@localhost/db/raOutfile
>>>
>>> This causes rasqlinsert.1 to use a database table as its flow cache.
>>> It's pretty efficient: it won't do a database transaction per record
>>> when records aggregate, so you do get some wins.  When it's finished
>>> processing, create your single file with:
>>>
>>>   rasql -r mysql://user@localhost/db/raOutfile -w single.output.file
>>>
>>> There are problems with any approach that aggregates over long
>>> periods of time, because systems reuse the 5-tuple flow attributes
>>> that make up a flow key much faster than you would think.  This
>>> results in many situations where multiple independent sessions will
>>> be reported as a single, very long-lived flow.  This is particularly
>>> evident with DNS: if you aggregate over months, you find that you get
>>> fewer and fewer DNS transactions between host and server (they tend
>>> to approach somewhere around 32K), and instead of lasting around
>>> 0.025 seconds, they seem to last for months.
>>>
>>> I like 5 minute files, and if I need to understand what is going on
>>> just at the edge of two 5 minute boundaries, I read them both and
>>> focus on the edge time boundary.  Anything longer than that is
>>> another type of time domain, and there are lots of processing
>>> strategies for developing data at that scale that may be useful.
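>>>
>>> (Reading two adjacent bins is nothing special, since the clients
>>> accept multiple input files after -r; with made-up file names:
>>>
>>>    racluster -r argus.14.05.00 argus.14.10.00 -s stime dur saddr daddr pkts bytes
>>>
>>> this merges records for flows that straddle the boundary before
>>> printing.)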
>>>
>>> Carter
>>>
>>>
>>> On Feb 7, 2012, at 9:45 AM, Marco wrote:
>>>
>>> Thanks. But what about long-lived flows that last more than 5 minutes?
>>> Will they be merged or will they appear once per 5-minute file in the
>>> result? The whole point of clustering is having a single entry for
>>> each of them, AFAIK.
>>>