ra reads argus file very slow

Carter Bullard carter at qosient.com
Fri Nov 1 09:44:13 EDT 2013


Hey Zi,
Options are parsed from left to right, so in your example (-uX) the -X negates
the -u option that precedes it.  Try these for comparison:

   ra -Xu -r 2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus -N 5000000 >/dev/null

   ra -Xu -r 2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus -N 5000000 -w /dev/null

        This takes out the actual string printing, so it should run about as fast as racount().

   ra -Xu -r 2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus -N 5000000 -s saddr > /dev/null
   
        This prints only one field.
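
The three runs can also be wrapped in a small script so that they are all
timed the same way.  This is only a sketch: the elapsed() helper and the
ARGUS_FILE and RECORDS variables are names made up for the example, not part
of the argus clients, and date +%s only gives whole-second resolution.

```shell
#!/bin/sh
# Sketch: time each ra variant from this message on the same input.
# ARGUS_FILE and RECORDS are placeholders; adjust them to your data.
ARGUS_FILE="${ARGUS_FILE:-2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus}"
RECORDS="${RECORDS:-5000000}"

# elapsed LABEL CMD...: run CMD with stdout discarded, print "LABEL: N s".
elapsed() {
    label=$1; shift
    start=$(date +%s)
    "$@" >/dev/null
    end=$(date +%s)
    echo "$label: $((end - start)) s"
}

if command -v ra >/dev/null 2>&1 && [ -r "$ARGUS_FILE" ]; then
    elapsed "default fields"     ra -Xu -r "$ARGUS_FILE" -N "$RECORDS"
    elapsed "binary, no strings" ra -Xu -r "$ARGUS_FILE" -N "$RECORDS" -w /dev/null
    elapsed "saddr only"         ra -Xu -r "$ARGUS_FILE" -N "$RECORDS" -s saddr
else
    echo "ra or $ARGUS_FILE not available; skipping" >&2
fi
```

If the second and third runs are much faster than the first, the time is
going into formatting the output fields rather than into reading the file.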

Carter


On Oct 31, 2013, at 6:39 PM, Zi Hu <zihu at usc.edu> wrote:

> Hi, Carter, thanks for your suggestion.
> I just tried the latest version of clients as you suggested, but it doesn't make any difference.
> 
> Besides, one thing that is not clear to me is why the performance is non-linear in the number of records.  E.g.:
> time ra -uX -r 2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus -N 5000000 >/dev/null
>     real    5m58.211s
>     user    5m54.969s
>     sys     0m3.027s
> time ra -uX -r 2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus -N 10000000 >/dev/null
>     real    22m2.487s
>     user    21m55.113s
>     sys     0m6.782s 
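
For reference, the two timings above can be compared with a little
arithmetic.  This is a rough sketch on the quoted numbers only; to_seconds
and scaling are helper names invented for the example.  Doubling -N
multiplies the run time by about 3.7, which is closer to quadratic than to
linear growth.

```shell
#!/bin/sh
# Compare how run time scales between the two -N runs quoted above.

# to_seconds converts the "real" format of time, e.g. 5m58.211s, to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# scaling T_SMALL T_LARGE N_SMALL N_LARGE: print the time ratio and the
# exponent k for which time grows like N^k between the two runs.
scaling() {
    awk -v a="$1" -v b="$2" -v na="$3" -v nb="$4" 'BEGIN {
        ratio = b / a
        printf "time ratio: %.2f for %.0fx the records\n", ratio, nb / na
        printf "implied exponent: %.2f\n", log(ratio) / log(nb / na)
    }'
}

t_small=$(to_seconds 5m58.211s)    # -N 5000000
t_large=$(to_seconds 22m2.487s)    # -N 10000000
scaling "$t_small" "$t_large" 5000000 10000000
```

An exponent near 1 would mean linear behaviour; a value near 2 points at
per-record work that grows with the number of records already processed.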
> 
> thanks
> -Zi
> 
> 
> On Thu, Oct 31, 2013 at 5:24 AM, Carter Bullard <carter at qosient.com> wrote:
> Hey Zi,
> Could you try the newer developer version of the clients, to see if there is any difference?  This is the code that will become argus-clients-3.0.8.
> 
>    http://qosient.com/argus/dev/argus-clients-latest.tar.gz
> 
> Carter
> 
> Carter Bullard, QoSient, LLC
> 150 E. 57th Street Suite 12D
> New York, New York 10022
> +1 212 588-9133 Phone
> +1 212 588-9134 Fax
> 
> On Oct 31, 2013, at 3:41 AM, Zi Hu <zihu at usc.edu> wrote:
> 
>> On Wed, Oct 30, 2013 at 9:39 PM, Carter Bullard <carter at qosient.com> wrote:
>> Hey Zi,
>> Well, based on the performance of racount(), I'd say the subject line is a little off: we can read the file and decode all the records pretty quickly, in 10.29 seconds.  It looks like the ra* programs can process the file faster than you can cat() it, so I'd say the problem is in writing to the disk.  Maybe you have some disk errors?  Did you check your system logs?
>> 
>>  
>> Hi, Carter, 
>> Thanks for your comments, but I didn't see any disk errors in the system logs.
>> Moreover, I don't think disk errors are the cause, since I ran "cat" and "ra" on the same file on the same machine. If there were disk errors, both should be slow.  Besides, I also copied the 2G argus file to another machine, and it still takes more than 80 minutes to read the file with "ra".
>> 
>> I did another test: 
>> I made another ~2G argus file and ran "ra" on it. This time it is much faster (about 12 minutes), although it is still slow compared to "cat" (about 24 seconds).
>> zihu@proton:~$ time ra -r tmp/201320d-060000.argus -u > temp.dat
>> 
>> real    11m55.636s
>> user    11m16.636s
>> sys     0m38.653s
>> 
>> zihu@proton:~$ time cat tmp/201320d-060000.argus > temp.dat
>> 
>> real    0m24.298s
>> user    0m0.009s
>> sys     0m3.747s
>> 
>> zihu@proton:~$ time racount -r tmp/201320d-060000.argus
>> racount   records     total_pkts     src_pkts       dst_pkts       total_bytes        src_bytes          dst_bytes
>>     sum   18357344    814467265      557563621      256903644      937278498435       620800235862       316478262573      
>> 
>> real    0m10.753s
>> user    0m9.902s
>> sys     0m0.832s
>> 
>> 
>> To me, it looks like "ra" runs fast on some files but slowly on others.
>> Do you know why "ra" performs differently on different files? Could this be a bug in "ra"?
>> By the way, it is still not clear to me why memory keeps growing when I run the "ra" command.
>> 
>> -Zi
>> 
>>  
>> Carter
>> 
>> On Oct 30, 2013, at 8:43 PM, Zi Hu <zihu at usc.edu> wrote:
>> 
>>> 
>>> Thanks for your reply, Carter. 
>>> 
>>> On Wed, Oct 30, 2013 at 4:27 PM, Carter Bullard <carter at qosient.com> wrote:
>>> Hey Zi,
>>> The only time I’ve seen ra() have problems reading and writing
>>> data, to the level you report, is when one tries to do DNS
>>> lookups to get the names of the IP addresses, instead of
>>> dotted decimal notation.
>>> 
>>> 
>>> By default, "ra" won't perform DNS lookups, right? If so, given the command line I used in my experiment, I don't think it does DNS lookups.
>>> Besides, I also tried the -nn option, and it doesn't make much difference.
>>>  
>>> I can read about 2G of flow data in about 65 secs, on a
>>> standard machine, but I can cat() that file in about
>>> 2.5 secs, so your machine may not be performing as well
>>> as you would want.
>>> 
>>> What version of argus and clients are you using??
>>> 
>>> 3.0.6
>>>  
>>> Do you have a .rarc file in your home directory?
>>> 
>>> I don't see a .rarc file in my home directory. 
>>> 
>>>  
>>> What does a line of ra() output look like ?
>>> 
>>> 
>>> zihu@proton:~$ ra -r 2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus -u | head
>>>          StartTime      Flgs  Proto            SrcAddr  Sport   Dir            DstAddr  Dport  TotPkts   TotBytes State 
>>>  1378009800.024648  e           tcp      129.82.228.28.11021    <?>     74.125.142.131.xmpp-*        2        144   CON
>>>  1378009800.000000  e d         tcp      129.82.97.104.63194     ->     129.82.224.179.https        10       1154   CON
>>>  1378009800.132037  e           tcp       129.82.12.68.57547     ->       75.130.96.44.59943         2       1414   CON
>>>  1378009800.131337  e           udp       129.82.12.66.44115    <->    131.254.208.196.44295         2        234   CON
>>>  1378009800.000000  e d         tcp     129.82.227.103.ica      <?>       129.82.97.52.49341        11        882   CON
>>>  1378009800.173511  e           udp       129.82.12.66.44115    <->     211.69.207.154.38275         2        215   CON
>>>  1378009800.619227  e          icmp       129.82.12.68.0x0303    ->    143.215.131.247.0xd782        1        102   URP
>>>  1378009800.623714  e          icmp       129.82.12.68.0x0303    ->    143.215.131.247.0xd882        1        102   URP
>>>  1378009800.719767  e          icmp      192.43.217.17.0x000b    ->       129.82.12.68.0x0000        1         70   TXD
>>> 
>>> 
>>> 
>>> Besides, the following is some information about the 2G argus file on my machine; I'm not sure whether it can help you diagnose the issue.
>>> zihu@proton:~$ time racount -r 2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus
>>> racount   records     total_pkts     src_pkts       dst_pkts       total_bytes        src_bytes          dst_bytes
>>>     sum   20327732    127070924      81280364       45790560       108939377747       66625641107        42313736640       
>>> 
>>> real	0m10.297s
>>> user	0m9.478s
>>> sys	0m0.780s
>>> 
>>> 
>>> thanks
>>> -Zi
>>> 
>>>  
>>> Carter
>>> 
>>> 
>>> 
>>> On Oct 30, 2013, at 6:34 PM, Zi Hu <zihu at usc.edu> wrote:
>>> 
>>>> Hi, Carter,
>>>> 
>>>> In my application, I need a simple tool to read what is in the argus file and output, in ASCII format, certain fields that I am interested in, such as srcip, dstip, sport, dport, protocol, ....
>>>> 
>>>> I thought the command "ra" was what I need. However, I find that reading the argus data with "ra" is very slow.  I did a small experiment: dump the same argus file (about 2G) with both "ra" and "cat".
>>>> Using "ra", it took about 87 minutes to read the file, while it took only 40 seconds to dump it with "cat".  I also noticed that memory keeps growing while "ra" is running.
>>>> 
>>>> zihu@proton:~$ time cat 2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus > temp.dat
>>>> 
>>>> real    0m39.490s
>>>> user    0m0.027s
>>>> sys     0m4.204s
>>>> zihu@proton:~$ time ra -r 2013-09-01-0700/temp/20130831-223000-hWukIYC-lander4.argus -u > temp.dat
>>>> 
>>>> real    87m40.973s
>>>> user    86m42.397s
>>>> sys     0m56.256s
>>>> zihu@proton:~$
>>>> 
>>>>  
>>>> 
>>>> So I guess "ra" does more than just read the argus file, format, and output the result.   Does "ra" keep track of flows in memory, so that the memory keeps growing?
>>>> 
>>>> If "ra" is not the right choice for my application, what is the right command for this simple task? If no such tool exists, I am thinking of writing one myself. Could you point me to where to start?  Any suggestions are welcome.
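
One possible starting point for the field dump described above, as a sketch
only.  The dump_fields function and the RA variable are hypothetical names
made up for the example.  The -s option selects fields as in the "-s saddr"
command at the top of this thread, and -u is the time option already used
here; -n (numeric output rather than names) and -c (column delimiter) are
assumptions about the installed ra, so check its man page before relying on
them.

```shell
#!/bin/sh
# dump_fields FILE FIELD...: print the selected fields of each record in
# FILE as comma-separated text. RA can be overridden to point at a
# different ra binary (or at echo, for a dry run).
dump_fields() {
    file=$1; shift
    ${RA:-ra} -r "$file" -n -u -c , -s "$@"
}

# Example (needs ra and an argus file):
#   dump_fields flows.argus saddr daddr sport dport proto > flows.csv
```

Listing only the fields that are needed keeps the per-record formatting
work small, which is where the timings in this thread suggest the cost is.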
>>>> 
>>>> 
>>>> Thanks
>>>> -Zi
>>> 
>>> 
>> 
> 
