BUG - No immediate exit when pipeing ra output

elof2 at sentor.se elof2 at sentor.se
Wed May 11 04:47:29 EDT 2011


On Tue, 10 May 2011, Carter Bullard wrote:
> Not sure where the 288 files == 24 hours?

For example I have a directory with 288 argus-files. Each file hold 5 
minutes worth of data. I.e. 288*5min=1440min=24h.
That was just an example to illustrate that if I want to perform some 
action on the argus data for this day, the shortest execution time even if 
I use -qN1 will be 0.40s using ra v3, where ra v2 take 0.008s.

Even though 0.40s is not a long time, it is 50 times longer than 0.008s.
This is not a biggie, I just wanted to inform you about the big 
difference between v2 and v3.


> ra() is threaded, and may read most of the file in before it starts to write any output.  You could remove the threads and see if it picks up.
>   % ./configure --without-threads; make clean; make

I tried this just now. No change. The unthreaded ra still take 0.40s just 
to start and exit.
   time ra -qN1 -r argus1.log ... argus288.log
   real    0m0.406s
   user    0m0.220s
   sys     0m0.192s

As I said, not a biggie. Just FYI.


> I'll checkout why we may be processing all the input files if the output pipe closes.

Thanks! In my opinion this is a biggie.

/Elof



> On May 10, 2011, at 11:43 AM, elof2 at sentor.se wrote:
>
>>
>> Hi Carter!
>>
>> It seems that the bugfix to detect closure of the pipe still have issues.
>>
>> In ra v2.x I was used to having the ra command terminate in 0.004s.
>> With the new ra 3.0.5.8, it won't terminate as swiftly.
>>
>> ra v3.0.5.8:
>> time ra -nr argus.log - > /dev/null
>> real    0m4.499s
>> user    0m4.288s
>> sys     0m0.212s
>> Time to parse through 100% of this test-file is apparently ~4.5s.
>>
>>
>> Now lets see the time when grep or head terminates directly on the first line:
>> time ra -nr argus.log - | grep -q "."
>> real    0m4.562s
>> user    0m4.392s
>> sys     0m0.176s
>> time ra -nr argus.log - | head -1
>> real    0m4.588s
>> user    0m4.428s
>> sys     0m0.168s
>>
>> Both commands take the full 4.5s (plus a little more) to finish.
>>
>> If I add another (or the same) logfile, it will double the execution time:
>> time ra -nr argus.log argus.log - | head -1
>> real    0m9.002s
>> user    0m8.617s
>> sys     0m0.392s
>>
>> Conclusion:
>> ra 3.0.5.8 don't immediately terminate upon pipe closure.
>>
>>
>>
>> Speaking of latency and execution times...
>>
>> Even when using the builtin -N1 option, ra v2 is way faster than v3:
>> ra v2.x:
>> time ra -N1 -nr argus.log -
>> real    0m0.003s
>> user    0m0.000s
>> sys     0m0.000s
>>
>> ra 3.0.5.8:
>> real    0m0.024s
>> user    0m0.000s
>> sys     0m0.021s
>>
>> v3 is 8 times slower than v2 when it comes to swift termination.
>> Can this be improved?
>>
>>
>> Also, no matter how many files I add to ra v2, it still terminates after 0.003 seconds where as ra v3 will increase its 0.024s by approx 0.002 per file:
>> ra v2.x:
>> time ra -N1 -nr argus.log argus.log argus.log argus.log argus.log argus.log argus.log argus.log argus.log argus.log -
>> real    0m0.003s  <-  still 0.003s
>> user    0m0.000s
>> sys     0m0.002s
>>
>> ra 3.0.5.8:
>> real    0m0.032s  <-  increases more and more for every file added
>> user    0m0.007s
>> sys     0m0.024s
>>
>> If you have 288 argus files in a log dir and use ra -nr argus*.log to do some simple processing, you will always have a noticeable lag just for the sake of the 288 files, even if you use -N1. Can this be improved to be as swift as v2?
>> (288 files á 5 minutes = 24 hours)
>>
>> /Elof
>>
>>
>> On Fri, 29 Apr 2011, Carter Bullard wrote:
>>
>>> Hey Elof,
>>> I have the closed piped issue fixed now.  Its a small fix in every client, so you should wait to get a new client release.
>>
>>> Carter
>>
>>
>>
>>>>>> Two examples using the old ra v2.x :
>>>>>>
>>>>>> 1)
>>>>>> time ra -Zb -nr argus.log - ip | head -1
>>>>>> 04-27-11 23:54:00.925018  s        tcp             10.10.10.10.46382 ->   20.20.20.20.1503   5467  5550  967561   2559850     PA_PA
>>>>>>
>>>>>> real    0m0.004s
>>>>>> user    0m0.000s
>>>>>> sys     0m0.000s
>>>>>>
>>>>>>
>>>>>> 2)
>>>>>> time ra -Zb -nr argus-20110428.log - ip | grep -q "PA_PA$"
>>>>>>
>>>>>> real    0m0.004s
>>>>>> user    0m0.000s
>>>>>> sys     0m0.000s
>>>>>>
>>>>>>
>>>>>> Both examples only take 0.004s to execute.
>>>>>> Good!
>>>>>>
>>>>>>
>>>>>>
>>>>>> Now I try the same using ra v3 :
>>>>>>
>>>>>> 1)
>>>>>> time ra -Zb -nr argus.log - ip | head -1
>>>>>> 04-27-11 23:54:00.925018  s        tcp             30.30.30.30.32451 ->   40.40.40.40.25    52  97  1151   12274     PA_PA
>>>>>>
>>>>>> real    0m50.043s
>>>>>> user    0m48.171s
>>>>>> sys     0m1.892s
>>>>>>
>>>>>>
>>>>>> 2)
>>>>>> time ra -Zb -nr argus-20110429.13.log - ip | grep -q "PA_PA$"
>>>>>>
>>>>>> real    0m50.161s
>>>>>> user    0m48.103s
>>>>>> sys     0m2.068s
>>>>>>
>>>>>>
>>>>>> The commands take 50 seconds (!!!) to complete when they should have taken approx 0.004 seconds.
>>>>>>
>>>>>> The argus.log file is 96 MB and hold 828 372 rows of data.
>>>>>> (The larger the file, the longer I have to wait until the command terminates. If I have an old argus.log holding 24 hours of data, it will take ~30 minutes to parse it (even longer if it is bzipped))
>>>>>>
>>>>>>
>>>>>> It takes 50 seconds just to parse through 100% of the file:
>>>>>> time ra -nr argus.log - ip > /dev/null
>>>>>>
>>>>>> real    0m50.257s
>>>>>> user    0m48.599s
>>>>>> sys     0m1.596s
>


More information about the argus mailing list