BUG - No immediate exit when pipeing ra output

elof2 at sentor.se elof2 at sentor.se
Wed May 25 12:10:10 EDT 2011


YAY!
It works!

time ra -nr argus.log | head -1
real    0m0.041s
user    0m0.024s
sys     0m0.024s


Thanks!

/Elof


On Wed, 25 May 2011, Carter Bullard wrote:

> OK, so I refreshed argus-clients-3.0.5.11.tar.gz with the new strategy for SIGPIPE.
> This should finally do the trick.  Please give it a try as soon as possible.
>
> Carter
>
>
> On May 25, 2011, at 7:43 AM, elof2 at sentor.se wrote:
>
>>
>> Hi Carter.
>> I have some unfortunate news.
>>
>> Ra Version 3.0.5.11 still won't terminate swiftly upon pipe closure on linux:
>>
>> time ra -nr argus2.log | head -1
>> real    0m6.984s
>> user    0m6.708s
>> sys     0m0.284s
>>
>> /Elof
>>
>>
>> On Wed, 18 May 2011, elof2 at sentor.se wrote:
>>
>>>
>>> Hi Carter!
>>> It seems that the bugfix to detect closure of the pipe still have issues, at least on linux.
>>>
>>> I've compiled argus-clients 3.0.5.10 on both linux and FreeBSD.
>>>
>>> On FreeBSD, ra immediately terminates on pipe closure.
>>> On linux however, ra keep processing the entire input file(s) even though the pipe has closed.
>>>
>>> Examples:
>>>
>>> FreeBSD:
>>> time ra -r argus.log | grep -q "^."
>>> real    0m0.015s
>>> user    0m0.009s
>>> sys     0m0.007s
>>>
>>> Linux:
>>> time ra -r argus.log | grep -q "^."
>>> real    0m6.555s
>>> user    0m6.284s
>>> sys     0m0.284s
>>>
>>>
>>>
>>> As a comparison, here is the time to process the entire argus.log:
>>>
>>> FreeBSD:
>>> time ra -r argus-.log > /dev/null
>>> real    0m5.909s
>>> user    0m5.804s
>>> sys     0m0.108s
>>>
>>> Linux:
>>> time ra -r argus.log > /dev/null
>>> real    0m6.461s
>>> user    0m6.204s
>>> sys     0m0.256s
>>>
>>>
>>> Conclusion:
>>> Both FreeBSD and Linux take approx 6 seconds to parse the entire logfile.
>>> On FreeBSD, pipe closure is detected correctly and ra terminates after approx 0.015s. Good.
>>> On linux, pipe closure seem not to be detected at all, and the *entire* file is processed. Ra terminates after approx 6s.
>>> (if the logfile is huge, it takes a loooong time before the ra process terminates)
>>>
>>>
>>> Please let me know if I can help with any additional details to solve this.
>>>
>>> /Elof
>>>
>>>
>>>
>>> On Wed, 11 May 2011, Carter Bullard wrote:
>>>
>>>> On my laptop, I am getting about 250K records per sec throughput with ra(), which is a lot of records per day, but, it's not 1M records per sec, so there is room for improvement.
>>>> There is something in initialization that is very slow, and I'll look at it.  strace() may show what it is.
>>>> Carter
>>>> On May 11, 2011, at 4:47 AM, elof2 at sentor.se wrote:
>>>>> On Tue, 10 May 2011, Carter Bullard wrote:
>>>>>> Not sure where the 288 files == 24 hours?
>>>>> For example I have a directory with 288 argus-files. Each file hold 5 minutes worth of data. I.e. 288*5min=1440min=24h.
>>>>> That was just an example to illustrate that if I want to perform some action on the argus data for this day, the shortest execution time even if I use -qN1 will be 0.40s using ra v3, where ra v2 take 0.008s.
>>>>> Even though 0.40s is not a long time, it is 50 times longer than 0.008s.
>>>>> This is not a biggie, I just wanted to inform you about the big difference between v2 and v3.
>>>>>> ra() is threaded, and may read most of the file in before it starts to write any output.  You could remove the threads and see if it picks up.
>>>>>> % ./configure --without-threads; make clean; make
>>>>> I tried this just now. No change. The unthreaded ra still take 0.40s just to start and exit.
>>>>> time ra -qN1 -r argus1.log ... argus288.log
>>>>> real    0m0.406s
>>>>> user    0m0.220s
>>>>> sys     0m0.192s
>>>>> As I said, not a biggie. Just FYI.
>>>>>> I'll checkout why we may be processing all the input files if the output pipe closes.
>>>>> Thanks! In my opinion this is a biggie.
>>>>> /Elof
>>>>>> On May 10, 2011, at 11:43 AM, elof2 at sentor.se wrote:
>>>>>>> Hi Carter!
>>>>>>> It seems that the bugfix to detect closure of the pipe still have issues.
>>>>>>> In ra v2.x I was used to having the ra command terminate in 0.004s.
>>>>>>> With the new ra 3.0.5.8, it won't terminate as swiftly.
>>>>>>> ra v3.0.5.8:
>>>>>>> time ra -nr argus.log - > /dev/null
>>>>>>> real    0m4.499s
>>>>>>> user    0m4.288s
>>>>>>> sys     0m0.212s
>>>>>>> Time to parse through 100% of this test-file is apparently ~4.5s.
>>>>>>> Now lets see the time when grep or head terminates directly on the first line:
>>>>>>> time ra -nr argus.log - | grep -q "."
>>>>>>> real    0m4.562s
>>>>>>> user    0m4.392s
>>>>>>> sys     0m0.176s
>>>>>>> time ra -nr argus.log - | head -1
>>>>>>> real    0m4.588s
>>>>>>> user    0m4.428s
>>>>>>> sys     0m0.168s
>>>>>>> Both commands take the full 4.5s (plus a little more) to finish.
>>>>>>> If I add another (or the same) logfile, it will double the execution time:
>>>>>>> time ra -nr argus.log argus.log - | head -1
>>>>>>> real    0m9.002s
>>>>>>> user    0m8.617s
>>>>>>> sys     0m0.392s
>>>>>>> Conclusion:
>>>>>>> ra 3.0.5.8 don't immediately terminate upon pipe closure.
>>>>>>> Speaking of latency and execution times...
>>>>>>> Even when using the builtin -N1 option, ra v2 is way faster than v3:
>>>>>>> ra v2.x:
>>>>>>> time ra -N1 -nr argus.log -
>>>>>>> real    0m0.003s
>>>>>>> user    0m0.000s
>>>>>>> sys     0m0.000s
>>>>>>> ra 3.0.5.8:
>>>>>>> real    0m0.024s
>>>>>>> user    0m0.000s
>>>>>>> sys     0m0.021s
>>>>>>> v3 is 8 times slower than v2 when it comes to swift termination.
>>>>>>> Can this be improved?
>>>>>>> Also, no matter how many files I add to ra v2, it still terminates after 0.003 seconds where as ra v3 will increase its 0.024s by approx 0.002 per file:
>>>>>>> ra v2.x:
>>>>>>> time ra -N1 -nr argus.log argus.log argus.log argus.log argus.log argus.log argus.log argus.log argus.log argus.log -
>>>>>>> real    0m0.003s  <-  still 0.003s
>>>>>>> user    0m0.000s
>>>>>>> sys     0m0.002s
>>>>>>> ra 3.0.5.8:
>>>>>>> real    0m0.032s  <-  increases more and more for every file added
>>>>>>> user    0m0.007s
>>>>>>> sys     0m0.024s
>>>>>>> If you have 288 argus files in a log dir and use ra -nr argus*.log to do some simple processing, you will always have a noticeable lag just for the sake of the 288 files, even if you use -N1. Can this be improved to be as swift as v2?
>>>>>>> (288 files á 5 minutes = 24 hours)
>>>>>>> /Elof
>>>>>>> On Fri, 29 Apr 2011, Carter Bullard wrote:
>>>>>>>> Hey Elof,
>>>>>>>> I have the closed piped issue fixed now.  Its a small fix in every client, so you should wait to get a new client release.
>>>>>>>> Carter
>>>>>>>>>>> Two examples using the old ra v2.x :
>>>>>>>>>>> 1)
>>>>>>>>>>> time ra -Zb -nr argus.log - ip | head -1
>>>>>>>>>>> 04-27-11 23:54:00.925018  s        tcp             10.10.10.10.46382 ->   20.20.20.20.1503   5467  5550  967561   2559850     PA_PA
>>>>>>>>>>> real    0m0.004s
>>>>>>>>>>> user    0m0.000s
>>>>>>>>>>> sys     0m0.000s
>>>>>>>>>>> 2)
>>>>>>>>>>> time ra -Zb -nr argus-20110428.log - ip | grep -q "PA_PA$"
>>>>>>>>>>> real    0m0.004s
>>>>>>>>>>> user    0m0.000s
>>>>>>>>>>> sys     0m0.000s
>>>>>>>>>>> Both examples only take 0.004s to execute.
>>>>>>>>>>> Good!
>>>>>>>>>>> Now I try the same using ra v3 :
>>>>>>>>>>> 1)
>>>>>>>>>>> time ra -Zb -nr argus.log - ip | head -1
>>>>>>>>>>> 04-27-11 23:54:00.925018  s        tcp             30.30.30.30.32451 ->   40.40.40.40.25    52  97  1151   12274     PA_PA
>>>>>>>>>>> real    0m50.043s
>>>>>>>>>>> user    0m48.171s
>>>>>>>>>>> sys     0m1.892s
>>>>>>>>>>> 2)
>>>>>>>>>>> time ra -Zb -nr argus-20110429.13.log - ip | grep -q "PA_PA$"
>>>>>>>>>>> real    0m50.161s
>>>>>>>>>>> user    0m48.103s
>>>>>>>>>>> sys     0m2.068s
>>>>>>>>>>> The commands take 50 seconds (!!!) to complete when they should have taken approx 0.004 seconds.
>>>>>>>>>>> The argus.log file is 96 MB and hold 828 372 rows of data.
>>>>>>>>>>> (The larger the file, the longer I have to wait until the command terminates. If I have an old argus.log holding 24 hours of data, it will take ~30 minutes to parse it (even longer if it is bzipped))
>>>>>>>>>>> It takes 50 seconds just to parse through 100% of the file:
>>>>>>>>>>> time ra -nr argus.log - ip > /dev/null
>>>>>>>>>>> real    0m50.257s
>>>>>>>>>>> user    0m48.599s
>>>>>>>>>>> sys     0m1.596s
>>>
>
>


More information about the argus mailing list