BUG - No immediate exit when pipeing ra output

elof2 at sentor.se elof2 at sentor.se
Wed May 25 07:43:31 EDT 2011


Hi Carter.
I have some unfortunate news.

Ra Version 3.0.5.11 still won't terminate swiftly upon pipe closure on 
linux:

time ra -nr argus2.log | head -1
real    0m6.984s
user    0m6.708s
sys     0m0.284s

/Elof


On Wed, 18 May 2011, elof2 at sentor.se wrote:

>
> Hi Carter!
> It seems that the bugfix to detect closure of the pipe still have issues, at 
> least on linux.
>
> I've compiled argus-clients 3.0.5.10 on both linux and FreeBSD.
>
> On FreeBSD, ra immediately terminates on pipe closure.
> On linux however, ra keep processing the entire input file(s) even though the 
> pipe has closed.
>
> Examples:
>
> FreeBSD:
> time ra -r argus.log | grep -q "^."
> real    0m0.015s
> user    0m0.009s
> sys     0m0.007s
>
> Linux:
> time ra -r argus.log | grep -q "^."
> real    0m6.555s
> user    0m6.284s
> sys     0m0.284s
>
>
>
> As a comparison, here is the time to process the entire argus.log:
>
> FreeBSD:
> time ra -r argus-.log > /dev/null
> real    0m5.909s
> user    0m5.804s
> sys     0m0.108s
>
> Linux:
> time ra -r argus.log > /dev/null
> real    0m6.461s
> user    0m6.204s
> sys     0m0.256s
>
>
> Conclusion:
> Both FreeBSD and Linux take approx 6 seconds to parse the entire logfile.
> On FreeBSD, pipe closure is detected correctly and ra terminates after approx 
> 0.015s. Good.
> On linux, pipe closure seem not to be detected at all, and the *entire* file 
> is processed. Ra terminates after approx 6s.
> (if the logfile is huge, it takes a loooong time before the ra process 
> terminates)
>
>
> Please let me know if I can help with any additional details to solve this.
>
> /Elof
>
>
>
> On Wed, 11 May 2011, Carter Bullard wrote:
>
>> On my laptop, I am getting about 250K records per sec throughput with ra(), 
>> which is a lot of records per day, but, it's not 1M records per sec, so 
>> there is room for improvement.
>> 
>> There is something in initialization that is very slow, and I'll look at 
>> it.  strace() may show what it is.
>> 
>> Carter
>> 
>> On May 11, 2011, at 4:47 AM, elof2 at sentor.se wrote:
>> 
>>> 
>>> On Tue, 10 May 2011, Carter Bullard wrote:
>>>> Not sure where the 288 files == 24 hours?
>>> 
>>> For example I have a directory with 288 argus-files. Each file hold 5 
>>> minutes worth of data. I.e. 288*5min=1440min=24h.
>>> That was just an example to illustrate that if I want to perform some 
>>> action on the argus data for this day, the shortest execution time even if 
>>> I use -qN1 will be 0.40s using ra v3, where ra v2 take 0.008s.
>>> 
>>> Even though 0.40s is not a long time, it is 50 times longer than 0.008s.
>>> This is not a biggie, I just wanted to inform you about the big difference 
>>> between v2 and v3.
>>> 
>>> 
>>>> ra() is threaded, and may read most of the file in before it starts to 
>>>> write any output.  You could remove the threads and see if it picks up.
>>>>  % ./configure --without-threads; make clean; make
>>> 
>>> I tried this just now. No change. The unthreaded ra still take 0.40s just 
>>> to start and exit.
>>>  time ra -qN1 -r argus1.log ... argus288.log
>>>  real    0m0.406s
>>>  user    0m0.220s
>>>  sys     0m0.192s
>>> 
>>> As I said, not a biggie. Just FYI.
>>> 
>>> 
>>>> I'll checkout why we may be processing all the input files if the output 
>>>> pipe closes.
>>> 
>>> Thanks! In my opinion this is a biggie.
>>> 
>>> /Elof
>>> 
>>> 
>>> 
>>>> On May 10, 2011, at 11:43 AM, elof2 at sentor.se wrote:
>>>> 
>>>>> 
>>>>> Hi Carter!
>>>>> 
>>>>> It seems that the bugfix to detect closure of the pipe still have 
>>>>> issues.
>>>>> 
>>>>> In ra v2.x I was used to having the ra command terminate in 0.004s.
>>>>> With the new ra 3.0.5.8, it won't terminate as swiftly.
>>>>> 
>>>>> ra v3.0.5.8:
>>>>> time ra -nr argus.log - > /dev/null
>>>>> real    0m4.499s
>>>>> user    0m4.288s
>>>>> sys     0m0.212s
>>>>> Time to parse through 100% of this test-file is apparently ~4.5s.
>>>>> 
>>>>> 
>>>>> Now lets see the time when grep or head terminates directly on the first 
>>>>> line:
>>>>> time ra -nr argus.log - | grep -q "."
>>>>> real    0m4.562s
>>>>> user    0m4.392s
>>>>> sys     0m0.176s
>>>>> time ra -nr argus.log - | head -1
>>>>> real    0m4.588s
>>>>> user    0m4.428s
>>>>> sys     0m0.168s
>>>>> 
>>>>> Both commands take the full 4.5s (plus a little more) to finish.
>>>>> 
>>>>> If I add another (or the same) logfile, it will double the execution 
>>>>> time:
>>>>> time ra -nr argus.log argus.log - | head -1
>>>>> real    0m9.002s
>>>>> user    0m8.617s
>>>>> sys     0m0.392s
>>>>> 
>>>>> Conclusion:
>>>>> ra 3.0.5.8 don't immediately terminate upon pipe closure.
>>>>> 
>>>>> 
>>>>> 
>>>>> Speaking of latency and execution times...
>>>>> 
>>>>> Even when using the builtin -N1 option, ra v2 is way faster than v3:
>>>>> ra v2.x:
>>>>> time ra -N1 -nr argus.log -
>>>>> real    0m0.003s
>>>>> user    0m0.000s
>>>>> sys     0m0.000s
>>>>> 
>>>>> ra 3.0.5.8:
>>>>> real    0m0.024s
>>>>> user    0m0.000s
>>>>> sys     0m0.021s
>>>>> 
>>>>> v3 is 8 times slower than v2 when it comes to swift termination.
>>>>> Can this be improved?
>>>>> 
>>>>> 
>>>>> Also, no matter how many files I add to ra v2, it still terminates after 
>>>>> 0.003 seconds where as ra v3 will increase its 0.024s by approx 0.002 
>>>>> per file:
>>>>> ra v2.x:
>>>>> time ra -N1 -nr argus.log argus.log argus.log argus.log argus.log 
>>>>> argus.log argus.log argus.log argus.log argus.log -
>>>>> real    0m0.003s  <-  still 0.003s
>>>>> user    0m0.000s
>>>>> sys     0m0.002s
>>>>> 
>>>>> ra 3.0.5.8:
>>>>> real    0m0.032s  <-  increases more and more for every file added
>>>>> user    0m0.007s
>>>>> sys     0m0.024s
>>>>> 
>>>>> If you have 288 argus files in a log dir and use ra -nr argus*.log to do 
>>>>> some simple processing, you will always have a noticeable lag just for 
>>>>> the sake of the 288 files, even if you use -N1. Can this be improved to 
>>>>> be as swift as v2?
>>>>> (288 files á 5 minutes = 24 hours)
>>>>> 
>>>>> /Elof
>>>>> 
>>>>> 
>>>>> On Fri, 29 Apr 2011, Carter Bullard wrote:
>>>>> 
>>>>>> Hey Elof,
>>>>>> I have the closed piped issue fixed now.  Its a small fix in every 
>>>>>> client, so you should wait to get a new client release.
>>>>> 
>>>>>> Carter
>>>>> 
>>>>> 
>>>>> 
>>>>>>>>> Two examples using the old ra v2.x :
>>>>>>>>> 
>>>>>>>>> 1)
>>>>>>>>> time ra -Zb -nr argus.log - ip | head -1
>>>>>>>>> 04-27-11 23:54:00.925018  s        tcp             10.10.10.10.46382 
>>>>>>>>> ->   20.20.20.20.1503   5467  5550  967561   2559850     PA_PA
>>>>>>>>> 
>>>>>>>>> real    0m0.004s
>>>>>>>>> user    0m0.000s
>>>>>>>>> sys     0m0.000s
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 2)
>>>>>>>>> time ra -Zb -nr argus-20110428.log - ip | grep -q "PA_PA$"
>>>>>>>>> 
>>>>>>>>> real    0m0.004s
>>>>>>>>> user    0m0.000s
>>>>>>>>> sys     0m0.000s
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Both examples only take 0.004s to execute.
>>>>>>>>> Good!
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Now I try the same using ra v3 :
>>>>>>>>> 
>>>>>>>>> 1)
>>>>>>>>> time ra -Zb -nr argus.log - ip | head -1
>>>>>>>>> 04-27-11 23:54:00.925018  s        tcp             30.30.30.30.32451 
>>>>>>>>> ->   40.40.40.40.25    52  97  1151   12274     PA_PA
>>>>>>>>> 
>>>>>>>>> real    0m50.043s
>>>>>>>>> user    0m48.171s
>>>>>>>>> sys     0m1.892s
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 2)
>>>>>>>>> time ra -Zb -nr argus-20110429.13.log - ip | grep -q "PA_PA$"
>>>>>>>>> 
>>>>>>>>> real    0m50.161s
>>>>>>>>> user    0m48.103s
>>>>>>>>> sys     0m2.068s
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> The commands take 50 seconds (!!!) to complete when they should have 
>>>>>>>>> taken approx 0.004 seconds.
>>>>>>>>> 
>>>>>>>>> The argus.log file is 96 MB and hold 828 372 rows of data.
>>>>>>>>> (The larger the file, the longer I have to wait until the command 
>>>>>>>>> terminates. If I have an old argus.log holding 24 hours of data, it 
>>>>>>>>> will take ~30 minutes to parse it (even longer if it is bzipped))
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> It takes 50 seconds just to parse through 100% of the file:
>>>>>>>>> time ra -nr argus.log - ip > /dev/null
>>>>>>>>> 
>>>>>>>>> real    0m50.257s
>>>>>>>>> user    0m48.599s
>>>>>>>>> sys     0m1.596s
>>>> 
>


More information about the argus mailing list