BUG - No immediate exit when pipeing ra output
elof2 at sentor.se
elof2 at sentor.se
Wed May 25 07:43:31 EDT 2011
Hi Carter.
I have some unfortunate news.
Ra Version 3.0.5.11 still won't terminate swiftly upon pipe closure on
linux:
time ra -nr argus2.log | head -1
real 0m6.984s
user 0m6.708s
sys 0m0.284s
/Elof
On Wed, 18 May 2011, elof2 at sentor.se wrote:
>
> Hi Carter!
> It seems that the bugfix to detect closure of the pipe still have issues, at
> least on linux.
>
> I've compiled argus-clients 3.0.5.10 on both linux and FreeBSD.
>
> On FreeBSD, ra immediately terminates on pipe closure.
> On linux however, ra keep processing the entire input file(s) even though the
> pipe has closed.
>
> Examples:
>
> FreeBSD:
> time ra -r argus.log | grep -q "^."
> real 0m0.015s
> user 0m0.009s
> sys 0m0.007s
>
> Linux:
> time ra -r argus.log | grep -q "^."
> real 0m6.555s
> user 0m6.284s
> sys 0m0.284s
>
>
>
> As a comparison, here is the time to process the entire argus.log:
>
> FreeBSD:
> time ra -r argus-.log > /dev/null
> real 0m5.909s
> user 0m5.804s
> sys 0m0.108s
>
> Linux:
> time ra -r argus.log > /dev/null
> real 0m6.461s
> user 0m6.204s
> sys 0m0.256s
>
>
> Conclusion:
> Both FreeBSD and Linux take approx 6 seconds to parse the entire logfile.
> On FreeBSD, pipe closure is detected correctly and ra terminates after approx
> 0.015s. Good.
> On linux, pipe closure seem not to be detected at all, and the *entire* file
> is processed. Ra terminates after approx 6s.
> (if the logfile is huge, it takes a loooong time before the ra process
> terminates)
>
>
> Please let me know if I can help with any additional details to solve this.
>
> /Elof
>
>
>
> On Wed, 11 May 2011, Carter Bullard wrote:
>
>> On my laptop, I am getting about 250K records per sec throughput with ra(),
>> which is a lot of records per day, but, it's not 1M records per sec, so
>> there is room for improvement.
>>
>> There is something in initialization that is very slow, and I'll look at
>> it. strace() may show what it is.
>>
>> Carter
>>
>> On May 11, 2011, at 4:47 AM, elof2 at sentor.se wrote:
>>
>>>
>>> On Tue, 10 May 2011, Carter Bullard wrote:
>>>> Not sure where the 288 files == 24 hours?
>>>
>>> For example I have a directory with 288 argus-files. Each file hold 5
>>> minutes worth of data. I.e. 288*5min=1440min=24h.
>>> That was just an example to illustrate that if I want to perform some
>>> action on the argus data for this day, the shortest execution time even if
>>> I use -qN1 will be 0.40s using ra v3, where ra v2 take 0.008s.
>>>
>>> Even though 0.40s is not a long time, it is 50 times longer than 0.008s.
>>> This is not a biggie, I just wanted to inform you about the big difference
>>> between v2 and v3.
>>>
>>>
>>>> ra() is threaded, and may read most of the file in before it starts to
>>>> write any output. You could remove the threads and see if it picks up.
>>>> % ./configure --without-threads; make clean; make
>>>
>>> I tried this just now. No change. The unthreaded ra still take 0.40s just
>>> to start and exit.
>>> time ra -qN1 -r argus1.log ... argus288.log
>>> real 0m0.406s
>>> user 0m0.220s
>>> sys 0m0.192s
>>>
>>> As I said, not a biggie. Just FYI.
>>>
>>>
>>>> I'll checkout why we may be processing all the input files if the output
>>>> pipe closes.
>>>
>>> Thanks! In my opinion this is a biggie.
>>>
>>> /Elof
>>>
>>>
>>>
>>>> On May 10, 2011, at 11:43 AM, elof2 at sentor.se wrote:
>>>>
>>>>>
>>>>> Hi Carter!
>>>>>
>>>>> It seems that the bugfix to detect closure of the pipe still have
>>>>> issues.
>>>>>
>>>>> In ra v2.x I was used to having the ra command terminate in 0.004s.
>>>>> With the new ra 3.0.5.8, it won't terminate as swiftly.
>>>>>
>>>>> ra v3.0.5.8:
>>>>> time ra -nr argus.log - > /dev/null
>>>>> real 0m4.499s
>>>>> user 0m4.288s
>>>>> sys 0m0.212s
>>>>> Time to parse through 100% of this test-file is apparently ~4.5s.
>>>>>
>>>>>
>>>>> Now lets see the time when grep or head terminates directly on the first
>>>>> line:
>>>>> time ra -nr argus.log - | grep -q "."
>>>>> real 0m4.562s
>>>>> user 0m4.392s
>>>>> sys 0m0.176s
>>>>> time ra -nr argus.log - | head -1
>>>>> real 0m4.588s
>>>>> user 0m4.428s
>>>>> sys 0m0.168s
>>>>>
>>>>> Both commands take the full 4.5s (plus a little more) to finish.
>>>>>
>>>>> If I add another (or the same) logfile, it will double the execution
>>>>> time:
>>>>> time ra -nr argus.log argus.log - | head -1
>>>>> real 0m9.002s
>>>>> user 0m8.617s
>>>>> sys 0m0.392s
>>>>>
>>>>> Conclusion:
>>>>> ra 3.0.5.8 don't immediately terminate upon pipe closure.
>>>>>
>>>>>
>>>>>
>>>>> Speaking of latency and execution times...
>>>>>
>>>>> Even when using the builtin -N1 option, ra v2 is way faster than v3:
>>>>> ra v2.x:
>>>>> time ra -N1 -nr argus.log -
>>>>> real 0m0.003s
>>>>> user 0m0.000s
>>>>> sys 0m0.000s
>>>>>
>>>>> ra 3.0.5.8:
>>>>> real 0m0.024s
>>>>> user 0m0.000s
>>>>> sys 0m0.021s
>>>>>
>>>>> v3 is 8 times slower than v2 when it comes to swift termination.
>>>>> Can this be improved?
>>>>>
>>>>>
>>>>> Also, no matter how many files I add to ra v2, it still terminates after
>>>>> 0.003 seconds where as ra v3 will increase its 0.024s by approx 0.002
>>>>> per file:
>>>>> ra v2.x:
>>>>> time ra -N1 -nr argus.log argus.log argus.log argus.log argus.log
>>>>> argus.log argus.log argus.log argus.log argus.log -
>>>>> real 0m0.003s <- still 0.003s
>>>>> user 0m0.000s
>>>>> sys 0m0.002s
>>>>>
>>>>> ra 3.0.5.8:
>>>>> real 0m0.032s <- increases more and more for every file added
>>>>> user 0m0.007s
>>>>> sys 0m0.024s
>>>>>
>>>>> If you have 288 argus files in a log dir and use ra -nr argus*.log to do
>>>>> some simple processing, you will always have a noticeable lag just for
>>>>> the sake of the 288 files, even if you use -N1. Can this be improved to
>>>>> be as swift as v2?
>>>>> (288 files á 5 minutes = 24 hours)
>>>>>
>>>>> /Elof
>>>>>
>>>>>
>>>>> On Fri, 29 Apr 2011, Carter Bullard wrote:
>>>>>
>>>>>> Hey Elof,
>>>>>> I have the closed piped issue fixed now. Its a small fix in every
>>>>>> client, so you should wait to get a new client release.
>>>>>
>>>>>> Carter
>>>>>
>>>>>
>>>>>
>>>>>>>>> Two examples using the old ra v2.x :
>>>>>>>>>
>>>>>>>>> 1)
>>>>>>>>> time ra -Zb -nr argus.log - ip | head -1
>>>>>>>>> 04-27-11 23:54:00.925018 s tcp 10.10.10.10.46382
>>>>>>>>> -> 20.20.20.20.1503 5467 5550 967561 2559850 PA_PA
>>>>>>>>>
>>>>>>>>> real 0m0.004s
>>>>>>>>> user 0m0.000s
>>>>>>>>> sys 0m0.000s
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2)
>>>>>>>>> time ra -Zb -nr argus-20110428.log - ip | grep -q "PA_PA$"
>>>>>>>>>
>>>>>>>>> real 0m0.004s
>>>>>>>>> user 0m0.000s
>>>>>>>>> sys 0m0.000s
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Both examples only take 0.004s to execute.
>>>>>>>>> Good!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Now I try the same using ra v3 :
>>>>>>>>>
>>>>>>>>> 1)
>>>>>>>>> time ra -Zb -nr argus.log - ip | head -1
>>>>>>>>> 04-27-11 23:54:00.925018 s tcp 30.30.30.30.32451
>>>>>>>>> -> 40.40.40.40.25 52 97 1151 12274 PA_PA
>>>>>>>>>
>>>>>>>>> real 0m50.043s
>>>>>>>>> user 0m48.171s
>>>>>>>>> sys 0m1.892s
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2)
>>>>>>>>> time ra -Zb -nr argus-20110429.13.log - ip | grep -q "PA_PA$"
>>>>>>>>>
>>>>>>>>> real 0m50.161s
>>>>>>>>> user 0m48.103s
>>>>>>>>> sys 0m2.068s
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The commands take 50 seconds (!!!) to complete when they should have
>>>>>>>>> taken approx 0.004 seconds.
>>>>>>>>>
>>>>>>>>> The argus.log file is 96 MB and hold 828 372 rows of data.
>>>>>>>>> (The larger the file, the longer I have to wait until the command
>>>>>>>>> terminates. If I have an old argus.log holding 24 hours of data, it
>>>>>>>>> will take ~30 minutes to parse it (even longer if it is bzipped))
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It takes 50 seconds just to parse through 100% of the file:
>>>>>>>>> time ra -nr argus.log - ip > /dev/null
>>>>>>>>>
>>>>>>>>> real 0m50.257s
>>>>>>>>> user 0m48.599s
>>>>>>>>> sys 0m1.596s
>>>>
>
More information about the argus
mailing list