Rewrite of rastream

Carter Bullard carter at qosient.com
Tue Jun 4 13:43:23 EDT 2013


Sounds like some progress.  That's good.
Not sure that I know what you mean by "the file with excess data is still generate".
What do you expect to happen, and what is it doing?

Carter 

On Jun 4, 2013, at 1:34 PM, Matt Brown <matthewbrown at gmail.com> wrote:

> Carter,
> 
> -d no longer causes the script to execute every second.  However the file with excess data is still generate when the script is executed.  racluster has not yet logged a segfault in syslog.
> 
> 
> Thanks,
> 
> Matt
> 
> On Jun 3, 2013, at 11:17 AM, Carter Bullard <carter at qosient.com> wrote:
> 
>> Hey Matt,
>> Don't forget to give the new rastream a run.  I have had a stability issue with 
>> this new rastream(), on one of my systems, so please give this one a try, and
>> yell back if you get what you want, or not !!!!  I'll want to release argus-clients-3.0.8
>> soon after this is done.
>> 
>> Hope all is most excellent,
>> 
>> Carter
>> 
>> On May 31, 2013, at 2:07 AM, Carter Bullard <carter at qosient.com> wrote:
>> 
>>> Hey Matt,
>>> OK, so here is a complete rewrite of rastream() that should fix your problem,
>>> but if it doesn't holler.  Just replace the file ./examples/rastream/rastream.c
>>> with the included file and recompile.
>>> 
>>> There might be a minor memory leak, but I'll do that testing tomorrow.
>>> 
>>> Carter
>>> 
>>> <rastream.c>
>>> 
>>> 
>>> On May 30, 2013, at 4:15 PM, Carter Bullard <carter at qosient.com> wrote:
>>> 
>>>> Well, that is a new twist, and will explain all your problems, running
>>>> the script every second, …, that isn't correct.  rastream() doesn't do this on
>>>> my machine, but I'll have to pay attention to test that …...
>>>> 
>>>> Just so you will know, when we, or anyone for that matter, does the
>>>> UNIX daemon thing,  they must change their directory to a known
>>>> place in the filesystem, which traditionally is "/", so the " -d " operation
>>>> always has an effect on the environment, and that is about the only
>>>> thing that is different.
>>>> 
>>>> Carter
>>>> 
>>>> 
>>>> 
>>>> On May 30, 2013, at 3:53 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>> 
>>>>> Hello Carter,
>>>>>  
>>>>> Thanks for persistently looking into this issue.
>>>>>  
>>>>> I'm happy to a guinea pig for the new version, just advise where I can locate the test version.
>>>>>  
>>>>> Unforunately, I seriously just started using argus four weeks ago (after my short stint with it last year), so I can't make any serious recommendations at this time.  Although, having some functionality of ralabel would be nice (like cco)?  But maybe that's the point of client separation.
>>>>>  
>>>>>  
>>>>> I hate to be so stubborn :) but I don't think there's anything wrong with the environment.
>>>>>  
>>>>> The contents of the script are that of ../clients/support/Config/rastream.sh  No more, no less.
>>>>>  
>>>>> It _is_ worth noting (again) that this does only happen when I used the daemon forking mechanism (-d) to start rastream.
>>>>>  
>>>>> It sounds like you hit whatever nail on whatever head, and I look forward to testing the new version!
>>>>>  
>>>>>  
>>>>> So, fine!  Even though I don't believe it's the environment, I've added the following lines to rastream.sh befow the `racluster -M replace -r $FILES`:
>>>>>  
>>>>> echo $(date) " which racluster: " $(which racluster) &>> /var/log/rastreamtest.log
>>>>> echo $(date) " \$FILES: " $FILES &>> /var/log/rastreamtest.log
>>>>>  
>>>>>  
>>>>> What I see mildly astounds me:
>>>>>  
>>>>> /usr/local/bin/rastream -d -S 127.0.0.1:561 -B 15s -M time 1m -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -F /etc/rastream.conf
>>>>>  
>>>>> There is a log entries in /var/log/rastreamtest.log for every single second.
>>>>>  
>>>>> Thu May 30 15:43:21 EDT 2013  which racluster:  /usr/local/bin/racluster
>>>>> Thu May 30 15:43:21 EDT 2013  $FILES:  /var/opt/argus/2013-05-30/argus_15:42:00
>>>>> gzip: /var/opt/argus/2013-05-30/argus_15:42:00.gz already exists;       not overwritten
>>>>> Thu May 30 15:43:22 EDT 2013  which racluster:  /usr/local/bin/racluster
>>>>> Thu May 30 15:43:22 EDT 2013  $FILES:  /var/opt/argus/2013-05-30/argus_15:42:00
>>>>> gzip: /var/opt/argus/2013-05-30/argus_15:42:00.gz already exists;       not overwritten
>>>>>  
>>>>>  
>>>>>  
>>>>> nohup /usr/local/bin/rastream -S 127.0.0.1:561 -B 15s -M time 1m -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -F /etc/rastream.conf &
>>>>>  
>>>>> There is a log entry as expected, once a minute:
>>>>> Thu May 30 15:47:17 EDT 2013  which racluster:  /usr/local/bin/racluster
>>>>> Thu May 30 15:47:17 EDT 2013  $FILES:  /var/opt/argus/2013-05-30/argus_15:46:00
>>>>> Thu May 30 15:48:17 EDT 2013  which racluster:  /usr/local/bin/racluster
>>>>> Thu May 30 15:48:17 EDT 2013  $FILES:  /var/opt/argus/2013-05-30/argus_15:47:00
>>>>>  
>>>>>  
>>>>> So regardless, the -d logic and the -f logic do not like each other one bit.
>>>>>  
>>>>> Let me know what you think, and I'll await the new version to test!
>>>>>  
>>>>>  
>>>>> Thanks!
>>>>>  
>>>>> Matt
>>>>> 
>>>>> On May 30, 2013, at 11:33 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>> 
>>>>>> Hey Matt,
>>>>>> OK, so I've found a pretty bad bug in rastream() and rasqlinsert(), based on
>>>>>> your descriptions, but I'm sorry that what I've found is not supportive of
>>>>>> your thesis.  This problem needs to be fixed, but existing implementations
>>>>>> can continue if you are not experiencing problems.
>>>>>> 
>>>>>> OK, the bug is centered around the core tech in rastream(), which was
>>>>>> modeled after rabins(), rather than rasplit().  rabins() has been reworked,
>>>>>> and its working very well, but rastream() is a little behind those changes.
>>>>>> What I'm seeing is flows not being split correctly at the end of a time
>>>>>> period, so that a days data can encroach into the next day by a few
>>>>>> seconds.  This doesn't work for me, at all, so I need to work this out.
>>>>>> 
>>>>>> As a result, I am doing a complete re-write to get rastream() back to
>>>>>> its original design, and should be done Fri/Mon.  I suspect that it will
>>>>>> solve any problems you are having, so are you, will you be willing to
>>>>>> be a guinea pig for this?
>>>>>> 
>>>>>> Since I'm rewriting it based on rasplit(), now would be the time to put
>>>>>> some new feature requests if there are any, and any discussions as
>>>>>> to what sucks and what works in rastream() would be great !!!!!
>>>>>> 
>>>>>> We still need to figure out what is wrong with your environment,
>>>>>> as I think that is at the root of some of what you're seeing.  You don't have
>>>>>> any options that are causing your rastream() to chroot(), like a
>>>>>> "-c" option?
>>>>>> 
>>>>>> Carter
>>>>>> 
>>>>>> 
>>>>>> On May 29, 2013, at 3:23 PM, Carter Bullard <carter at qosient.com> wrote:
>>>>>> 
>>>>>>> Hey Matt,
>>>>>>> Hmmmmmmmm, well, we need to get on the same page.  Based on the last
>>>>>>> set of information, I don't think that you are running the correct code, as you
>>>>>>> are reporting segfault information without symbols, which means you're running
>>>>>>> the wrong set of software.  If you are running the wrong software there, not
>>>>>>> sure that you are running the right software anywhere.  I can give racluster()
>>>>>>> bad filenames, files without permissions, etc…. and it doesn't segfault…..
>>>>>>> 
>>>>>>> So we have to clean things up, then we can hypothesis what the real problem is.
>>>>>>> 
>>>>>>> Now, I have been digging into rastream() and rasqlinsert() code and I've found
>>>>>>> an issue that may be related, but I need to have some certainty with regard to
>>>>>>> your setup, before I take the leap to say that I have a strategy…...
>>>>>>> 
>>>>>>> So, we need to understand what is going on with your rastream() script.
>>>>>>> One thing to do is to simplify the script, have it do some extensive logging
>>>>>>> to a file in /tmp, or /var/tmp or somewhere you like, so we can see what its
>>>>>>> doing……  Like printing out its environment, what it thinks is its root directory,
>>>>>>> its path, etc…...
>>>>>>> 
>>>>>>> Carter
>>>>>>> 
>>>>>>> 
>>>>>>> On May 29, 2013, at 1:43 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>> 
>>>>>>>> Hello carter,
>>>>>>>> 
>>>>>>>> Thanks for replying.
>>>>>>>> 
>>>>>>>> Rastream and racluster are 3.0.7.10.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Are you able to reproduce the issue when using the -d switch with rastream?   I can reproduce it on other systems.
>>>>>>>> 
>>>>>>>> What do you think of my theory?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Hope this helps,
>>>>>>>> 
>>>>>>>> Matt
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On May 29, 2013, at 1:34 PM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>> 
>>>>>>>>> Hey Matt,
>>>>>>>>> So where are we on this?  Its clear that you are not running the version of code
>>>>>>>>> that you think you are running, so we need to clear that up before proceeding
>>>>>>>>> any further ?
>>>>>>>>> 
>>>>>>>>> Carter
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On May 22, 2013, at 4:07 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Sorry about that, I was trying to be verbose in case you spotted something interesting that indicated a clear problem.
>>>>>>>>>> 
>>>>>>>>>> Thanks for getting back to me.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 1) racluster segfaulting:
>>>>>>>>>> 
>>>>>>>>>> racluster is executed via the script assigned by the -f parameter of the rastream instance (/usr/local/bin/rastream.sh).
>>>>>>>>>> 
>>>>>>>>>> When racluster segfaults, its reporting the following to syslog:
>>>>>>>>>> May 22 01:00:18 ny-sentinel kernel: racluster[5558]: segfault at 46 ip 002e1877 sp bfb54e10 error 4 in libc-2.12.so[281000+190000]
>>>>>>>>>> May 22 02:00:18 ny-sentinel racluster[6044]: 02:00:18.586910 open '/var/opt/argus/2013-05-22/argus_01:00:00': No such file or directory
>>>>>>>>>> May 22 02:00:18 ny-sentinel kernel: racluster[6044]: segfault at 46 ip 0020d877 sp bf854d50 error 4 in libc-2.12.so[1ad000+190000]
>>>>>>>>>> May 22 03:00:18 ny-sentinel racluster[6520]: 03:00:18.534326 open '/var/opt/argus/2013-05-22/argus_02:00:00': No such file or directory
>>>>>>>>>> May 22 03:00:18 ny-sentinel kernel: racluster[6520]: segfault at 46 ip 0019c877 sp bfaf2b30 error 4 in libc-2.12.so[13c000+190000]
>>>>>>>>>> May 22 06:00:18 ny-sentinel racluster[8015]: 06:00:18.786627 open '/var/opt/argus/2013-05-22/argus_05:00:00': No such file or directory
>>>>>>>>>> May 22 06:00:18 ny-sentinel kernel: racluster[8015]: segfault at 46 ip 002a2877 sp bfa74270 error 4 in libc-2.12.so[242000+190000]
>>>>>>>>>> May 22 07:00:18 ny-sentinel racluster[8517]: 07:00:18.386497 open '/var/opt/argus/2013-05-22/argus_06:00:00': No such file or directory
>>>>>>>>>> May 22 07:00:18 ny-sentinel kernel: racluster[8517]: segfault at 46 ip 0073c877 sp bfaeee10 error 4 in libc-2.12.so[6dc000+190000]
>>>>>>>>>> May 22 08:00:18 ny-sentinel racluster[9006]: 08:00:18.591348 open '/var/opt/argus/2013-05-22/argus_07:00:00': No such file or directory
>>>>>>>>>> May 22 08:00:18 ny-sentinel kernel: racluster[9006]: segfault at 46 ip 0023e877 sp bfe6f110 error 4 in libc-2.12.so[1de000+190000]
>>>>>>>>>> May 22 10:00:19 ny-sentinel racluster[10009]: 10:00:19.991831 open '/var/opt/argus/2013-05-22/argus_09:00:00': No such file or directory
>>>>>>>>>> May 22 10:00:19 ny-sentinel kernel: racluster[10009]: segfault at 46 ip 00627877sp bfd4cf50 error 4 in libc-2.12.so[5c7000+190000]
>>>>>>>>>> May 22 13:00:20 ny-sentinel racluster[11556]: 13:00:20.585851 open '/var/opt/argus/2013-05-22/argus_12:00:00': No such file or directory
>>>>>>>>>> May 22 13:00:20 ny-sentinel kernel: racluster[11556]: segfault at 46 ip 002bb877 sp bfda4480 error 4 in libc-2.12.so[25b000+190000]
>>>>>>>>>> May 22 14:00:20 ny-sentinel racluster[12106]: 14:00:20.585760 open '/var/opt/argus/2013-05-22/argus_13:00:00': No such file or directory
>>>>>>>>>> May 22 14:00:20 ny-sentinel kernel: racluster[12106]: segfault at 46 ip 00c3b877 sp bfbebb40 error 4 in libc-2.12.so[bdb000+190000]
>>>>>>>>>> May 22 14:00:20 ny-sentinel racluster[12109]: 14:00:20.997232 open '/var/opt/argus/2013-05-22/argus_13:00:00': No such file or directory
>>>>>>>>>> May 22 14:00:20 ny-sentinel kernel: racluster[12109]: segfault at 46 ip 001ba877 sp bf838800 error 4 in libc-2.12.so[15a000+190000]
>>>>>>>>>> May 22 15:00:19 ny-sentinel racluster[12619]: 15:00:19.898971 open '/var/opt/argus/2013-05-22/argus_14:00:00': No such file or directory
>>>>>>>>>> May 22 15:00:19 ny-sentinel kernel: racluster[12619]: segfault at 46 ip 001d4877 sp bf865ab0 error 4 in libc-2.12.so[174000+190000]
>>>>>>>>>> 
>>>>>>>>>> Note that there is no easily identifiable pattern in which files cause the segfault and which do not, but the problem occurs with every file.
>>>>>>>>>> 
>>>>>>>>>> I can only guess that it is segfaulting because it can't open '/var/opt/argus/N/argus_N' (No such file or directory).
>>>>>>>>>> It's worth noting that in the instance mentioned in the previous email, '/var/opt/argus/2013-05-21/argus_10:00:00' did exist at the time racluster is executed.  I am concluding this because after racluster is executed, gzip compresses '/var/opt/argus/2013-05-21/argus_10:00:00' successfully.
>>>>>>>>>> 
>>>>>>>>>> I don't believe the segfault is the problem.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 2) Can I run racluster against the file and get good results?
>>>>>>>>>> 
>>>>>>>>>> Yes, I can run racluster against both the gzipped resultant file, and the file that is generated by rastream that wasn't gzipped.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 3) Multiple client versions:
>>>>>>>>>> 
>>>>>>>>>> When I mention modifying which racluster is executed by the shell script (executed by rastream's -f argument), I simply was attempting to have it use the racluster binary that I had compiled with support for debugging and symbols.  My reasoning was so that it would produce more verbose output to syslog, which it didn't.
>>>>>>>>>> 
>>>>>>>>>> There are only two sets of binaries located on the system.  One that has been compiled with debugging support and one that hasn't:
>>>>>>>>>> # find / -iname racluster
>>>>>>>>>> /root/argus-clients-3.0.7.10/bin/racluster #debugging support
>>>>>>>>>> /usr/local/bin/racluster #no debugging support
>>>>>>>>>> 
>>>>>>>>>> /usr/local/bin/rastream is executed and is writing the files I am concerned with (and is causing the problem).
>>>>>>>>>> /usr/local/bin/racluster is currently executed by /usr/local/bin/rastream.sh due to its location in the $PATH (and is the one that is segfaulting occasionally).
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 4) How I see the problem:
>>>>>>>>>> 
>>>>>>>>>> I don't want to make any assumptions, or create a red herring, since I am not familiar with the code of rastream, but I believe that the following is happening:
>>>>>>>>>> 
>>>>>>>>>> 1) `/usr/local/bin/rastream -d -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh` is executed, and then (due to -d) the daemonizing method is called and forks another rastream process and starts writing to /var/opt/argus/Y-m-d/argus_T1. (which is good)
>>>>>>>>>> 2) At `time 1h`, `/usr/local/bin/rastream.sh` is forked, without blocking `rastream` from writing. (good)
>>>>>>>>>> 3) While #2 occurs, rastream is _still_ writing to /var/opt/argus/Y-m-d/argus_T1 (the file from the previous `time 1h` time span, not the new `time 1h` time span). (which is bad)
>>>>>>>>>> 4) `/usr/local/bin/rastream.sh` executes `racluster -M replace -r $FILES` against the file /var/opt/argus/Y-m-d/argus_T1. (bad)
>>>>>>>>>> 5) `/usr/local/bin/rastream.sh` executes `gzip $FILES` and is able to add /var/opt/argus/Y-m-d/argus_T1 (bad)
>>>>>>>>>> 6) `rastream` begins writing /var/opt/argus/Y-m-d/argus_T2 (the new `time 1h` file) (bad)
>>>>>>>>>> 
>>>>>>>>>> So, the daemonized fork of rastream doesn't start writing to the new file (/var/opt/argus/Y-m-d/argus_T2) until _after_ it executes the -f designated script `/usr/local/bin/rastream.sh`.
>>>>>>>>>> Instead, what should be happening is: the daemonized fork of rastream should start writing to the new file, and _then_ execute the -f designated script `/usr/local/bin/rastream.sh`.
>>>>>>>>>> 
>>>>>>>>>> It would seem that the way that rastream is daemonized affects the later -f logic.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Hope this helps.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Thanks for your time,
>>>>>>>>>> 
>>>>>>>>>> Matt
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On May 22, 2013, at 10:45 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hey Matt,
>>>>>>>>>>> Hmmm, so your emails are waaayyyyy too dense for dealing with bugs / problems.
>>>>>>>>>>> Lets focus on one thing at a time.
>>>>>>>>>>> 
>>>>>>>>>>> Your racluster() is segfaulting.  Why is it doing that?  Can you run racluster()
>>>>>>>>>>> against the file and get good results ?  Do you have multiple versions of the
>>>>>>>>>>> clients installed on your machine, so that there is ambiguity in which racluster()
>>>>>>>>>>> or rastream() is actually running?  We should clear that up before proceeding
>>>>>>>>>>> any further.
>>>>>>>>>>> 
>>>>>>>>>>> Carter
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On May 21, 2013, at 1:31 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hello Carter,
>>>>>>>>>>>>  
>>>>>>>>>>>> Thanks for getting back to me quickly.
>>>>>>>>>>>>  
>>>>>>>>>>>> I've tried the following to attempt to troubleshoot the problem, but it doesn't appear that anything's outstanding.
>>>>>>>>>>>>  
>>>>>>>>>>>> Apologies for the verbosity…
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>> # cd ./argus-clients-3.0.7.10
>>>>>>>>>>>> # touch .devel .debug
>>>>>>>>>>>> # ./configure
>>>>>>>>>>>> # make clean && make
>>>>>>>>>>>> # kill -2 $(pgrep rastream)
>>>>>>>>>>>> # ./bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -D4
>>>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455541 ArgusNewQueue () returning 0x8609548
>>>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455612 ArgusNewHashTable (65536) returning 0x8609bc0
>>>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455632 ArgusNewList () returning 0x860a010
>>>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455695 ArgusNewQueue () returning 0x8611878
>>>>>>>>>>>> rastream[29500]: 2013-05-21 10:54:31.455729 started
>>>>>>>>>>>>  
>>>>>>>>>>>> syslog reports:
>>>>>>>>>>>> May 21 10:54:31 ny-sentinel rastream[29500]: 2013-05-21 10:54:31.455729 started
>>>>>>>>>>>> May 21 11:00:18 ny-sentinel racluster[29544]: 2013-05-21 11:00:18.635754 open '/var/opt/argus/2013-05-21/argus_10:00:00': No such file or directory
>>>>>>>>>>>> May 21 11:00:18 ny-sentinel kernel: racluster[29544]: segfault at 46 ip 001d9877 sp bfb9f200 error 4 in libc-2.12.so[179000+190000]
>>>>>>>>>>>>  
>>>>>>>>>>>> With /usr/local/bin/rastream.sh executing `racluster -M replace -r $FILES`, using the racluster in $PATH (as it is originally).
>>>>>>>>>>>>  
>>>>>>>>>>>> Note that '/var/opt/argus/2013-05-21/argus_10:00:00' did exist at the time [but that doesn't mean that racluster can read it ?].
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>> This is interesting:
>>>>>>>>>>>> I modifed /usr/local/bin/rastream.sh to execute `[path]/argus-clients-3.0.7.10/bin/racluster -D4 -M replace -r $FILES` instead of just using the $PATHed racluster…
>>>>>>>>>>>>  
>>>>>>>>>>>> # kill -2 $(pgrep rastream)
>>>>>>>>>>>> # ./argus-clients-3.0.7.10/bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 10m -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -D4
>>>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.528909 ArgusNewQueue () returning 0x9e77550
>>>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.528982 ArgusNewHashTable (65536) returning 0x9e77bc8
>>>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.529008 ArgusNewList () returning 0x9e78018
>>>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.529072 ArgusNewQueue () returning 0x9e7f880
>>>>>>>>>>>> rastream[30131]: 2013-05-21 11:26:53.529102 started
>>>>>>>>>>>>  
>>>>>>>>>>>> syslog reports:
>>>>>>>>>>>> May 21 11:26:53 ny-sentinel rastream[30131]: 2013-05-21 11:26:53.529102 started
>>>>>>>>>>>>  
>>>>>>>>>>>> …and nothing from racluster or anything else at the 10 minute mark, even though the rotation (and problem) still occur:
>>>>>>>>>>>>  
>>>>>>>>>>>> # ll --full-time /var/opt/argus/2013-05-21/ | tail -n 5
>>>>>>>>>>>> -rw-r--r--. 1 root  159764 2013-05-21 11:30:59.466174549 -0400 argus_11:20:00
>>>>>>>>>>>> -rw-r--r--. 1 root  384117 2013-05-21 11:30:17.619561000 -0400 argus_11:20:00.gz
>>>>>>>>>>>> -rw-r--r--. 1 root  181472 2013-05-21 11:41:00.721865476 -0400 argus_11:30:00
>>>>>>>>>>>> -rw-r--r--. 1 root  394783 2013-05-21 11:40:17.511247000 -0400 argus_11:30:00.gz
>>>>>>>>>>>> -rw-r--r--. 1 root 1421440 2013-05-21 11:48:55.328587722 -0400 argus_11:40:00
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>> Reverting rastream.sh to execute racluster in the $PATH (as original) and changing back to 'time 1h' as:
>>>>>>>>>>>> # kill -2 $(pgrep rastream)
>>>>>>>>>>>> # ./argus-clients-3.0.7.10/bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -D4
>>>>>>>>>>>>  
>>>>>>>>>>>> results in syslog:
>>>>>>>>>>>> May 21 12:05:37 ny-sentinel rastream[30905]: 2013-05-21 12:05:37.950988 started
>>>>>>>>>>>>  
>>>>>>>>>>>> …and nothing from racluster (why and how?)), but the problem still exists:
>>>>>>>>>>>>  
>>>>>>>>>>>> # ll --full-time /var/opt/argus/2013-05-21/ | tail -n 3
>>>>>>>>>>>> -rw-r--r--. 1 root  145612 2013-05-21 13:01:00.562634873 -0400 argus_12:00:00
>>>>>>>>>>>> -rw-r--r--. 1 root 2303564 2013-05-21 13:00:18.216161000 -0400 argus_12:00:00.gz
>>>>>>>>>>>> -rw-r--r--. 1 root  315520 2013-05-21 13:02:35.126692699 -0400 argus_13:00:00
>>>>>>>>>>>>  
>>>>>>>>>>>> It's quite obvious that the size of the argus data has increased a very large amount (due to the debugging).
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>> Does anything here indicate the problem?
>>>>>>>>>>>>  
>>>>>>>>>>>> Is there anything else that I can do to further troubleshoot the issue?
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>> Here's some freaky stuff:
>>>>>>>>>>>>  
>>>>>>>>>>>> querying the file argus_12:00:00 for ltime doesn't correlate with the last modified time of the file:
>>>>>>>>>>>>  
>>>>>>>>>>>> # ra -nr argus_12\:00\:00 -s stime ltime -w - | rasort -m stime -s stime ltime | head -n 3
>>>>>>>>>>>>                  StartTime                   LastTime
>>>>>>>>>>>> 2013-05-21 12:59:03.164195 2013-05-21 12:59:06.417385
>>>>>>>>>>>> 2013-05-21 12:59:03.369926 2013-05-21 12:59:03.501171
>>>>>>>>>>>>  
>>>>>>>>>>>> # ra -nr argus_12\:00\:00 -s stime ltime -w - | rasort -m ltime -s stime ltime | tail -n 2
>>>>>>>>>>>> 2013-05-21 12:59:17.117235 2013-05-21 13:00:00.000000
>>>>>>>>>>>> 2013-05-21 12:59:28.896645 2013-05-21 13:00:00.000000
>>>>>>>>>>>>  
>>>>>>>>>>>> # ra -nr argus_12\:00\:00 -s stime ltime | head -n 3
>>>>>>>>>>>>                  StartTime                   LastTime
>>>>>>>>>>>> 2013-05-21 12:05:38.036319 2013-05-21 09:52:57.532792 # this is weird.
>>>>>>>>>>>> 2013-05-21 12:59:27.970584 2013-05-21 12:59:36.325653
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>> querying the file argus_12:00:00.gz:
>>>>>>>>>>>>  
>>>>>>>>>>>> sort by stime:
>>>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s stime ltime -w - | rasort -m stime -s stime ltime | head -n 3
>>>>>>>>>>>>                  StartTime                   LastTime
>>>>>>>>>>>> 2013-05-21 11:59:44.064216 2013-05-21 12:34:17.996512
>>>>>>>>>>>> 2013-05-21 11:59:44.509320 2013-05-21 11:59:59.639822
>>>>>>>>>>>>  
>>>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s stime ltime -w - | rasort -m stime -s stime ltime | tail -n 2
>>>>>>>>>>>> 2013-05-21 12:59:02.980166 2013-05-21 12:59:02.981078
>>>>>>>>>>>> 2013-05-21 12:59:02.980171 2013-05-21 12:59:02.981120
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>> sort by ltime:
>>>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s ltime -w - | rasort -m ltime -s stime ltime | head -n 3
>>>>>>>>>>>>                  StartTime                   LastTime
>>>>>>>>>>>> 2013-05-21 11:59:46.214566 2013-05-21 11:59:46.314022
>>>>>>>>>>>> 2013-05-21 11:59:51.860594 2013-05-21 11:59:51.887852
>>>>>>>>>>>>  
>>>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s ltime -w - | rasort -m stime ltime -s stime ltime | tail -n 2
>>>>>>>>>>>> 2013-05-21 12:00:00.000000 2013-05-21 13:00:00.000000
>>>>>>>>>>>> 2013-05-21 12:38:12.914597 2013-05-21 13:00:00.000000
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>>  
>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>  
>>>>>>>>>>>> Matt
>>>>>>>>>>>> 
>>>>>>>>>>>> On May 21, 2013, at 10:49 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Yes you have to compile in debug support.  We should turn on symbols, etc... as well.  In the clients root directory, try this:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> % touch .devel .debug
>>>>>>>>>>>>> % ./configure
>>>>>>>>>>>>> % make clean; make
>>>>>>>>>>>>> 
>>>>>>>>>>>>> And use the resulting binaries for the testing.
>>>>>>>>>>>>> Carter
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On May 21, 2013, at 10:08 AM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Carter,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Do settings need to be changed at compile time in order to get debugging features?  I have started with -D4 and only the start time and forked time are logged.  I have confirmed that rastream is appending to the proper file.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Just to let you know *latest* is still 3.0.7.9, not 3.0.7.10.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks very much!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On May 21, 2013, at 9:55 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Try the newest client distribution, just to see if any of the changes we
>>>>>>>>>>>>>>> made to date affect the problem.  This may be a tough one to debug,
>>>>>>>>>>>>>>> but if I can recreate the issue, then we're on our way to happiness.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> We'll need you to turn on debug on your daemon, the output should
>>>>>>>>>>>>>>> go to the syslog, so testing that briefly may help a bit.  Run your rastream
>>>>>>>>>>>>>>> with the "-d -D4" options, and see if you get any debug info in your syslog.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Carter 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On May 21, 2013, at 9:48 AM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Good morning Carter!
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> I wrote an init script and had some issues when starting rastream with the script targetting ~/.rarc even if I set $ARGUSHOME or $HOME within the script (on CentOS) (due to service's use of /bin/env).
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> So, the line I use within the init script is:
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> /usr/local/bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> The same problem with timing occurs as had previously occur when relying on the full ~/.rarc.
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> Here is an etherpad with relevant information:https://etherpad.mozilla.org/dmoSdfQ9H4
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> # rastream --version
>>>>>>>>>>>>>>>> Rastream Version 3.0.7.9
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> Thanks very much Carter!
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On May 21, 2013, at 6:44 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hey Matt,
>>>>>>>>>>>>>>>>> So.......any chance you are setting the timezine in a rarc file somewhere, like /etc/rarc or in your yme directory ?
>>>>>>>>>>>>>>>>> What version are you running ?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On May 20, 2013, at 2:42 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Hello All:
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> I am having a problem with rastream that's manifested itself when using the -f "shell script executor" argument to rotate files at 'time 1h'.
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> If I run rastream as a daemon, then the script seems to run before the "hour" is over (and the "hour" is over at the incorrect time):
>>>>>>>>>>>>>>>>>> # rastream -d -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> A few hours' files look like:
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> # ls --full-time /var/opt/argus/2013-05-18
>>>>>>>>>>>>>>>>>> total 3728
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  18752 2013-05-18 01:00:59.556839459 -0400 argus_00:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 160607 2013-05-18 01:00:17.793286000 -0400 argus_00:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  12068 2013-05-18 02:00:59.619364943 -0400 argus_01:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 163409 2013-05-18 02:00:17.943700000 -0400 argus_01:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root   9032 2013-05-1803:01:00.579907536 -0400 argus_02:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 122920 2013-05-18 03:00:17.834317000 -0400 argus_02:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  22092 2013-05-18 04:01:00.698357771 -0400 argus_03:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 122002 2013-05-18 04:00:17.835675000 -0400 argus_03:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  17704 2013-05-18 05:01:00.450618851 -0400 argus_04:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 133212 2013-05-18 05:00:17.742040000 -0400 argus_04:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  14592 2013-05-18 06:00:54.886285774 -0400 argus_05:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 160523 2013-05-18 06:00:17.562776000 -0400 argus_05:00:00.gz
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> It looks like the gzipped file is last modified before the hour file, which leads me to believe that rastream isn't finished writing to the argus file before -f[] is executed.
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> If I run rastream as follows, I have no problem:
>>>>>>>>>>>>>>>>>> # nohup rastream -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh &
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> A few hours' files look like:
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> #  ls --full-time /var/opt/argus/2013-05-20
>>>>>>>>>>>>>>>>>> total 5372
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 217245 2013-05-20 10:00:17.573908000 -0400 argus_09:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root   6377 2013-05-2011:00:17.762140000 -0400 argus_10:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root   9269 2013-05-2011:38:08.879810000 -0400 argus_11:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root    313 2013-05-2013:00:17.170958000 -0400 argus_12:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root   8965 2013-05-2014:00:17.540889000 -0400 argus_13:00:00.gz
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> (early day over there)
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> I have verified that the system time is correct and ntpd is running properly.
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> Clients are 3.0.7.9.
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> Carter, do you have any ideas?
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20130604/68bf698e/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 6837 bytes
Desc: not available
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20130604/68bf698e/attachment.bin>


More information about the argus mailing list