Rewrite of rastream

Carter Bullard carter at qosient.com
Mon Jun 3 11:17:56 EDT 2013


Hey Matt,
Don't forget to give the new rastream a run.  I have had a stability issue with 
this new rastream(), on one of my systems, so please give this one a try, and
yell back if you get what you want, or not !!!!  I'll want to release argus-clients-3.0.8
soon after this is done.

Hope all is most excellent,

Carter

On May 31, 2013, at 2:07 AM, Carter Bullard <carter at qosient.com> wrote:

> Hey Matt,
> OK, so here is a complete rewrite of rastream() that should fix your problem,
> but if it doesn't holler.  Just replace the file ./examples/rastream/rastream.c
> with the included file and recompile.
> 
> There might be a minor memory leak, but I'll do that testing tomorrow.
> 
> Carter
> 
> <rastream.c>
> 
> 
> On May 30, 2013, at 4:15 PM, Carter Bullard <carter at qosient.com> wrote:
> 
>> Well, that is a new twist, and will explain all your problems, running
>> the script every second, …, that isn't correct.  rastream() doesn't do this on
>> my machine, but I'll have to pay attention to test that …...
>> 
>> Just so you will know, when we, or anyone for that matter, does the
>> UNIX daemon thing,  they must change their directory to a known
>> place in the filesystem, which traditionally is "/", so the " -d " operation
>> always has an effect on the environment, and that is about the only
>> thing that is different.
>> 
>> Carter
>> 
>> 
>> 
>> On May 30, 2013, at 3:53 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>> 
>>> Hello Carter,
>>>  
>>> Thanks for persistently looking into this issue.
>>>  
>>> I'm happy to a guinea pig for the new version, just advise where I can locate the test version.
>>>  
>>> Unforunately, I seriously just started using argus four weeks ago (after my short stint with it last year), so I can't make any serious recommendations at this time.  Although, having some functionality of ralabel would be nice (like cco)?  But maybe that's the point of client separation.
>>>  
>>>  
>>> I hate to be so stubborn :) but I don't think there's anything wrong with the environment.
>>>  
>>> The contents of the script are that of ../clients/support/Config/rastream.sh  No more, no less.
>>>  
>>> It _is_ worth noting (again) that this does only happen when I used the daemon forking mechanism (-d) to start rastream.
>>>  
>>> It sounds like you hit whatever nail on whatever head, and I look forward to testing the new version!
>>>  
>>>  
>>> So, fine!  Even though I don't believe it's the environment, I've added the following lines to rastream.sh befow the `racluster -M replace -r $FILES`:
>>>  
>>> echo $(date) " which racluster: " $(which racluster) &>> /var/log/rastreamtest.log
>>> echo $(date) " \$FILES: " $FILES &>> /var/log/rastreamtest.log
>>>  
>>>  
>>> What I see mildly astounds me:
>>>  
>>> /usr/local/bin/rastream -d -S 127.0.0.1:561 -B 15s -M time 1m -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -F /etc/rastream.conf
>>>  
>>> There is a log entries in /var/log/rastreamtest.log for every single second.
>>>  
>>> Thu May 30 15:43:21 EDT 2013  which racluster:  /usr/local/bin/racluster
>>> Thu May 30 15:43:21 EDT 2013  $FILES:  /var/opt/argus/2013-05-30/argus_15:42:00
>>> gzip: /var/opt/argus/2013-05-30/argus_15:42:00.gz already exists;       not overwritten
>>> Thu May 30 15:43:22 EDT 2013  which racluster:  /usr/local/bin/racluster
>>> Thu May 30 15:43:22 EDT 2013  $FILES:  /var/opt/argus/2013-05-30/argus_15:42:00
>>> gzip: /var/opt/argus/2013-05-30/argus_15:42:00.gz already exists;       not overwritten
>>>  
>>>  
>>>  
>>> nohup /usr/local/bin/rastream -S 127.0.0.1:561 -B 15s -M time 1m -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -F /etc/rastream.conf &
>>>  
>>> There is a log entry as expected, once a minute:
>>> Thu May 30 15:47:17 EDT 2013  which racluster:  /usr/local/bin/racluster
>>> Thu May 30 15:47:17 EDT 2013  $FILES:  /var/opt/argus/2013-05-30/argus_15:46:00
>>> Thu May 30 15:48:17 EDT 2013  which racluster:  /usr/local/bin/racluster
>>> Thu May 30 15:48:17 EDT 2013  $FILES:  /var/opt/argus/2013-05-30/argus_15:47:00
>>>  
>>>  
>>> So regardless, the -d logic and the -f logic do not like each other one bit.
>>>  
>>> Let me know what you think, and I'll await the new version to test!
>>>  
>>>  
>>> Thanks!
>>>  
>>> Matt
>>> 
>>> On May 30, 2013, at 11:33 AM, Carter Bullard <carter at qosient.com> wrote:
>>> 
>>>> Hey Matt,
>>>> OK, so I've found a pretty bad bug in rastream() and rasqlinsert(), based on
>>>> your descriptions, but I'm sorry that what I've found is not supportive of
>>>> your thesis.  This problem needs to be fixed, but existing implementations
>>>> can continue if you are not experiencing problems.
>>>> 
>>>> OK, the bug is centered around the core tech in rastream(), which was
>>>> modeled after rabins(), rather than rasplit().  rabins() has been reworked,
>>>> and its working very well, but rastream() is a little behind those changes.
>>>> What I'm seeing is flows not being split correctly at the end of a time
>>>> period, so that a days data can encroach into the next day by a few
>>>> seconds.  This doesn't work for me, at all, so I need to work this out.
>>>> 
>>>> As a result, I am doing a complete re-write to get rastream() back to
>>>> its original design, and should be done Fri/Mon.  I suspect that it will
>>>> solve any problems you are having, so are you, will you be willing to
>>>> be a guinea pig for this?
>>>> 
>>>> Since I'm rewriting it based on rasplit(), now would be the time to put
>>>> some new feature requests if there are any, and any discussions as
>>>> to what sucks and what works in rastream() would be great !!!!!
>>>> 
>>>> We still need to figure out what is wrong with your environment,
>>>> as I think that is at the root of some of what you're seeing.  You don't have
>>>> any options that are causing your rastream() to chroot(), like a
>>>> "-c" option?
>>>> 
>>>> Carter
>>>> 
>>>> 
>>>> On May 29, 2013, at 3:23 PM, Carter Bullard <carter at qosient.com> wrote:
>>>> 
>>>>> Hey Matt,
>>>>> Hmmmmmmmm, well, we need to get on the same page.  Based on the last
>>>>> set of information, I don't think that you are running the correct code, as you
>>>>> are reporting segfault information without symbols, which means you're running
>>>>> the wrong set of software.  If you are running the wrong software there, not
>>>>> sure that you are running the right software anywhere.  I can give racluster()
>>>>> bad filenames, files without permissions, etc…. and it doesn't segfault…..
>>>>> 
>>>>> So we have to clean things up, then we can hypothesis what the real problem is.
>>>>> 
>>>>> Now, I have been digging into rastream() and rasqlinsert() code and I've found
>>>>> an issue that may be related, but I need to have some certainty with regard to
>>>>> your setup, before I take the leap to say that I have a strategy…...
>>>>> 
>>>>> So, we need to understand what is going on with your rastream() script.
>>>>> One thing to do is to simplify the script, have it do some extensive logging
>>>>> to a file in /tmp, or /var/tmp or somewhere you like, so we can see what its
>>>>> doing……  Like printing out its environment, what it thinks is its root directory,
>>>>> its path, etc…...
>>>>> 
>>>>> Carter
>>>>> 
>>>>> 
>>>>> On May 29, 2013, at 1:43 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>> 
>>>>>> Hello carter,
>>>>>> 
>>>>>> Thanks for replying.
>>>>>> 
>>>>>> Rastream and racluster are 3.0.7.10.
>>>>>> 
>>>>>> 
>>>>>> Are you able to reproduce the issue when using the -d switch with rastream?   I can reproduce it on other systems.
>>>>>> 
>>>>>> What do you think of my theory?
>>>>>> 
>>>>>> 
>>>>>> Hope this helps,
>>>>>> 
>>>>>> Matt
>>>>>> 
>>>>>> 
>>>>>> On May 29, 2013, at 1:34 PM, Carter Bullard <carter at qosient.com> wrote:
>>>>>> 
>>>>>>> Hey Matt,
>>>>>>> So where are we on this?  Its clear that you are not running the version of code
>>>>>>> that you think you are running, so we need to clear that up before proceeding
>>>>>>> any further ?
>>>>>>> 
>>>>>>> Carter
>>>>>>> 
>>>>>>> 
>>>>>>> On May 22, 2013, at 4:07 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>> 
>>>>>>>> Sorry about that, I was trying to be verbose in case you spotted something interesting that indicated a clear problem.
>>>>>>>> 
>>>>>>>> Thanks for getting back to me.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 1) racluster segfaulting:
>>>>>>>> 
>>>>>>>> racluster is executed via the script assigned by the -f parameter of the rastream instance (/usr/local/bin/rastream.sh).
>>>>>>>> 
>>>>>>>> When racluster segfaults, its reporting the following to syslog:
>>>>>>>> May 22 01:00:18 ny-sentinel kernel: racluster[5558]: segfault at 46 ip 002e1877 sp bfb54e10 error 4 in libc-2.12.so[281000+190000]
>>>>>>>> May 22 02:00:18 ny-sentinel racluster[6044]: 02:00:18.586910 open '/var/opt/argus/2013-05-22/argus_01:00:00': No such file or directory
>>>>>>>> May 22 02:00:18 ny-sentinel kernel: racluster[6044]: segfault at 46 ip 0020d877 sp bf854d50 error 4 in libc-2.12.so[1ad000+190000]
>>>>>>>> May 22 03:00:18 ny-sentinel racluster[6520]: 03:00:18.534326 open '/var/opt/argus/2013-05-22/argus_02:00:00': No such file or directory
>>>>>>>> May 22 03:00:18 ny-sentinel kernel: racluster[6520]: segfault at 46 ip 0019c877 sp bfaf2b30 error 4 in libc-2.12.so[13c000+190000]
>>>>>>>> May 22 06:00:18 ny-sentinel racluster[8015]: 06:00:18.786627 open '/var/opt/argus/2013-05-22/argus_05:00:00': No such file or directory
>>>>>>>> May 22 06:00:18 ny-sentinel kernel: racluster[8015]: segfault at 46 ip 002a2877 sp bfa74270 error 4 in libc-2.12.so[242000+190000]
>>>>>>>> May 22 07:00:18 ny-sentinel racluster[8517]: 07:00:18.386497 open '/var/opt/argus/2013-05-22/argus_06:00:00': No such file or directory
>>>>>>>> May 22 07:00:18 ny-sentinel kernel: racluster[8517]: segfault at 46 ip 0073c877 sp bfaeee10 error 4 in libc-2.12.so[6dc000+190000]
>>>>>>>> May 22 08:00:18 ny-sentinel racluster[9006]: 08:00:18.591348 open '/var/opt/argus/2013-05-22/argus_07:00:00': No such file or directory
>>>>>>>> May 22 08:00:18 ny-sentinel kernel: racluster[9006]: segfault at 46 ip 0023e877 sp bfe6f110 error 4 in libc-2.12.so[1de000+190000]
>>>>>>>> May 22 10:00:19 ny-sentinel racluster[10009]: 10:00:19.991831 open '/var/opt/argus/2013-05-22/argus_09:00:00': No such file or directory
>>>>>>>> May 22 10:00:19 ny-sentinel kernel: racluster[10009]: segfault at 46 ip 00627877sp bfd4cf50 error 4 in libc-2.12.so[5c7000+190000]
>>>>>>>> May 22 13:00:20 ny-sentinel racluster[11556]: 13:00:20.585851 open '/var/opt/argus/2013-05-22/argus_12:00:00': No such file or directory
>>>>>>>> May 22 13:00:20 ny-sentinel kernel: racluster[11556]: segfault at 46 ip 002bb877 sp bfda4480 error 4 in libc-2.12.so[25b000+190000]
>>>>>>>> May 22 14:00:20 ny-sentinel racluster[12106]: 14:00:20.585760 open '/var/opt/argus/2013-05-22/argus_13:00:00': No such file or directory
>>>>>>>> May 22 14:00:20 ny-sentinel kernel: racluster[12106]: segfault at 46 ip 00c3b877 sp bfbebb40 error 4 in libc-2.12.so[bdb000+190000]
>>>>>>>> May 22 14:00:20 ny-sentinel racluster[12109]: 14:00:20.997232 open '/var/opt/argus/2013-05-22/argus_13:00:00': No such file or directory
>>>>>>>> May 22 14:00:20 ny-sentinel kernel: racluster[12109]: segfault at 46 ip 001ba877 sp bf838800 error 4 in libc-2.12.so[15a000+190000]
>>>>>>>> May 22 15:00:19 ny-sentinel racluster[12619]: 15:00:19.898971 open '/var/opt/argus/2013-05-22/argus_14:00:00': No such file or directory
>>>>>>>> May 22 15:00:19 ny-sentinel kernel: racluster[12619]: segfault at 46 ip 001d4877 sp bf865ab0 error 4 in libc-2.12.so[174000+190000]
>>>>>>>> 
>>>>>>>> Note that there is no easily identifiable pattern in which files cause the segfault and which do not, but the problem occurs with every file.
>>>>>>>> 
>>>>>>>> I can only guess that it is segfaulting because it can't open '/var/opt/argus/N/argus_N' (No such file or directory).
>>>>>>>> It's worth noting that in the instance mentioned in the previous email, '/var/opt/argus/2013-05-21/argus_10:00:00' did exist at the time racluster is executed.  I am concluding this because after racluster is executed, gzip compresses '/var/opt/argus/2013-05-21/argus_10:00:00' successfully.
>>>>>>>> 
>>>>>>>> I don't believe the segfault is the problem.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 2) Can I run racluster against the file and get good results?
>>>>>>>> 
>>>>>>>> Yes, I can run racluster against both the gzipped resultant file, and the file that is generated by rastream that wasn't gzipped.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 3) Multiple client versions:
>>>>>>>> 
>>>>>>>> When I mention modifying which racluster is executed by the shell script (executed by rastream's -f argument), I simply was attempting to have it use the racluster binary that I had compiled with support for debugging and symbols.  My reasoning was so that it would produce more verbose output to syslog, which it didn't.
>>>>>>>> 
>>>>>>>> There are only two sets of binaries located on the system.  One that has been compiled with debugging support and one that hasn't:
>>>>>>>> # find / -iname racluster
>>>>>>>> /root/argus-clients-3.0.7.10/bin/racluster #debugging support
>>>>>>>> /usr/local/bin/racluster #no debugging support
>>>>>>>> 
>>>>>>>> /usr/local/bin/rastream is executed and is writing the files I am concerned with (and is causing the problem).
>>>>>>>> /usr/local/bin/racluster is currently executed by /usr/local/bin/rastream.sh due to its location in the $PATH (and is the one that is segfaulting occasionally).
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 4) How I see the problem:
>>>>>>>> 
>>>>>>>> I don't want to make any assumptions, or create a red herring, since I am not familiar with the code of rastream, but I believe that the following is happening:
>>>>>>>> 
>>>>>>>> 1) `/usr/local/bin/rastream -d -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh` is executed, and then (due to -d) the daemonizing method is called and forks another rastream process and starts writing to /var/opt/argus/Y-m-d/argus_T1. (which is good)
>>>>>>>> 2) At `time 1h`, `/usr/local/bin/rastream.sh` is forked, without blocking `rastream` from writing. (good)
>>>>>>>> 3) While #2 occurs, rastream is _still_ writing to /var/opt/argus/Y-m-d/argus_T1 (the file from the previous `time 1h` time span, not the new `time 1h` time span). (which is bad)
>>>>>>>> 4) `/usr/local/bin/rastream.sh` executes `racluster -M replace -r $FILES` against the file /var/opt/argus/Y-m-d/argus_T1. (bad)
>>>>>>>> 5) `/usr/local/bin/rastream.sh` executes `gzip $FILES` and is able to add /var/opt/argus/Y-m-d/argus_T1 (bad)
>>>>>>>> 6) `rastream` begins writing /var/opt/argus/Y-m-d/argus_T2 (the new `time 1h` file) (bad)
>>>>>>>> 
>>>>>>>> So, the daemonized fork of rastream doesn't start writing to the new file (/var/opt/argus/Y-m-d/argus_T2) until _after_ it executes the -f designated script `/usr/local/bin/rastream.sh`.
>>>>>>>> Instead, what should be happening is: the daemonized fork of rastream should start writing to the new file, and _then_ execute the -f designated script `/usr/local/bin/rastream.sh`.
>>>>>>>> 
>>>>>>>> It would seem that the way that rastream is daemonized affects the later -f logic.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Hope this helps.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks for your time,
>>>>>>>> 
>>>>>>>> Matt
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On May 22, 2013, at 10:45 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>> 
>>>>>>>>> Hey Matt,
>>>>>>>>> Hmmm, so your emails are waaayyyyy too dense for dealing with bugs / problems.
>>>>>>>>> Lets focus on one thing at a time.
>>>>>>>>> 
>>>>>>>>> Your racluster() is segfaulting.  Why is it doing that?  Can you run racluster()
>>>>>>>>> against the file and get good results ?  Do you have multiple versions of the
>>>>>>>>> clients installed on your machine, so that there is ambiguity in which racluster()
>>>>>>>>> or rastream() is actually running?  We should clear that up before proceeding
>>>>>>>>> any further.
>>>>>>>>> 
>>>>>>>>> Carter
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On May 21, 2013, at 1:31 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Hello Carter,
>>>>>>>>>>  
>>>>>>>>>> Thanks for getting back to me quickly.
>>>>>>>>>>  
>>>>>>>>>> I've tried the following to attempt to troubleshoot the problem, but it doesn't appear that anything's outstanding.
>>>>>>>>>>  
>>>>>>>>>> Apologies for the verbosity…
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> # cd ./argus-clients-3.0.7.10
>>>>>>>>>> # touch .devel .debug
>>>>>>>>>> # ./configure
>>>>>>>>>> # make clean && make
>>>>>>>>>> # kill -2 $(pgrep rastream)
>>>>>>>>>> # ./bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -D4
>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455541 ArgusNewQueue () returning 0x8609548
>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455612 ArgusNewHashTable (65536) returning 0x8609bc0
>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455632 ArgusNewList () returning 0x860a010
>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455695 ArgusNewQueue () returning 0x8611878
>>>>>>>>>> rastream[29500]: 2013-05-21 10:54:31.455729 started
>>>>>>>>>>  
>>>>>>>>>> syslog reports:
>>>>>>>>>> May 21 10:54:31 ny-sentinel rastream[29500]: 2013-05-21 10:54:31.455729 started
>>>>>>>>>> May 21 11:00:18 ny-sentinel racluster[29544]: 2013-05-21 11:00:18.635754 open '/var/opt/argus/2013-05-21/argus_10:00:00': No such file or directory
>>>>>>>>>> May 21 11:00:18 ny-sentinel kernel: racluster[29544]: segfault at 46 ip 001d9877 sp bfb9f200 error 4 in libc-2.12.so[179000+190000]
>>>>>>>>>>  
>>>>>>>>>> With /usr/local/bin/rastream.sh executing `racluster -M replace -r $FILES`, using the racluster in $PATH (as it is originally).
>>>>>>>>>>  
>>>>>>>>>> Note that '/var/opt/argus/2013-05-21/argus_10:00:00' did exist at the time [but that doesn't mean that racluster can read it ?].
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> This is interesting:
>>>>>>>>>> I modifed /usr/local/bin/rastream.sh to execute `[path]/argus-clients-3.0.7.10/bin/racluster -D4 -M replace -r $FILES` instead of just using the $PATHed racluster…
>>>>>>>>>>  
>>>>>>>>>> # kill -2 $(pgrep rastream)
>>>>>>>>>> # ./argus-clients-3.0.7.10/bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 10m -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -D4
>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.528909 ArgusNewQueue () returning 0x9e77550
>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.528982 ArgusNewHashTable (65536) returning 0x9e77bc8
>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.529008 ArgusNewList () returning 0x9e78018
>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.529072 ArgusNewQueue () returning 0x9e7f880
>>>>>>>>>> rastream[30131]: 2013-05-21 11:26:53.529102 started
>>>>>>>>>>  
>>>>>>>>>> syslog reports:
>>>>>>>>>> May 21 11:26:53 ny-sentinel rastream[30131]: 2013-05-21 11:26:53.529102 started
>>>>>>>>>>  
>>>>>>>>>> …and nothing from racluster or anything else at the 10 minute mark, even though the rotation (and problem) still occur:
>>>>>>>>>>  
>>>>>>>>>> # ll --full-time /var/opt/argus/2013-05-21/ | tail -n 5
>>>>>>>>>> -rw-r--r--. 1 root  159764 2013-05-21 11:30:59.466174549 -0400 argus_11:20:00
>>>>>>>>>> -rw-r--r--. 1 root  384117 2013-05-21 11:30:17.619561000 -0400 argus_11:20:00.gz
>>>>>>>>>> -rw-r--r--. 1 root  181472 2013-05-21 11:41:00.721865476 -0400 argus_11:30:00
>>>>>>>>>> -rw-r--r--. 1 root  394783 2013-05-21 11:40:17.511247000 -0400 argus_11:30:00.gz
>>>>>>>>>> -rw-r--r--. 1 root 1421440 2013-05-21 11:48:55.328587722 -0400 argus_11:40:00
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> Reverting rastream.sh to execute racluster in the $PATH (as original) and changing back to 'time 1h' as:
>>>>>>>>>> # kill -2 $(pgrep rastream)
>>>>>>>>>> # ./argus-clients-3.0.7.10/bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -D4
>>>>>>>>>>  
>>>>>>>>>> results in syslog:
>>>>>>>>>> May 21 12:05:37 ny-sentinel rastream[30905]: 2013-05-21 12:05:37.950988 started
>>>>>>>>>>  
>>>>>>>>>> …and nothing from racluster (why and how?)), but the problem still exists:
>>>>>>>>>>  
>>>>>>>>>> # ll --full-time /var/opt/argus/2013-05-21/ | tail -n 3
>>>>>>>>>> -rw-r--r--. 1 root  145612 2013-05-21 13:01:00.562634873 -0400 argus_12:00:00
>>>>>>>>>> -rw-r--r--. 1 root 2303564 2013-05-21 13:00:18.216161000 -0400 argus_12:00:00.gz
>>>>>>>>>> -rw-r--r--. 1 root  315520 2013-05-21 13:02:35.126692699 -0400 argus_13:00:00
>>>>>>>>>>  
>>>>>>>>>> It's quite obvious that the size of the argus data has increased a very large amount (due to the debugging).
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> Does anything here indicate the problem?
>>>>>>>>>>  
>>>>>>>>>> Is there anything else that I can do to further troubleshoot the issue?
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> Here's some freaky stuff:
>>>>>>>>>>  
>>>>>>>>>> querying the file argus_12:00:00 for ltime doesn't correlate with the last modified time of the file:
>>>>>>>>>>  
>>>>>>>>>> # ra -nr argus_12\:00\:00 -s stime ltime -w - | rasort -m stime -s stime ltime | head -n 3
>>>>>>>>>>                  StartTime                   LastTime
>>>>>>>>>> 2013-05-21 12:59:03.164195 2013-05-21 12:59:06.417385
>>>>>>>>>> 2013-05-21 12:59:03.369926 2013-05-21 12:59:03.501171
>>>>>>>>>>  
>>>>>>>>>> # ra -nr argus_12\:00\:00 -s stime ltime -w - | rasort -m ltime -s stime ltime | tail -n 2
>>>>>>>>>> 2013-05-21 12:59:17.117235 2013-05-21 13:00:00.000000
>>>>>>>>>> 2013-05-21 12:59:28.896645 2013-05-21 13:00:00.000000
>>>>>>>>>>  
>>>>>>>>>> # ra -nr argus_12\:00\:00 -s stime ltime | head -n 3
>>>>>>>>>>                  StartTime                   LastTime
>>>>>>>>>> 2013-05-21 12:05:38.036319 2013-05-21 09:52:57.532792 # this is weird.
>>>>>>>>>> 2013-05-21 12:59:27.970584 2013-05-21 12:59:36.325653
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> querying the file argus_12:00:00.gz:
>>>>>>>>>>  
>>>>>>>>>> sort by stime:
>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s stime ltime -w - | rasort -m stime -s stime ltime | head -n 3
>>>>>>>>>>                  StartTime                   LastTime
>>>>>>>>>> 2013-05-21 11:59:44.064216 2013-05-21 12:34:17.996512
>>>>>>>>>> 2013-05-21 11:59:44.509320 2013-05-21 11:59:59.639822
>>>>>>>>>>  
>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s stime ltime -w - | rasort -m stime -s stime ltime | tail -n 2
>>>>>>>>>> 2013-05-21 12:59:02.980166 2013-05-21 12:59:02.981078
>>>>>>>>>> 2013-05-21 12:59:02.980171 2013-05-21 12:59:02.981120
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> sort by ltime:
>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s ltime -w - | rasort -m ltime -s stime ltime | head -n 3
>>>>>>>>>>                  StartTime                   LastTime
>>>>>>>>>> 2013-05-21 11:59:46.214566 2013-05-21 11:59:46.314022
>>>>>>>>>> 2013-05-21 11:59:51.860594 2013-05-21 11:59:51.887852
>>>>>>>>>>  
>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s ltime -w - | rasort -m stime ltime -s stime ltime | tail -n 2
>>>>>>>>>> 2013-05-21 12:00:00.000000 2013-05-21 13:00:00.000000
>>>>>>>>>> 2013-05-21 12:38:12.914597 2013-05-21 13:00:00.000000
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>>  
>>>>>>>>>> Thanks again,
>>>>>>>>>>  
>>>>>>>>>> Matt
>>>>>>>>>> 
>>>>>>>>>> On May 21, 2013, at 10:49 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Yes you have to compile in debug support.  We should turn on symbols, etc... as well.  In the clients root directory, try this:
>>>>>>>>>>> 
>>>>>>>>>>> % touch .devel .debug
>>>>>>>>>>> % ./configure
>>>>>>>>>>> % make clean; make
>>>>>>>>>>> 
>>>>>>>>>>> And use the resulting binaries for the testing.
>>>>>>>>>>> Carter
>>>>>>>>>>> 
>>>>>>>>>>> On May 21, 2013, at 10:08 AM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Carter,
>>>>>>>>>>>> 
>>>>>>>>>>>> Do settings need to be changed at compile time in order to get debugging features?  I have started with -D4 and only the start time and forked time are logged.  I have confirmed that rastream is appending to the proper file.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Just to let you know *latest* is still 3.0.7.9, not 3.0.7.10.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks very much!
>>>>>>>>>>>> 
>>>>>>>>>>>> Matt
>>>>>>>>>>>> 
>>>>>>>>>>>> On May 21, 2013, at 9:55 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Try the newest client distribution, just to see if any of the changes we
>>>>>>>>>>>>> made to date affect the problem.  This may be a tough one to debug,
>>>>>>>>>>>>> but if I can recreate the issue, then we're on our way to happiness.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> We'll need you to turn on debug on your daemon, the output should
>>>>>>>>>>>>> go to the syslog, so testing that briefly may help a bit.  Run your rastream
>>>>>>>>>>>>> with the "-d -D4" options, and see if you get any debug info in your syslog.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Carter 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On May 21, 2013, at 9:48 AM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Good morning Carter!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> I wrote an init script and had some issues when starting rastream with the script targetting ~/.rarc even if I set $ARGUSHOME or $HOME within the script (on CentOS) (due to service's use of /bin/env).
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> So, the line I use within the init script is:
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> /usr/local/bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> The same problem with timing occurs as had previously occur when relying on the full ~/.rarc.
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> Here is an etherpad with relevant information:https://etherpad.mozilla.org/dmoSdfQ9H4
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> # rastream --version
>>>>>>>>>>>>>> Rastream Version 3.0.7.9
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> Thanks very much Carter!
>>>>>>>>>>>>>>  
>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On May 21, 2013, at 6:44 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hey Matt,
>>>>>>>>>>>>>>> So.......any chance you are setting the timezine in a rarc file somewhere, like /etc/rarc or in your yme directory ?
>>>>>>>>>>>>>>> What version are you running ?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On May 20, 2013, at 2:42 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hello All:
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> I am having a problem with rastream that's manifested itself when using the -f "shell script executor" argument to rotate files at 'time 1h'.
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> If I run rastream as a daemon, then the script seems to run before the "hour" is over (and the "hour" is over at the incorrect time):
>>>>>>>>>>>>>>>> # rastream -d -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> A few hours' files look like:
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> # ls --full-time /var/opt/argus/2013-05-18
>>>>>>>>>>>>>>>> total 3728
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  18752 2013-05-18 01:00:59.556839459 -0400 argus_00:00:00
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 160607 2013-05-18 01:00:17.793286000 -0400 argus_00:00:00.gz
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  12068 2013-05-18 02:00:59.619364943 -0400 argus_01:00:00
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 163409 2013-05-18 02:00:17.943700000 -0400 argus_01:00:00.gz
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root   9032 2013-05-1803:01:00.579907536 -0400 argus_02:00:00
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 122920 2013-05-18 03:00:17.834317000 -0400 argus_02:00:00.gz
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  22092 2013-05-18 04:01:00.698357771 -0400 argus_03:00:00
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 122002 2013-05-18 04:00:17.835675000 -0400 argus_03:00:00.gz
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  17704 2013-05-18 05:01:00.450618851 -0400 argus_04:00:00
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 133212 2013-05-18 05:00:17.742040000 -0400 argus_04:00:00.gz
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root  14592 2013-05-18 06:00:54.886285774 -0400 argus_05:00:00
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 160523 2013-05-18 06:00:17.562776000 -0400 argus_05:00:00.gz
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> It looks like the gzipped file is last modified before the hour file, which leads me to believe that rastream isn't finished writing to the argus file before -f[] is executed.
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> If I run rastream as follows, I have no problem:
>>>>>>>>>>>>>>>> # nohup rastream -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh &
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> A few hours' files look like:
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> #  ls --full-time /var/opt/argus/2013-05-20
>>>>>>>>>>>>>>>> total 5372
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 217245 2013-05-20 10:00:17.573908000 -0400 argus_09:00:00.gz
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root   6377 2013-05-2011:00:17.762140000 -0400 argus_10:00:00.gz
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root   9269 2013-05-2011:38:08.879810000 -0400 argus_11:00:00.gz
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root    313 2013-05-2013:00:17.170958000 -0400 argus_12:00:00.gz
>>>>>>>>>>>>>>>> -rw-r--r--. 1 root   8965 2013-05-2014:00:17.540889000 -0400 argus_13:00:00.gz
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> (early day over there)
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> I have verified that the system time is correct and ntpd is running properly.
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> Clients are 3.0.7.9.
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> Carter, do you have any ideas?
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20130603/fff11891/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 6837 bytes
Desc: not available
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20130603/fff11891/attachment.bin>


More information about the argus mailing list