Rewrite of rastream
Carter Bullard
carter at qosient.com
Tue Jun 4 13:43:23 EDT 2013
Sounds like some progress. That's good.
Not sure that I know what you mean by "the file with excess data is still generate".
What do you expect to happen, and what is it doing?
Carter
On Jun 4, 2013, at 1:34 PM, Matt Brown <matthewbrown at gmail.com> wrote:
> Carter,
>
> -d no longer causes the script to execute every second. However the file with excess data is still generate when the script is executed. racluster has not yet logged a segfault in syslog.
>
>
> Thanks,
>
> Matt
>
> On Jun 3, 2013, at 11:17 AM, Carter Bullard <carter at qosient.com> wrote:
>
>> Hey Matt,
>> Don't forget to give the new rastream a run. I have had a stability issue with
>> this new rastream(), on one of my systems, so please give this one a try, and
>> yell back if you get what you want, or not !!!! I'll want to release argus-clients-3.0.8
>> soon after this is done.
>>
>> Hope all is most excellent,
>>
>> Carter
>>
>> On May 31, 2013, at 2:07 AM, Carter Bullard <carter at qosient.com> wrote:
>>
>>> Hey Matt,
>>> OK, so here is a complete rewrite of rastream() that should fix your problem,
>>> but if it doesn't holler. Just replace the file ./examples/rastream/rastream.c
>>> with the included file and recompile.
>>>
>>> There might be a minor memory leak, but I'll do that testing tomorrow.
>>>
>>> Carter
>>>
>>> <rastream.c>
>>>
>>>
>>> On May 30, 2013, at 4:15 PM, Carter Bullard <carter at qosient.com> wrote:
>>>
>>>> Well, that is a new twist, and will explain all your problems, running
>>>> the script every second, …, that isn't correct. rastream() doesn't do this on
>>>> my machine, but I'll have to pay attention to test that …...
>>>>
>>>> Just so you will know, when we, or anyone for that matter, does the
>>>> UNIX daemon thing, they must change their directory to a known
>>>> place in the filesystem, which traditionally is "/", so the " -d " operation
>>>> always has an effect on the environment, and that is about the only
>>>> thing that is different.
>>>>
>>>> Carter
>>>>
>>>>
>>>>
>>>> On May 30, 2013, at 3:53 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>
>>>>> Hello Carter,
>>>>>
>>>>> Thanks for persistently looking into this issue.
>>>>>
>>>>> I'm happy to a guinea pig for the new version, just advise where I can locate the test version.
>>>>>
>>>>> Unforunately, I seriously just started using argus four weeks ago (after my short stint with it last year), so I can't make any serious recommendations at this time. Although, having some functionality of ralabel would be nice (like cco)? But maybe that's the point of client separation.
>>>>>
>>>>>
>>>>> I hate to be so stubborn :) but I don't think there's anything wrong with the environment.
>>>>>
>>>>> The contents of the script are that of ../clients/support/Config/rastream.sh No more, no less.
>>>>>
>>>>> It _is_ worth noting (again) that this does only happen when I used the daemon forking mechanism (-d) to start rastream.
>>>>>
>>>>> It sounds like you hit whatever nail on whatever head, and I look forward to testing the new version!
>>>>>
>>>>>
>>>>> So, fine! Even though I don't believe it's the environment, I've added the following lines to rastream.sh befow the `racluster -M replace -r $FILES`:
>>>>>
>>>>> echo $(date) " which racluster: " $(which racluster) &>> /var/log/rastreamtest.log
>>>>> echo $(date) " \$FILES: " $FILES &>> /var/log/rastreamtest.log
>>>>>
>>>>>
>>>>> What I see mildly astounds me:
>>>>>
>>>>> /usr/local/bin/rastream -d -S 127.0.0.1:561 -B 15s -M time 1m -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -F /etc/rastream.conf
>>>>>
>>>>> There is a log entries in /var/log/rastreamtest.log for every single second.
>>>>>
>>>>> Thu May 30 15:43:21 EDT 2013 which racluster: /usr/local/bin/racluster
>>>>> Thu May 30 15:43:21 EDT 2013 $FILES: /var/opt/argus/2013-05-30/argus_15:42:00
>>>>> gzip: /var/opt/argus/2013-05-30/argus_15:42:00.gz already exists; not overwritten
>>>>> Thu May 30 15:43:22 EDT 2013 which racluster: /usr/local/bin/racluster
>>>>> Thu May 30 15:43:22 EDT 2013 $FILES: /var/opt/argus/2013-05-30/argus_15:42:00
>>>>> gzip: /var/opt/argus/2013-05-30/argus_15:42:00.gz already exists; not overwritten
>>>>>
>>>>>
>>>>>
>>>>> nohup /usr/local/bin/rastream -S 127.0.0.1:561 -B 15s -M time 1m -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -F /etc/rastream.conf &
>>>>>
>>>>> There is a log entry as expected, once a minute:
>>>>> Thu May 30 15:47:17 EDT 2013 which racluster: /usr/local/bin/racluster
>>>>> Thu May 30 15:47:17 EDT 2013 $FILES: /var/opt/argus/2013-05-30/argus_15:46:00
>>>>> Thu May 30 15:48:17 EDT 2013 which racluster: /usr/local/bin/racluster
>>>>> Thu May 30 15:48:17 EDT 2013 $FILES: /var/opt/argus/2013-05-30/argus_15:47:00
>>>>>
>>>>>
>>>>> So regardless, the -d logic and the -f logic do not like each other one bit.
>>>>>
>>>>> Let me know what you think, and I'll await the new version to test!
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Matt
>>>>>
>>>>> On May 30, 2013, at 11:33 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>
>>>>>> Hey Matt,
>>>>>> OK, so I've found a pretty bad bug in rastream() and rasqlinsert(), based on
>>>>>> your descriptions, but I'm sorry that what I've found is not supportive of
>>>>>> your thesis. This problem needs to be fixed, but existing implementations
>>>>>> can continue if you are not experiencing problems.
>>>>>>
>>>>>> OK, the bug is centered around the core tech in rastream(), which was
>>>>>> modeled after rabins(), rather than rasplit(). rabins() has been reworked,
>>>>>> and its working very well, but rastream() is a little behind those changes.
>>>>>> What I'm seeing is flows not being split correctly at the end of a time
>>>>>> period, so that a days data can encroach into the next day by a few
>>>>>> seconds. This doesn't work for me, at all, so I need to work this out.
>>>>>>
>>>>>> As a result, I am doing a complete re-write to get rastream() back to
>>>>>> its original design, and should be done Fri/Mon. I suspect that it will
>>>>>> solve any problems you are having, so are you, will you be willing to
>>>>>> be a guinea pig for this?
>>>>>>
>>>>>> Since I'm rewriting it based on rasplit(), now would be the time to put
>>>>>> some new feature requests if there are any, and any discussions as
>>>>>> to what sucks and what works in rastream() would be great !!!!!
>>>>>>
>>>>>> We still need to figure out what is wrong with your environment,
>>>>>> as I think that is at the root of some of what you're seeing. You don't have
>>>>>> any options that are causing your rastream() to chroot(), like a
>>>>>> "-c" option?
>>>>>>
>>>>>> Carter
>>>>>>
>>>>>>
>>>>>> On May 29, 2013, at 3:23 PM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>
>>>>>>> Hey Matt,
>>>>>>> Hmmmmmmmm, well, we need to get on the same page. Based on the last
>>>>>>> set of information, I don't think that you are running the correct code, as you
>>>>>>> are reporting segfault information without symbols, which means you're running
>>>>>>> the wrong set of software. If you are running the wrong software there, not
>>>>>>> sure that you are running the right software anywhere. I can give racluster()
>>>>>>> bad filenames, files without permissions, etc…. and it doesn't segfault…..
>>>>>>>
>>>>>>> So we have to clean things up, then we can hypothesis what the real problem is.
>>>>>>>
>>>>>>> Now, I have been digging into rastream() and rasqlinsert() code and I've found
>>>>>>> an issue that may be related, but I need to have some certainty with regard to
>>>>>>> your setup, before I take the leap to say that I have a strategy…...
>>>>>>>
>>>>>>> So, we need to understand what is going on with your rastream() script.
>>>>>>> One thing to do is to simplify the script, have it do some extensive logging
>>>>>>> to a file in /tmp, or /var/tmp or somewhere you like, so we can see what its
>>>>>>> doing…… Like printing out its environment, what it thinks is its root directory,
>>>>>>> its path, etc…...
>>>>>>>
>>>>>>> Carter
>>>>>>>
>>>>>>>
>>>>>>> On May 29, 2013, at 1:43 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello carter,
>>>>>>>>
>>>>>>>> Thanks for replying.
>>>>>>>>
>>>>>>>> Rastream and racluster are 3.0.7.10.
>>>>>>>>
>>>>>>>>
>>>>>>>> Are you able to reproduce the issue when using the -d switch with rastream? I can reproduce it on other systems.
>>>>>>>>
>>>>>>>> What do you think of my theory?
>>>>>>>>
>>>>>>>>
>>>>>>>> Hope this helps,
>>>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>>
>>>>>>>> On May 29, 2013, at 1:34 PM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>
>>>>>>>>> Hey Matt,
>>>>>>>>> So where are we on this? Its clear that you are not running the version of code
>>>>>>>>> that you think you are running, so we need to clear that up before proceeding
>>>>>>>>> any further ?
>>>>>>>>>
>>>>>>>>> Carter
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On May 22, 2013, at 4:07 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Sorry about that, I was trying to be verbose in case you spotted something interesting that indicated a clear problem.
>>>>>>>>>>
>>>>>>>>>> Thanks for getting back to me.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 1) racluster segfaulting:
>>>>>>>>>>
>>>>>>>>>> racluster is executed via the script assigned by the -f parameter of the rastream instance (/usr/local/bin/rastream.sh).
>>>>>>>>>>
>>>>>>>>>> When racluster segfaults, its reporting the following to syslog:
>>>>>>>>>> May 22 01:00:18 ny-sentinel kernel: racluster[5558]: segfault at 46 ip 002e1877 sp bfb54e10 error 4 in libc-2.12.so[281000+190000]
>>>>>>>>>> May 22 02:00:18 ny-sentinel racluster[6044]: 02:00:18.586910 open '/var/opt/argus/2013-05-22/argus_01:00:00': No such file or directory
>>>>>>>>>> May 22 02:00:18 ny-sentinel kernel: racluster[6044]: segfault at 46 ip 0020d877 sp bf854d50 error 4 in libc-2.12.so[1ad000+190000]
>>>>>>>>>> May 22 03:00:18 ny-sentinel racluster[6520]: 03:00:18.534326 open '/var/opt/argus/2013-05-22/argus_02:00:00': No such file or directory
>>>>>>>>>> May 22 03:00:18 ny-sentinel kernel: racluster[6520]: segfault at 46 ip 0019c877 sp bfaf2b30 error 4 in libc-2.12.so[13c000+190000]
>>>>>>>>>> May 22 06:00:18 ny-sentinel racluster[8015]: 06:00:18.786627 open '/var/opt/argus/2013-05-22/argus_05:00:00': No such file or directory
>>>>>>>>>> May 22 06:00:18 ny-sentinel kernel: racluster[8015]: segfault at 46 ip 002a2877 sp bfa74270 error 4 in libc-2.12.so[242000+190000]
>>>>>>>>>> May 22 07:00:18 ny-sentinel racluster[8517]: 07:00:18.386497 open '/var/opt/argus/2013-05-22/argus_06:00:00': No such file or directory
>>>>>>>>>> May 22 07:00:18 ny-sentinel kernel: racluster[8517]: segfault at 46 ip 0073c877 sp bfaeee10 error 4 in libc-2.12.so[6dc000+190000]
>>>>>>>>>> May 22 08:00:18 ny-sentinel racluster[9006]: 08:00:18.591348 open '/var/opt/argus/2013-05-22/argus_07:00:00': No such file or directory
>>>>>>>>>> May 22 08:00:18 ny-sentinel kernel: racluster[9006]: segfault at 46 ip 0023e877 sp bfe6f110 error 4 in libc-2.12.so[1de000+190000]
>>>>>>>>>> May 22 10:00:19 ny-sentinel racluster[10009]: 10:00:19.991831 open '/var/opt/argus/2013-05-22/argus_09:00:00': No such file or directory
>>>>>>>>>> May 22 10:00:19 ny-sentinel kernel: racluster[10009]: segfault at 46 ip 00627877sp bfd4cf50 error 4 in libc-2.12.so[5c7000+190000]
>>>>>>>>>> May 22 13:00:20 ny-sentinel racluster[11556]: 13:00:20.585851 open '/var/opt/argus/2013-05-22/argus_12:00:00': No such file or directory
>>>>>>>>>> May 22 13:00:20 ny-sentinel kernel: racluster[11556]: segfault at 46 ip 002bb877 sp bfda4480 error 4 in libc-2.12.so[25b000+190000]
>>>>>>>>>> May 22 14:00:20 ny-sentinel racluster[12106]: 14:00:20.585760 open '/var/opt/argus/2013-05-22/argus_13:00:00': No such file or directory
>>>>>>>>>> May 22 14:00:20 ny-sentinel kernel: racluster[12106]: segfault at 46 ip 00c3b877 sp bfbebb40 error 4 in libc-2.12.so[bdb000+190000]
>>>>>>>>>> May 22 14:00:20 ny-sentinel racluster[12109]: 14:00:20.997232 open '/var/opt/argus/2013-05-22/argus_13:00:00': No such file or directory
>>>>>>>>>> May 22 14:00:20 ny-sentinel kernel: racluster[12109]: segfault at 46 ip 001ba877 sp bf838800 error 4 in libc-2.12.so[15a000+190000]
>>>>>>>>>> May 22 15:00:19 ny-sentinel racluster[12619]: 15:00:19.898971 open '/var/opt/argus/2013-05-22/argus_14:00:00': No such file or directory
>>>>>>>>>> May 22 15:00:19 ny-sentinel kernel: racluster[12619]: segfault at 46 ip 001d4877 sp bf865ab0 error 4 in libc-2.12.so[174000+190000]
>>>>>>>>>>
>>>>>>>>>> Note that there is no easily identifiable pattern in which files cause the segfault and which do not, but the problem occurs with every file.
>>>>>>>>>>
>>>>>>>>>> I can only guess that it is segfaulting because it can't open '/var/opt/argus/N/argus_N' (No such file or directory).
>>>>>>>>>> It's worth noting that in the instance mentioned in the previous email, '/var/opt/argus/2013-05-21/argus_10:00:00' did exist at the time racluster is executed. I am concluding this because after racluster is executed, gzip compresses '/var/opt/argus/2013-05-21/argus_10:00:00' successfully.
>>>>>>>>>>
>>>>>>>>>> I don't believe the segfault is the problem.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2) Can I run racluster against the file and get good results?
>>>>>>>>>>
>>>>>>>>>> Yes, I can run racluster against both the gzipped resultant file, and the file that is generated by rastream that wasn't gzipped.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 3) Multiple client versions:
>>>>>>>>>>
>>>>>>>>>> When I mention modifying which racluster is executed by the shell script (executed by rastream's -f argument), I simply was attempting to have it use the racluster binary that I had compiled with support for debugging and symbols. My reasoning was so that it would produce more verbose output to syslog, which it didn't.
>>>>>>>>>>
>>>>>>>>>> There are only two sets of binaries located on the system. One that has been compiled with debugging support and one that hasn't:
>>>>>>>>>> # find / -iname racluster
>>>>>>>>>> /root/argus-clients-3.0.7.10/bin/racluster #debugging support
>>>>>>>>>> /usr/local/bin/racluster #no debugging support
>>>>>>>>>>
>>>>>>>>>> /usr/local/bin/rastream is executed and is writing the files I am concerned with (and is causing the problem).
>>>>>>>>>> /usr/local/bin/racluster is currently executed by /usr/local/bin/rastream.sh due to its location in the $PATH (and is the one that is segfaulting occasionally).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 4) How I see the problem:
>>>>>>>>>>
>>>>>>>>>> I don't want to make any assumptions, or create a red herring, since I am not familiar with the code of rastream, but I believe that the following is happening:
>>>>>>>>>>
>>>>>>>>>> 1) `/usr/local/bin/rastream -d -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh` is executed, and then (due to -d) the daemonizing method is called and forks another rastream process and starts writing to /var/opt/argus/Y-m-d/argus_T1. (which is good)
>>>>>>>>>> 2) At `time 1h`, `/usr/local/bin/rastream.sh` is forked, without blocking `rastream` from writing. (good)
>>>>>>>>>> 3) While #2 occurs, rastream is _still_ writing to /var/opt/argus/Y-m-d/argus_T1 (the file from the previous `time 1h` time span, not the new `time 1h` time span). (which is bad)
>>>>>>>>>> 4) `/usr/local/bin/rastream.sh` executes `racluster -M replace -r $FILES` against the file /var/opt/argus/Y-m-d/argus_T1. (bad)
>>>>>>>>>> 5) `/usr/local/bin/rastream.sh` executes `gzip $FILES` and is able to add /var/opt/argus/Y-m-d/argus_T1 (bad)
>>>>>>>>>> 6) `rastream` begins writing /var/opt/argus/Y-m-d/argus_T2 (the new `time 1h` file) (bad)
>>>>>>>>>>
>>>>>>>>>> So, the daemonized fork of rastream doesn't start writing to the new file (/var/opt/argus/Y-m-d/argus_T2) until _after_ it executes the -f designated script `/usr/local/bin/rastream.sh`.
>>>>>>>>>> Instead, what should be happening is: the daemonized fork of rastream should start writing to the new file, and _then_ execute the -f designated script `/usr/local/bin/rastream.sh`.
>>>>>>>>>>
>>>>>>>>>> It would seem that the way that rastream is daemonized affects the later -f logic.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hope this helps.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks for your time,
>>>>>>>>>>
>>>>>>>>>> Matt
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On May 22, 2013, at 10:45 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hey Matt,
>>>>>>>>>>> Hmmm, so your emails are waaayyyyy too dense for dealing with bugs / problems.
>>>>>>>>>>> Lets focus on one thing at a time.
>>>>>>>>>>>
>>>>>>>>>>> Your racluster() is segfaulting. Why is it doing that? Can you run racluster()
>>>>>>>>>>> against the file and get good results ? Do you have multiple versions of the
>>>>>>>>>>> clients installed on your machine, so that there is ambiguity in which racluster()
>>>>>>>>>>> or rastream() is actually running? We should clear that up before proceeding
>>>>>>>>>>> any further.
>>>>>>>>>>>
>>>>>>>>>>> Carter
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On May 21, 2013, at 1:31 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello Carter,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for getting back to me quickly.
>>>>>>>>>>>>
>>>>>>>>>>>> I've tried the following to attempt to troubleshoot the problem, but it doesn't appear that anything's outstanding.
>>>>>>>>>>>>
>>>>>>>>>>>> Apologies for the verbosity…
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> # cd ./argus-clients-3.0.7.10
>>>>>>>>>>>> # touch .devel .debug
>>>>>>>>>>>> # ./configure
>>>>>>>>>>>> # make clean && make
>>>>>>>>>>>> # kill -2 $(pgrep rastream)
>>>>>>>>>>>> # ./bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -D4
>>>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455541 ArgusNewQueue () returning 0x8609548
>>>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455612 ArgusNewHashTable (65536) returning 0x8609bc0
>>>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455632 ArgusNewList () returning 0x860a010
>>>>>>>>>>>> rastream[29500.c0567db7]: 2013-05-21 10:54:31.455695 ArgusNewQueue () returning 0x8611878
>>>>>>>>>>>> rastream[29500]: 2013-05-21 10:54:31.455729 started
>>>>>>>>>>>>
>>>>>>>>>>>> syslog reports:
>>>>>>>>>>>> May 21 10:54:31 ny-sentinel rastream[29500]: 2013-05-21 10:54:31.455729 started
>>>>>>>>>>>> May 21 11:00:18 ny-sentinel racluster[29544]: 2013-05-21 11:00:18.635754 open '/var/opt/argus/2013-05-21/argus_10:00:00': No such file or directory
>>>>>>>>>>>> May 21 11:00:18 ny-sentinel kernel: racluster[29544]: segfault at 46 ip 001d9877 sp bfb9f200 error 4 in libc-2.12.so[179000+190000]
>>>>>>>>>>>>
>>>>>>>>>>>> With /usr/local/bin/rastream.sh executing `racluster -M replace -r $FILES`, using the racluster in $PATH (as it is originally).
>>>>>>>>>>>>
>>>>>>>>>>>> Note that '/var/opt/argus/2013-05-21/argus_10:00:00' did exist at the time [but that doesn't mean that racluster can read it ?].
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This is interesting:
>>>>>>>>>>>> I modifed /usr/local/bin/rastream.sh to execute `[path]/argus-clients-3.0.7.10/bin/racluster -D4 -M replace -r $FILES` instead of just using the $PATHed racluster…
>>>>>>>>>>>>
>>>>>>>>>>>> # kill -2 $(pgrep rastream)
>>>>>>>>>>>> # ./argus-clients-3.0.7.10/bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 10m -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -D4
>>>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.528909 ArgusNewQueue () returning 0x9e77550
>>>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.528982 ArgusNewHashTable (65536) returning 0x9e77bc8
>>>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.529008 ArgusNewList () returning 0x9e78018
>>>>>>>>>>>> rastream[30131.c0f67cb7]: 2013-05-21 11:26:53.529072 ArgusNewQueue () returning 0x9e7f880
>>>>>>>>>>>> rastream[30131]: 2013-05-21 11:26:53.529102 started
>>>>>>>>>>>>
>>>>>>>>>>>> syslog reports:
>>>>>>>>>>>> May 21 11:26:53 ny-sentinel rastream[30131]: 2013-05-21 11:26:53.529102 started
>>>>>>>>>>>>
>>>>>>>>>>>> …and nothing from racluster or anything else at the 10 minute mark, even though the rotation (and problem) still occur:
>>>>>>>>>>>>
>>>>>>>>>>>> # ll --full-time /var/opt/argus/2013-05-21/ | tail -n 5
>>>>>>>>>>>> -rw-r--r--. 1 root 159764 2013-05-21 11:30:59.466174549 -0400 argus_11:20:00
>>>>>>>>>>>> -rw-r--r--. 1 root 384117 2013-05-21 11:30:17.619561000 -0400 argus_11:20:00.gz
>>>>>>>>>>>> -rw-r--r--. 1 root 181472 2013-05-21 11:41:00.721865476 -0400 argus_11:30:00
>>>>>>>>>>>> -rw-r--r--. 1 root 394783 2013-05-21 11:40:17.511247000 -0400 argus_11:30:00.gz
>>>>>>>>>>>> -rw-r--r--. 1 root 1421440 2013-05-21 11:48:55.328587722 -0400 argus_11:40:00
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Reverting rastream.sh to execute racluster in the $PATH (as original) and changing back to 'time 1h' as:
>>>>>>>>>>>> # kill -2 $(pgrep rastream)
>>>>>>>>>>>> # ./argus-clients-3.0.7.10/bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh -D4
>>>>>>>>>>>>
>>>>>>>>>>>> results in syslog:
>>>>>>>>>>>> May 21 12:05:37 ny-sentinel rastream[30905]: 2013-05-21 12:05:37.950988 started
>>>>>>>>>>>>
>>>>>>>>>>>> …and nothing from racluster (why and how?)), but the problem still exists:
>>>>>>>>>>>>
>>>>>>>>>>>> # ll --full-time /var/opt/argus/2013-05-21/ | tail -n 3
>>>>>>>>>>>> -rw-r--r--. 1 root 145612 2013-05-21 13:01:00.562634873 -0400 argus_12:00:00
>>>>>>>>>>>> -rw-r--r--. 1 root 2303564 2013-05-21 13:00:18.216161000 -0400 argus_12:00:00.gz
>>>>>>>>>>>> -rw-r--r--. 1 root 315520 2013-05-21 13:02:35.126692699 -0400 argus_13:00:00
>>>>>>>>>>>>
>>>>>>>>>>>> It's quite obvious that the size of the argus data has increased a very large amount (due to the debugging).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Does anything here indicate the problem?
>>>>>>>>>>>>
>>>>>>>>>>>> Is there anything else that I can do to further troubleshoot the issue?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Here's some freaky stuff:
>>>>>>>>>>>>
>>>>>>>>>>>> querying the file argus_12:00:00 for ltime doesn't correlate with the last modified time of the file:
>>>>>>>>>>>>
>>>>>>>>>>>> # ra -nr argus_12\:00\:00 -s stime ltime -w - | rasort -m stime -s stime ltime | head -n 3
>>>>>>>>>>>> StartTime LastTime
>>>>>>>>>>>> 2013-05-21 12:59:03.164195 2013-05-21 12:59:06.417385
>>>>>>>>>>>> 2013-05-21 12:59:03.369926 2013-05-21 12:59:03.501171
>>>>>>>>>>>>
>>>>>>>>>>>> # ra -nr argus_12\:00\:00 -s stime ltime -w - | rasort -m ltime -s stime ltime | tail -n 2
>>>>>>>>>>>> 2013-05-21 12:59:17.117235 2013-05-21 13:00:00.000000
>>>>>>>>>>>> 2013-05-21 12:59:28.896645 2013-05-21 13:00:00.000000
>>>>>>>>>>>>
>>>>>>>>>>>> # ra -nr argus_12\:00\:00 -s stime ltime | head -n 3
>>>>>>>>>>>> StartTime LastTime
>>>>>>>>>>>> 2013-05-21 12:05:38.036319 2013-05-21 09:52:57.532792 # this is weird.
>>>>>>>>>>>> 2013-05-21 12:59:27.970584 2013-05-21 12:59:36.325653
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> querying the file argus_12:00:00.gz:
>>>>>>>>>>>>
>>>>>>>>>>>> sort by stime:
>>>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s stime ltime -w - | rasort -m stime -s stime ltime | head -n 3
>>>>>>>>>>>> StartTime LastTime
>>>>>>>>>>>> 2013-05-21 11:59:44.064216 2013-05-21 12:34:17.996512
>>>>>>>>>>>> 2013-05-21 11:59:44.509320 2013-05-21 11:59:59.639822
>>>>>>>>>>>>
>>>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s stime ltime -w - | rasort -m stime -s stime ltime | tail -n 2
>>>>>>>>>>>> 2013-05-21 12:59:02.980166 2013-05-21 12:59:02.981078
>>>>>>>>>>>> 2013-05-21 12:59:02.980171 2013-05-21 12:59:02.981120
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> sort by ltime:
>>>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s ltime -w - | rasort -m ltime -s stime ltime | head -n 3
>>>>>>>>>>>> StartTime LastTime
>>>>>>>>>>>> 2013-05-21 11:59:46.214566 2013-05-21 11:59:46.314022
>>>>>>>>>>>> 2013-05-21 11:59:51.860594 2013-05-21 11:59:51.887852
>>>>>>>>>>>>
>>>>>>>>>>>> # ra -nr argus_12\:00\:00.gz -s ltime -w - | rasort -m stime ltime -s stime ltime | tail -n 2
>>>>>>>>>>>> 2013-05-21 12:00:00.000000 2013-05-21 13:00:00.000000
>>>>>>>>>>>> 2013-05-21 12:38:12.914597 2013-05-21 13:00:00.000000
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks again,
>>>>>>>>>>>>
>>>>>>>>>>>> Matt
>>>>>>>>>>>>
>>>>>>>>>>>> On May 21, 2013, at 10:49 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Yes you have to compile in debug support. We should turn on symbols, etc... as well. In the clients root directory, try this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> % touch .devel .debug
>>>>>>>>>>>>> % ./configure
>>>>>>>>>>>>> % make clean; make
>>>>>>>>>>>>>
>>>>>>>>>>>>> And use the resulting binaries for the testing.
>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>
>>>>>>>>>>>>> On May 21, 2013, at 10:08 AM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Carter,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Do settings need to be changed at compile time in order to get debugging features? I have started with -D4 and only the start time and forked time are logged. I have confirmed that rastream is appending to the proper file.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Just to let you know *latest* is still 3.0.7.9, not 3.0.7.10.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks very much!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On May 21, 2013, at 9:55 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Try the newest client distribution, just to see if any of the changes we
>>>>>>>>>>>>>>> made to date affect the problem. This may be a tough one to debug,
>>>>>>>>>>>>>>> but if I can recreate the issue, then we're on our way to happiness.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We'll need you to turn on debug on your daemon, the output should
>>>>>>>>>>>>>>> go to the syslog, so testing that briefly may help a bit. Run your rastream
>>>>>>>>>>>>>>> with the "-d -D4" options, and see if you get any debug info in your syslog.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On May 21, 2013, at 9:48 AM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Good morning Carter!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I wrote an init script and had some issues when starting rastream with the script targetting ~/.rarc even if I set $ARGUSHOME or $HOME within the script (on CentOS) (due to service's use of /bin/env).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So, the line I use within the init script is:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /usr/local/bin/rastream -d -F /etc/rastream.conf -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The same problem with timing occurs as had previously occur when relying on the full ~/.rarc.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here is an etherpad with relevant information:https://etherpad.mozilla.org/dmoSdfQ9H4
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> # rastream --version
>>>>>>>>>>>>>>>> Rastream Version 3.0.7.9
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks very much Carter!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On May 21, 2013, at 6:44 AM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hey Matt,
>>>>>>>>>>>>>>>>> So.......any chance you are setting the timezine in a rarc file somewhere, like /etc/rarc or in your yme directory ?
>>>>>>>>>>>>>>>>> What version are you running ?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On May 20, 2013, at 2:42 PM, Matt Brown <matthewbrown at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello All:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am having a problem with rastream that's manifested itself when using the -f "shell script executor" argument to rotate files at 'time 1h'.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If I run rastream as a daemon, then the script seems to run before the "hour" is over (and the "hour" is over at the incorrect time):
>>>>>>>>>>>>>>>>>> # rastream -d -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> A few hours' files look like:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> # ls --full-time /var/opt/argus/2013-05-18
>>>>>>>>>>>>>>>>>> total 3728
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 18752 2013-05-18 01:00:59.556839459 -0400 argus_00:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 160607 2013-05-18 01:00:17.793286000 -0400 argus_00:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 12068 2013-05-18 02:00:59.619364943 -0400 argus_01:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 163409 2013-05-18 02:00:17.943700000 -0400 argus_01:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 9032 2013-05-1803:01:00.579907536 -0400 argus_02:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 122920 2013-05-18 03:00:17.834317000 -0400 argus_02:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 22092 2013-05-18 04:01:00.698357771 -0400 argus_03:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 122002 2013-05-18 04:00:17.835675000 -0400 argus_03:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 17704 2013-05-18 05:01:00.450618851 -0400 argus_04:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 133212 2013-05-18 05:00:17.742040000 -0400 argus_04:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 14592 2013-05-18 06:00:54.886285774 -0400 argus_05:00:00
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 160523 2013-05-18 06:00:17.562776000 -0400 argus_05:00:00.gz
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It looks like the gzipped file is last modified before the hour file, which leads me to believe that rastream isn't finished writing to the argus file before -f[] is executed.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If I run rastream as follows, I have no problem:
>>>>>>>>>>>>>>>>>> # nohup rastream -S 127.0.0.1:561 -B 15s -M time 1h -w /var/opt/argus/%Y-%m-%d/argus_%T -f /usr/local/bin/rastream.sh &
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> A few hours' files look like:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> # ls --full-time /var/opt/argus/2013-05-20
>>>>>>>>>>>>>>>>>> total 5372
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 217245 2013-05-20 10:00:17.573908000 -0400 argus_09:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 6377 2013-05-2011:00:17.762140000 -0400 argus_10:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 9269 2013-05-2011:38:08.879810000 -0400 argus_11:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 313 2013-05-2013:00:17.170958000 -0400 argus_12:00:00.gz
>>>>>>>>>>>>>>>>>> -rw-r--r--. 1 root 8965 2013-05-2014:00:17.540889000 -0400 argus_13:00:00.gz
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> (early day over there)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have verified that the system time is correct and ntpd is running properly.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Clients are 3.0.7.9.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Carter, do you have any ideas?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Matt
>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20130604/68bf698e/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 6837 bytes
Desc: not available
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20130604/68bf698e/attachment.bin>
More information about the argus
mailing list