rasplit crash

Jesse Bowling jessebowling at gmail.com
Wed Mar 26 08:35:38 EDT 2014


Here's the rasplit invocation I use:

/usr/local/bin/rasplit -M time 5m -S 127.0.0.1:561 \
    -w /nsm/argus/data/\$srcid/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S -d
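
To illustrate what "-M time 5m" combined with a strftime-style "-w" template amounts to, here is a rough Python sketch of the arithmetic. It is not rasplit's actual code: the function names are made up for illustration, the record time is a stand-in taken from the debug timestamps further down, and \$srcid expansion is left out because it is not a strftime field.

```python
from datetime import datetime, timedelta


def bin_start(ts: datetime, bin_size: timedelta) -> datetime:
    """Floor ts to the start of its time bin, counting bins from midnight."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = ts - midnight
    # timedelta // timedelta yields the number of whole bins elapsed.
    return midnight + (elapsed // bin_size) * bin_size


def output_name(template: str, ts: datetime, bin_size: timedelta) -> str:
    """Expand the strftime fields in template using the bin start time."""
    return bin_start(ts, bin_size).strftime(template)


# Hypothetical record time, borrowed from the debug log timestamps.
record_time = datetime(2014, 3, 26, 8, 33, 42)
print(output_name("/tmp/a.%Y.%m.%d.%H.%M.%S", record_time, timedelta(minutes=5)))
```

With a 5 minute bin, a record at 08:33:42 lands in the bin starting at 08:30:00, which matches the /tmp/a.2014.03.26.08.30.00 file name in the debug output below.
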

I can recall only a few recent changes on this host:

* Upgraded argus and argus-clients to the latest (3.0.7.5 and 3.0.7.19)
from 3.0.7.2 and 3.0.7.9
* Changed the rasplit source from "-S publicIP" to "-S 127.0.0.1"
* Added the "-d" flag (previously just using &)

Thinking that something in the startup sequence might jump out at you, I
recompiled with .debug and captured the following. If there's anything
else I can do to help, please let me know.
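
One more check that might help narrow this down: pthread_create returns EAGAIN when the system lacks the resources to create another thread or a system-imposed thread limit is hit, so watching the thread count of the running rasplit over time could show whether threads are piling up. Below is a minimal sketch, assuming a Linux host where /proc/<pid>/status exposes a "Threads:" field. The helper names are hypothetical and this is not part of the Argus tools.

```python
import re


def parse_thread_count(status_text: str) -> int:
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    match = re.search(r"^Threads:\s+(\d+)$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no Threads: field found")
    return int(match.group(1))


def thread_count(pid: int) -> int:
    """Read the current thread count of a process (Linux only, assumed)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_thread_count(f.read())


# Made-up sample text in the /proc status layout, so the parser can be
# tried without a live process.
sample = "Name:\trasplit\nThreads:\t3\nSigQ:\t0/1\n"
print(parse_thread_count(sample))
```

Calling thread_count() with the rasplit pid from a periodic job and logging the result would show whether the count climbs steadily before the crash.
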

Cheers,

Jesse

# /usr/local/bin/rasplit -D 8 -M time 5m -S 127.0.0.1:561 -w /tmp/a
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.494934 ArgusCalloc (1,
16) returning 0x1e17450
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.494967 ArgusAddModeList
(time) returning 1
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.494976 ArgusCalloc (1,
16) returning 0x1e17410
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.494984 ArgusAddModeList
(5m) returning 1
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495057 ArgusCalloc (1,
461728) returning 0x5df8a010
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495069 ArgusAddHostList
(0x5dffb010, 127.0.0.1:561, 1, 6) returning 1
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495082 ArgusCalloc (1,
144) returning 0x1e167b0
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495092 ArgusNewList ()
returning 0x1e167b0
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495101 ArgusCalloc (1,
296) returning 0x1e17260
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495108
ArgusPushFrontList (0x1e167b0, 0x1e17260, 1) returning 0xc685
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495147 ArgusCalloc (1,
296) returning 0x1e17620
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495155
ArgusPushFrontList (0x1e167b0, 0x1e17620, 1) returning 0xc685
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495163 ArgusClientInit()
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495170 main: reading
files completed
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495177 ArgusCalloc (1,
72) returning 0x1e16870
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495184 ArgusNewQueue ()
returning 0x1e16870
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495211 Trying 127.0.0.1
port 561 Expecting Argus records
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495287 connected
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.495295
ArgusGetServerSocket (0x7f8a5df8a010) returning 3
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.508947
ArgusReadConnection() read 16 bytes
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.508988 ArgusCalloc (1,
4194304) returning 0x5cbef010
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509001 ArgusCalloc (1,
262144) returning 0x5df49010
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509362
ArgusParseInit(0x7f8a5dffb010 0x7f8a5df8a010
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509376
ArgusWriteConnection: write(3, 0x21b372a0, 7)
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509401
ArgusWriteConnection(0x5df8a010, 0x21b372a0, 7) returning 7
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509410
ArgusReadConnection(0x5df8a010, 2) returning 1
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509435 RaProcessRecord
(0x5df8a630) done
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509442 RaScheduleRecord
(0x7f8a5dffb010, 0x7f8a5df8a630) scheduled
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509451 ArgusHandleDatum
(0x7f8a5df8a228, 0x7f8a5e07c7a8) returning 128
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509463 ArgusFree
(0x1e16870)
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509470 ArgusDeleteQueue
(0x1e16870) returning
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.509479
ArgusReadStream(0x7f8a5dffb010) starting
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553196
ArgusReadStreamSocket (0x7f8a5df8a010) read 228 bytes
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553265 ArgusCalloc (1,
384) returning 0x1e17750
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553282 ArgusCalloc (1,
12) returning 0x1e173f0
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553292 ArgusCalloc (1,
80) returning 0x1e16870
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553302 ArgusCalloc (1,
36) returning 0x1e178e0
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553312 ArgusCalloc (1,
52) returning 0x1e17910
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553322 ArgusCalloc (1,
80) returning 0x1e17950
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553330 ArgusCalloc (1,
120) returning 0x1e179b0
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553338 ArgusCalloc (1,
20) returning 0x1e17a30
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553345 ArgusCalloc (1,
28) returning 0x1e17a50
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553352 ArgusCalloc (1,
12) returning 0x1e17a80
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553360 ArgusAlignRecord
() returning 0x1e17750
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553386
RaProcessSplitOptions(/tmp/a.2014.03.26.08.30.00, 4096, 0x1e17750): returns
0
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553414
ArgusInitNewFilename(0x5dffb010, 0x1e17620, /tmp/a.2014.03.26.08.30.00) done
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553437
ArgusGenerateRecord (0x1e17750, 0) len 308
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553478
ArgusWriteNewLogfile (/tmp/a.2014.03.26.08.30.00, 0x21b270d0) fwrite 308
bytes
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553486
ArgusWriteNewLogfile (/tmp/a.2014.03.26.08.30.00, 0x21b270d0) returning 0
rasplit[50821.00370a5e8a7f0000]: 03/26/14 08:33:42.553494 RaSendArgusRecord
() returning 1
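
For reference, the interval between the two EAGAIN crashes quoted further down can be worked out from the syslog lines themselves. The sketch below is only an illustration: the two lines are copied from this thread with the mail line wrapping removed, and the regular expression is my own guess at a pattern that matches them, not an Argus log format definition.

```python
import re
from datetime import datetime

# Matches the rasplit timestamp that precedes the EAGAIN message.
STAMP = re.compile(
    r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) main:\s+"
    r"pthread_create ArgusConnectRemotes: EAGAIN"
)


def eagain_times(lines):
    """Return the timestamps of all pthread_create EAGAIN messages."""
    times = []
    for line in lines:
        m = STAMP.search(line)
        if m:
            times.append(datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S.%f"))
    return times


log = [
    "Mar 15 16:26:10 rasplit[2987]: 03/15/14 16:26:10.939859 main: "
    "pthread_create ArgusConnectRemotes: EAGAIN",
    "Mar 25 03:38:45 host rasplit[16519]: 03/25/14 03:38:45.687977 main: "
    "pthread_create ArgusConnectRemotes: EAGAIN",
]
first, second = eagain_times(log)
gap = second - first
print(gap)
```

The gap comes out at roughly nine and a half days, so if something is slowly accumulating it takes well over a week to hit the limit.
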



On Wed, Mar 26, 2014 at 7:37 AM, Jesse Bowling <jessebowling at gmail.com> wrote:

> Hi Carter,
>
> I'm connecting directly to argus, but I'll have to verify later whether
> it's via the public address or the local host address...
>
> Cheers,
>
> Jesse
>
> On Mar 26, 2014, at 12:14 AM, Carter Bullard <carter at qosient.com> wrote:
>
> we'll use the reliable connection strategy if we're connecting to the
> localhost, so that's not the problem.  just need to figure out how you're
> looping through the reliable connection logic if the connection is good.
> are you connecting to radium or argus ???
>
> carter
>
> On Mar 25, 2014, at 11:34 PM, Jesse Bowling <jessebowling at gmail.com>
> wrote:
>
> In this case I'm actually using rasplit to connect to a local instance, so
> it seems unlikely...Is this a less-than-optimal configuration? My thought
> process was that using an rasplit process to write files locally would use
> fewer resources on the argus side than having argus write files directly.
>
> If the setup is ok, then perhaps the error message is a red herring and we
> need more debug information. If so, what level of debug might provide the
> best information?
>
> Cheers,
>
> Jesse
>
>
> On Tue, Mar 25, 2014 at 11:27 PM, Carter Bullard <carter at qosient.com> wrote:
>
>> Hey Jesse,
>> So, if we believe the error messages, you're spawning too many threads to
>> try to reestablish a connection to one of your remote argus data sources.
>> Does that sound plausible ??
>>
>> Carter
>>
>> On Mar 25, 2014, at 8:50 AM, Jesse Bowling <jessebowling at gmail.com>
>> wrote:
>>
>> I'll bump this, as the issue occurred again last night. It would seem
>> that whatever the problem is, it doesn't occur too often (about 10 days
>> between crashes), but it's certainly troubling. Again the error message was:
>>
>> Mar 25 03:38:45 host rasplit[16519]: 03/25/14 03:38:45.687977 main:
>> pthread_create ArgusConnectRemotes: EAGAIN
>>
>> Any help on tracking this down? Do I need to run with debug for a while?
>>
>> Cheers,
>>
>> Jesse
>>
>>
>> On Sat, Mar 15, 2014 at 7:19 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I had an rasplit process crash on me, and the only indication in the
>>> logs was:
>>>
>>> Mar 15 16:26:10 rasplit[2987]: 03/15/14 16:26:10.939859 main:
>>> pthread_create ArgusConnectRemotes: EAGAIN
>>>
>>> Any hints on troubleshooting this?
>>>
>>> Thanks,
>>>
>>> Jesse
>>>
>>> --
>>> Jesse Bowling
>>>
>>>
>>
>>
>> --
>> Jesse Bowling
>>
>>
>>
>
>
> --
> Jesse Bowling
>
>


-- 
Jesse Bowling