rasplit dies with a floating point divide error when processing a large file

Carter Bullard carter at qosient.com
Thu Oct 18 04:36:35 EDT 2018


We have found that some VMs' disk buffering strategies can result in file corruption for some clients ... maybe an issue here ... are you using VMware ???  I'm traveling and will not be back for a week or so ... will consult and try to get back ...
Carter

	 	
Carter Bullard • CTO
150 E 57th Street Suite 12D
New York, New York 10022-2795
Phone +1.212.588.9133 • Mobile +1.917.497.9494

> On Oct 16, 2018, at 10:36 PM, Russell Fulton <r.fulton at auckland.ac.nz> wrote:
> 
> Another data point: I tried running rasplit on the output of another sensor file, which is a bit smaller, with the same result; it ran for about half an hour and then crashed.
> 
> I am copying the files off onto another machine and will try splitting them there.  I need to free up the disk anyway!
> 
>> On 17/10/2018, at 8:00 AM, Russell Fulton <r.fulton at auckland.ac.nz> wrote:
>> 
>> 
>> 
>>> On 16/10/2018, at 9:23 PM, Carter Bullard <carter at qosient.com> wrote:
>>> 
>>> Yes, I use it all the time for this purpose.  How are you splitting, size or date ??
>> 
>> Time, and I figured that it was intended to work for cases like this.
>> 
>> I can do this another way as I was running the process in parallel.  The one giving trouble was the “new” one so I have most of the data on the old one.  The catch is that we shut down the VM yesterday — I have logged a call to the VMware team to crank it up again so I can scp the files across.
>> 
>>> Can you send the complete commandline ??
>> 
>> 
>> rasplit -r /data/argus/data/dmzo.big -t 2018/10/04.16:00-2018/10/16  -M time 1h -w "%Y/%m/%d/dmzo.%y.%m.%d.%H.%M.%S"
>> 
>> argus at secmgrprd02:~$ ls -lh /data/argus/data/dmzo.big 
>> -rw-r--r-- 1 argus argus 516G Oct 15 08:59 /data/argus/data/dmzo.big
>> 
>> 
>> Since the file is so big it really is not feasible to send it to you but we can at least compile it with symbols and set things up to get a dump that can be analysed.  Alternatively we can apply diagnostic patches.
>> 
>> BTW, as you probably know, I am now down to 3 days a week (Mon - Wed) and next weekend is a long weekend *and* I will be out of town for at least the next 4 days so it will be next week before I get back to this.
>> 
>> Russell
>> 
>>> Carter
>>> 
>>>        
>>> Carter Bullard • CTO
>>> 150 E 57th Street Suite 12D
>>> New York, New York 10022-2795
>>> Phone +1.212.588.9133 • Mobile +1.917.497.9494
>>> 
>>>> On Oct 16, 2018, at 6:07 AM, Russell Fulton <r.fulton at auckland.ac.nz> wrote:
>>>> 
>>>> The log rollover process broke and I was left with some large flow files, so I tried to break them up with rasplit, but it died about a third of the way through the file...
>>>> 
>>>> rful011 at secmgrprd02:/usr/local/tools/notify-framework$ sudo ls -lh /data/argus/data/dmzo.big
>>>> -rw-r--r-- 1 argus argus 516G Oct 15 08:59 /data/argus/data/dmzo.big
>>>> 
>>>> [1]+  Floating point exception(core dumped) rasplit -r /data/argus/data/dmzo.big -M time 1h -w "%Y/%m/%d/dmzo.%H.%M.%S"
>>>> 
>>>> 
>>>> Oct 15 18:07:36 secmgrprd02 kernel: [4584251.907460] traps: rasplit[5534] trap divide error ip:556d5ebf7bc4 sp:7ffc39708be0 error:0 in rasplit[556d5eb89000+b3000]
>>>> 
>>>> I tried re-running the job with a -t parameter to start processing at the beginning of the hour in which it crashed, with the same result; this run did output some data.
>>>> 
>>>> Tried again, this time with -t starting at the hour after the point where it crashed; it still dies and produces no output.
>>>> 
>>>> I conclude the problem is something to do with reading the file, not the splitting.
>>>> 
>>>> Has anyone used rasplit on really big files?   This one is weeks of data and is nearly 500GB.
>>>> 
>>>> Russell
>> 
> 
> 


More information about the argus mailing list