possible radium issue

Carter Bullard carter at qosient.com
Thu Jul 9 10:57:51 EDT 2009


Hey Phillip,
Well, this is really screwy.  One thing to try is to run rasplit()
against the local file, to see what it does with the data.  I'm trying
to pinpoint whether the problem is with radium(), rasplit() or the
connection between the two.

Looking at the stats you're sending, it doesn't look like this is a
loaded system.

Let's fix this, as radium() and rasplit() are really important programs,
and this type of error isn't going to cut it.

rasplit() will "cut" records if they cross the time boundary, so if a
record spans 12:00 noon (starts at 11:59:59 and ends at 12:00:01), we'll
generate two records for that one.
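
To make the "cut" concrete, here is a minimal sketch of splitting one
time range at fixed boundaries.  It is not rasplit()'s code: the function
name and the use of plain epoch seconds are assumptions for illustration,
and it only covers the time range, not how any counters in the record
would be divided.

```python
def split_at_boundaries(start, end, bin_seconds):
    """Cut the range [start, end] at every multiple of bin_seconds.

    start and end are epoch seconds; the name and signature are
    hypothetical, chosen only for this illustration.
    """
    if start == end:
        # A zero-length record sits in exactly one bin.
        return [(start, end)]
    pieces = []
    cursor = start
    while cursor < end:
        # First boundary strictly after the cursor.
        next_boundary = (cursor // bin_seconds + 1) * bin_seconds
        piece_end = min(end, next_boundary)
        pieces.append((cursor, piece_end))
        cursor = piece_end
    return pieces


noon = 12 * 3600
# A record running from 11:59:59 to 12:00:01, with hourly bins.
print(split_at_boundaries(noon - 1, noon + 1, 3600))
# prints [(43199, 43200), (43200, 43201)]
```

So a record count taken over split files can be somewhat higher than a
count over the unsplit data, since each boundary crossing adds a piece.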

One thing to check is whether rasplit() is generating a file somewhere
else in your file system, say if the "srcid" is screwy, or if time goes
to zero.  Are the dates in your file names looking alright?
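
One way to run that check is to scan the output paths for dates outside
the window you expect.  The sketch below assumes the paths contain a
/YYYY/MM/DD/ component; the function name, the example paths and that
layout are all hypothetical, so adjust the pattern to whatever your
target path actually looks like.

```python
import re
from datetime import date

# Assumed layout: .../YYYY/MM/DD/... somewhere in the path.
DATE_DIRS = re.compile(r"/(\d{4})/(\d{2})/(\d{2})/")


def suspicious_paths(paths, earliest, latest):
    """Return paths with no date, an invalid date, or a date out of range."""
    flagged = []
    for path in paths:
        match = DATE_DIRS.search(path)
        if match is None:
            flagged.append(path)
            continue
        year, month, day = map(int, match.groups())
        try:
            when = date(year, month, day)
        except ValueError:
            # e.g. month 00 or day 32
            flagged.append(path)
            continue
        if not (earliest <= when <= latest):
            flagged.append(path)
    return flagged


# Hypothetical paths, for illustration only.
paths = [
    "/archive/s1/2009/07/09/argus.08.40.00",
    "/archive/s1/1970/01/01/argus.00.00.00",
    "/archive/argus.out",
]
print(suspicious_paths(paths, date(2009, 7, 1), date(2009, 7, 31)))
# prints ['/archive/s1/1970/01/01/argus.00.00.00', '/archive/argus.out']
```

A file dated 1970 would be what you'd expect to see if a record's time
went to zero.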

I would recommend that you add a few more directories in your target
path.  Unix has a bad performance issue when the number of files per
directory gets above, say, 200 or so.  That's why I add a %Y/%m/%d for
the slices, so that the file count doesn't get too high.
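
As an illustration of how %Y/%m/%d spreads the slices over per-day
directories, here is a small sketch that expands such a path for a given
timestamp.  The root directory, the source id and the file name pattern
are assumptions made up for the example, not a required layout.

```python
import calendar
import posixpath
import time


def slice_path(root, srcid, epoch_seconds):
    """Build root/srcid/YYYY/MM/DD/argus.<timestamp> for one slice.

    All names here are hypothetical; only the %Y/%m/%d directory
    scheme comes from the advice above.
    """
    stamp = time.gmtime(epoch_seconds)  # UTC keeps the example deterministic
    day_dirs = time.strftime("%Y/%m/%d", stamp)
    filename = time.strftime("argus.%Y.%m.%d.%H.%M.%S", stamp)
    return posixpath.join(root, srcid, day_dirs, filename)


when = calendar.timegm((2009, 7, 9, 8, 40, 0))
print(slice_path("/archive", "sensor1", when))
# prints /archive/sensor1/2009/07/09/argus.2009.07.09.08.40.00
```

With 10 minute slices that is 144 files per day directory, which stays
under the 200 or so mentioned above.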

Carter


On Jul 9, 2009, at 10:46 AM, Phillip Deneault wrote:

> Carter Bullard wrote:
>> Just need to find the bad file, and then try to figure out how it  
>> got corrupted
>> (at least that is my guess).
>
> So, assuming all those files were bad, and starting fresh with this
> morning's data, I get worse, confusing results.  Clustered is still
> being run without the '-M norep' and WithoutClustering is still just
> the straight racount.  Data from the 8 o'clock hour:
>
>                                  Clustered   WithoutClustering
> An hour of slices                756052      1250780
> An hourly file                   81          82
>
> Locally generated file (-t 08)   25745       394546
>
> My gut instinct is that the locally generated file is correct again,  
> but I can't explain how an hour of slices yields _more_ records than  
> the locally generated file, especially when clustered.
>
> So, I broke them down....
>
> File   Clustered   WithoutClustering
> 00     28          28
> 10     42          42
> 20     73          75
> 30     126         126
> 40     755800      1250438
> 50     71          71
>
> In looking at the locally generated file manually, it appears the  
> distribution of flows over the hour has no such giant peak.
>
> Phil
>
>



