possible radium issue
Carter Bullard
carter at qosient.com
Thu Jul 9 10:57:51 EDT 2009
Hey Phillip,
Well this is really screwy. One thing to try is to run rasplit() against
the local file, to see what it does with the data. I'm trying to pinpoint
if the problem is with radium(), rasplit() or the connection between the
two.
Looking at the stats you're sending, it doesn't look like this is a loaded
system.
Let's fix this, as radium() and rasplit() are really important programs,
and this type of error isn't going to cut it.
rasplit() will "cut" records if they cross the time boundary, so if a
record spans 12:00 noon (starts at 11:59:59 and ends at 12:00:01), we'll
generate two records for that one.
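
To make the "cut" concrete, here is a minimal sketch of splitting one
record's time span at bin boundaries. This is not rasplit()'s actual code:
the function name, the 5-minute bin size, and the use of plain seconds since
midnight are all assumptions for illustration, and it says nothing about how
packet and byte counts are divided between the pieces.

```python
def split_at_boundaries(start, end, bin_size):
    """Cut the interval [start, end] (in seconds) at every multiple of bin_size.

    Hypothetical helper: illustrates the idea only, not rasplit() internals.
    """
    if start >= end:
        # A zero-duration record falls in a single bin and is never cut.
        return [(start, end)]
    pieces = []
    cur = start
    while cur < end:
        # First bin boundary strictly after the current position.
        next_boundary = (cur // bin_size + 1) * bin_size
        stop = min(end, next_boundary)
        pieces.append((cur, stop))
        cur = stop
    return pieces


# A record from 11:59:59 to 12:00:01, in seconds since midnight,
# with an assumed 5-minute (300 second) bin size.
noon = 12 * 3600
pieces = split_at_boundaries(noon - 1, noon + 1, 300)
print(pieces)  # [(43199, 43200), (43200, 43201)]
```

So a single input record that straddles a boundary shows up as two output
records, which is one reason a raw record count over slices can be somewhat
higher than a count over the unsplit file.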
One thing to check is whether rasplit() is generating a file somewhere else
in your file system, say if the "srcid" is screwy, or if time goes to zero.
Are your dates in the file name looking alright?
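
One rough way to look for strays is to scan the output file names for dates
that don't belong. The sketch below assumes the file names embed a
YYYY.MM.DD date (as they would with a %Y.%m.%d pattern in the output name);
the function name and the example paths are made up for illustration.

```python
import re

# Assumed naming convention: a YYYY.MM.DD date somewhere in the path.
DATE_RE = re.compile(r"(\d{4})\.(\d{2})\.(\d{2})")


def suspicious_names(paths, expected_year):
    """Return paths with no recognizable date or a date in an unexpected year."""
    flagged = []
    for path in paths:
        match = DATE_RE.search(path)
        if match is None or int(match.group(1)) != expected_year:
            flagged.append(path)
    return flagged


# Hypothetical file names; a timestamp of zero would show up as 1970.
example_paths = [
    "archive/argus.2009.07.09.08.40.00",
    "archive/argus.1970.01.01.00.00.00",
    "archive/argus.out",
]
flagged = suspicious_names(example_paths, 2009)
print(flagged)
```

In practice the list of paths could come from walking the file system with
os.walk() starting above the expected archive directory.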
I would recommend that you add a few more directories in your target path.
Unix has a bad performance issue when the number of files per directory
gets above, say, 200 or so. That's why I add a %Y/%m/%d for the slices, so
that the file count doesn't get too high.
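
As an illustration of how a %Y/%m/%d layout spreads the slices out, here is
a small sketch that builds the target path for one record. The root
directory, the srcid value, the file name pattern, the 5-minute bin size, and
the use of UTC are all assumptions for the example, not a description of
what rasplit() does internally.

```python
import calendar
import time


def slice_path(root, srcid, start_epoch, bin_size=300):
    """Build a per-day directory path for the time bin containing start_epoch.

    Hypothetical helper: the layout and UTC time base are assumed.
    """
    # Round the start time down to the beginning of its bin.
    bin_start = int(start_epoch) // bin_size * bin_size
    stamp = time.gmtime(bin_start)
    # One directory level each for year, month, and day.
    dated = time.strftime("%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S", stamp)
    return root + "/" + srcid + "/" + dated


# A record starting at 08:42:17 UTC on Jul 9, 2009.
start = calendar.timegm((2009, 7, 9, 8, 42, 17, 0, 0, 0))
path = slice_path("/archive", "sensor1", start)
print(path)  # /archive/sensor1/2009/07/09/argus.2009.07.09.08.40.00
```

With this kind of layout each day's slices land in their own directory, so
the number of files in any one directory is bounded by the number of bins
per day instead of growing without limit.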
Carter
On Jul 9, 2009, at 10:46 AM, Phillip Deneault wrote:
> Carter Bullard wrote:
>> Just need to find the bad file, and then try to figure out how it
>> got corrupted
>> (at least that is my guess).
>
> So, assuming all those files were bad, and starting fresh with this
> morning's data, I get worse, confusing results. Clustered is still
> being run without the '-M norep' and WithoutClustering is still just
> the straight racount. Data from the 8 o'clock hour:
>
>                                  Clustered   WithoutClustering
> An hour of slices                   756052             1250780
> An hourly file                          81                  82
>
> Locally generated file (-t 08)       25745              394546
>
> My gut instinct is that the locally generated file is correct again,
> but I can't explain how an hour of slices yields _more_ records than
> the locally generated file, especially when clustered.
>
> So, I broke them down....
>
> File   Clustered   WithoutClustering
> 00            28                  28
> 10            42                  42
> 20            73                  75
> 30           126                 126
> 40        755800             1250438
> 50            71                  71
>
> In looking at the locally generated file manually, it appears the
> distribution of flows over the hour has no such giant peak.
>
> Phil
>
>