ragraph and unsorted files

Carter Bullard carter at qosient.com
Mon May 2 12:47:42 EDT 2011


Hey Rafael,
ragraph() is just a front end to rabins(), and so any problems will be caused by rabins().

I think this is a bug, so I'll take a look to see what I can do.  rabins() is our time-series
engine, so it has a lot of bells and whistles in it.  ragraph() doesn't need all the stuff that
makes rabins() complicated, so it may be that there is a better strategy.

The reason there is some complexity to the problem is that we want the approach
rabins() uses for bin management to work with both streaming data and
file-based data.  With infinite streaming data, you need to be concerned with memory
management, so our strategy is to have a "sliding window" type of data processing
for aggregation, etc.  As you suggested, the problem is that rabins() is not allowing for
a large window when processing files.
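To make the failure mode concrete, here is a minimal sketch of sliding-window binning. It is not the rabins() code; the function name, the (stime, value) record shape, and the window parameters are all assumptions made for illustration. It only shows why a window anchored at the first record seen will lose records that arrive later in the file but are earlier in time.

```python
def bin_with_sliding_window(records, bin_size, window_bins):
    """Hypothetical sketch, not the rabins() implementation.

    records: iterable of (stime, value) pairs with integer start times.
    Returns (totals, dropped): per-bin totals keyed by bin start time,
    and the list of records that fell behind the window.
    """
    open_bins = {}      # bins still inside the window
    flushed = {}        # bins already pushed out of the window
    dropped = []
    window_start = None

    for stime, value in records:
        b = stime - (stime % bin_size)
        if window_start is None:
            window_start = b            # first record anchors the window
        if b < window_start:
            dropped.append((stime, value))   # too old: bin is gone
            continue
        # Slide the window forward until the record's bin fits in it.
        while b >= window_start + window_bins * bin_size:
            if window_start in open_bins:
                flushed[window_start] = open_bins.pop(window_start)
            window_start += bin_size
        open_bins[b] = open_bins.get(b, 0) + value

    flushed.update(open_bins)
    return flushed, dropped


unsorted = [(100, 1), (105, 1), (40, 1), (130, 1)]
totals, dropped = bin_with_sliding_window(unsorted, bin_size=10, window_bins=2)
```

With the unsorted input above, the record at time 40 ends up in `dropped` because the window was anchored at the bin for time 100. A bounded window is what keeps memory use fixed on an endless stream, which is the trade-off being described.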

If you were to give ragraph() an explicit time range to graph, this problem
would go away.

So, the client library supports the notion of multi-pass processing of files.
If you look at the source code, all clients have a variable ArgusPassNum, and if,
in your own client's initialization routine, you set that to 2, as an example, we
would process the input file list twice.  I could use that to simply scan the data from the
file list on the first pass to set the time series start and stop times, and then run the data
through again to tally the results, but the performance can be pretty bad if I do that
as a general strategy.  Still, that would be faster than having to sort the data prior to graphing it.
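The two-pass idea can be sketched roughly as follows. This is an illustration only, not the client library's API: `two_pass_bins`, the `read_records` callable, and the (stime, value) record shape are hypothetical names, and only ArgusPassNum comes from the actual source. The first pass finds the start and stop times, and the second pass tallies into bins that already span the whole range, so record order no longer matters.

```python
def two_pass_bins(read_records, bin_size):
    """Hypothetical sketch of two-pass binning.

    read_records: a callable returning a fresh iterator of (stime, value)
    pairs each time it is called, standing in for reading the input file
    list once per pass.
    """
    # Pass 1: scan only for the time series start and stop times.
    start = stop = None
    for stime, _ in read_records():
        if start is None or stime < start:
            start = stime
        if stop is None or stime > stop:
            stop = stime
    if start is None:
        return {}                       # no records at all

    first_bin = start - (start % bin_size)
    last_bin = stop - (stop % bin_size)
    totals = {b: 0 for b in range(first_bin, last_bin + bin_size, bin_size)}

    # Pass 2: tally every record; every bin already exists.
    for stime, value in read_records():
        totals[stime - (stime % bin_size)] += value
    return totals


unsorted = [(100, 1), (105, 1), (40, 1), (130, 1)]
series = two_pass_bins(lambda: iter(unsorted), bin_size=10)
```

The cost is that the input is read twice, which is the performance concern raised above. Sorting first would also need at least one full read plus the sort itself.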

I'll look to see if this is a bug, or a feature.  How wildly out of order are the records?

Carter


On May 2, 2011, at 11:26 AM, Rafael Barbosa wrote:

> Hi all,
> 
> I ran into something today that might be considered a bug: ragraph does not handle files that are not ordered by 'stime' well. Basically, it seems that ragraph uses the info from the first record to initialize the time series, so flows that are earlier in time (but later in the file) are ignored, or at least erroneously processed.
> 
> I uploaded the file 'ragraph-unsorted.zip' to ftp://qosient.com/incoming; it contains an example.
> 
> An easy workaround is to make sure that the file is ordered, with rasort(), before using ragraph. E.g.:
> rasort -m stime -r flows.argus -w sorted.argus
> 
> Best regards,
> Rafael Barbosa
> http://www.vf.utwente.nl/~barbosarr/
> 
