reading argus files using non-sequential access
Karl Tatgenhorst
karlt at uchicago.edu
Thu Feb 1 10:27:10 EST 2007
The files I generate are large (> 2GB) and I can see using the offsets
as a tremendous boon for reanalyzing data, but only if I have a way of
displaying the offset originally. Then when someone wants to look at my
data they can start at the point in the file where the actual data shows
up, shaving seconds off their seek time.
Karl
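
A minimal sketch of what "displaying the offset" could look like: one sequential pass that notes the byte offset of every record, so a later reader can seek straight to it. The record layout used here (a 4-byte length, an 8-byte start time, then a payload) is a made-up stand-in and NOT the Argus record format, and pack_record and build_offset_index are hypothetical helpers, not part of any ra* program.

```python
import io
import struct

# Hypothetical layout, NOT the real Argus format:
# 4-byte big-endian total record length, 8-byte big-endian start time, payload.
HEADER = struct.Struct(">IQ")

def pack_record(start_time, payload):
    """Build one record in the made-up layout above."""
    return HEADER.pack(HEADER.size + len(payload), start_time) + payload

def build_offset_index(stream):
    """Scan the stream once and return a list of (byte_offset, start_time)."""
    index = []
    while True:
        offset = stream.tell()
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of file, or a truncated trailing record
        length, start_time = HEADER.unpack(header)
        if length < HEADER.size:
            raise ValueError(f"bad record length at offset {offset}")
        index.append((offset, start_time))
        stream.seek(offset + length)  # jump to the next record boundary
    return index

# Three sample records with payloads of 3, 10 and 0 bytes.
data = b"".join(pack_record(t, b"x" * n) for t, n in [(100, 3), (160, 10), (220, 0)])
index = build_offset_index(io.BytesIO(data))
for offset, start_time in index:
    print(offset, start_time)
```

Saving a list like this next to the data file is one way to hand someone the starting point along with the data.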
On Thu, 2007-02-01 at 11:20 +0000, carter at qosient.com wrote:
> Hey Mark,
> Byte offset access causes the ra* program to fseek() ostart bytes before it starts to process the records. Without this option, argus reads the entire file to find records to process.
>
> The "-t" option isn't a "time offset"; it's more like a "time range". With "-t" by itself, ra will sequentially read the entire file, testing each record's bounds. You have to do this if the file isn't sorted at all. The byte offset option lets you skip (potentially huge) chunks of the file, if you know that you can and where to skip to.
>
> So yes, if the file is large (>> .5M), the speedups can be significant!!
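
To make the contrast concrete, here is a rough sketch (not the ra* implementation) of the two approaches: a time-range scan that visits every record, and a byte-offset read that seeks to ostart and stops at oend. It assumes a made-up record layout (4-byte length, 8-byte start time, payload) that is NOT the Argus format; read_records and scan_time_range are hypothetical names.

```python
import io
import struct

# Hypothetical layout, NOT the real Argus format:
# 4-byte big-endian total record length, 8-byte big-endian start time, payload.
HEADER = struct.Struct(">IQ")

def read_records(stream, ostart=0, oend=None):
    """Yield (offset, start_time) for records starting in [ostart, oend)."""
    stream.seek(ostart)  # skip everything before ostart without reading it
    while oend is None or stream.tell() < oend:
        offset = stream.tell()
        header = stream.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of file, or a truncated trailing record
        length, start_time = HEADER.unpack(header)
        if length < HEADER.size:
            raise ValueError(f"bad record length at offset {offset}")
        stream.seek(offset + length)
        yield offset, start_time

def scan_time_range(stream, t_start, t_end):
    """Sequential approach: visit every record and test its time bounds."""
    return [(off, t) for off, t in read_records(stream) if t_start <= t < t_end]

# Four 16-byte sample records at offsets 0, 16, 32 and 48.
records = [HEADER.pack(HEADER.size + 4, t) + b"data" for t in (100, 160, 220, 280)]
stream = io.BytesIO(b"".join(records))

by_time = scan_time_range(stream, 160, 280)                  # reads all four records
by_offset = list(read_records(stream, ostart=16, oend=48))   # reads only two
print(by_time == by_offset)
```

Both calls select the same two records, but the offset read never touches the data outside the requested byte range, which is where the savings on a large file would come from.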
>
> Carter
>
> Carter Bullard
> QoSient LLC
> 150 E. 57th Street Suite 12D
> New York, New York 10022
> +1 212 588-9133 Phone
> +1 212 588-9134 Fax
>
> -----Original Message-----
> From: "Mark Poepping" <poepping at cmu.edu>
> Date: Wed, 31 Jan 2007 23:56:40
> To:"'Carter Bullard'" <carter at qosient.com>
> Cc:"'Argus'" <argus-info at lists.andrew.cmu.edu>
> Subject: RE: [ARGUS] reading argus files using non-sequential access
>
>
> How would byte offset be more valuable than time-offset? Does it end up
> being much faster?
> Mark.
>
> > -----Original Message-----
> > From: argus-info-bounces at lists.andrew.cmu.edu [mailto:argus-info-
> > bounces at lists.andrew.cmu.edu] On Behalf Of Carter Bullard
> > Sent: Wednesday, January 31, 2007 9:36 PM
> > To: Argus
> > Subject: [ARGUS] reading argus files using non-sequential access
> >
> > Gentle people,
> > All ra* programs have the ability to read argus files using starting
> > and ending byte offsets. If you have a list of offsets, this type
> > of feature can make processing large argus files very fast/efficient.
> >
> > The syntax for reading files using offsets has been/is/will be/could be:
> > "-r file::ostart:oend"
> >
> > (or at least that is how I've implemented it in the past)
> > where ostart is the starting offset, and oend is the ending offset.
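
For discussion, a small sketch of how a "file::ostart:oend" argument could be split apart. parse_read_spec is a hypothetical helper, not code from the ra* clients, and treating a plain filename as "no offsets" is an assumption of this sketch.

```python
def parse_read_spec(spec):
    """Split 'file::ostart:oend' into (path, ostart, oend).

    Offsets are returned as None when the spec is a plain filename
    (an assumption of this sketch, not documented ra* behavior).
    """
    path, sep, offsets = spec.partition("::")
    if not sep:
        return path, None, None
    ostart, colon, oend = offsets.partition(":")
    if not colon:
        raise ValueError(f"expected 'file::ostart:oend', got {spec!r}")
    # int() raises ValueError if either offset is not a whole number
    return path, int(ostart), int(oend)

print(parse_read_spec("argus.out::1024:4096"))
print(parse_read_spec("argus.out"))
```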
> >
> > This is not a useful feature if you don't know where the record
> > boundaries are in the file, so I haven't 'exposed' this feature yet, but
> > I think that it is something that we can begin to work with, or at
> > least talk about how we could use it.
> >
> > Is anyone interested in this type of feature and willing to
> > talk about how we could use it?
> >
> > Carter