reading argus files using non-sequential access
Philipp E. Letschert
phil at uni-koblenz.de
Fri Feb 2 05:49:19 EST 2007
I've played around with generating an offset table from a given file.
I think this functionality belongs somewhere in the argus-clients,
probably in racount: generate a list of offsets and export it as a
string for use in external applications.
A cool feature for ra would be head, tail, and ranges by record index.
# perl snippet to generate an offset table
# just for fun, since I guess this takes five times as long as
# an implementation in C. Perl sucks, no future for ArgusEye?
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
my @offsets;
my $argus_file = $ARGV[0] or die("no file given\n");
sub gen_offsets {
    open(my $fh, '<', $argus_file) or die("can't open $argus_file: $!\n");
    binmode($fh);
    my $bytes;
    my $offset = 0;
    # each record starts with a 4-byte header: type, cause, and a
    # 16-bit length field counting 4-byte words
    while (sysseek($fh, $offset, SEEK_SET)
           && sysread($fh, $bytes, 4) == 4) {
        my ($type, $cause, $len) = unpack("H2 H2 H4", $bytes);
        last unless hex($len) > 0;    # a zero length would loop forever
        push(@offsets, $offset);
        $offset += hex($len) * 4;     # advance to the next record
    }
    close($fh);
}
gen_offsets();
print "offsets for " . scalar(@offsets) . " records generated.\n";
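With a table like this, ranged reads by record index become possible via the "-r file::ostart:oend" form Carter describes below. A minimal shell sketch (the offsets.txt file, the record indices, and argus.out are hypothetical, and the ranged-read syntax may not be enabled in every ra build; the command is only printed here, not run):

```shell
# Toy offset table, one byte offset per line (as the Perl script could emit).
printf '%s\n' 0 128 256 384 512 > offsets.txt

FILE=argus.out
START=$(sed -n '2p' offsets.txt)   # offset where record 2 starts (1-based)
END=$(sed -n '4p' offsets.txt)     # offset where record 4 starts
# Print the ranged ra invocation instead of running it:
echo ra -r "$FILE::$START:$END"
```

This would read records 2 and 3 without touching the rest of the file.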
On Thu, Feb 01, 2007 at 03:41:31PM +0100, Philipp Letschert wrote:
> Hey this is great!
>
> this feature would make it possible to handle larger files in ArgusEye.
>
> At the moment it reads all available fields of each transaction into memory and
> builds a complete view. This is stupid, time-consuming, eats up all the memory,
> and allows only a limited file size. To improve that I could try to build a list
> of offsets and do the partial reading with ra only for the rows that actually
> fit into the view, as you suggested some months ago. This seems realizable to
> me, but it would only make scrolling of the rows possible, and inspecting a
> large file by scrolling through millions of rows doesn't seem very inspiring
> to me...
>
> So how could that be used for sorting and filtering? For the display filter I
> can imagine just applying a ra filter expression. That would be a good solution
> anyway, because my current attempt to do filtering with acceptable
> performance in Perl is anything but successful.
> Since the filtering is done while scrolling the view, there would be no
> information available about how many transactions match the filter, but that's
> an acceptable trade-off.
>
> And sorting? I like sorting transactions in the view, as it is helpful for
> finding patterns, and it should be possible for a filtered display as well. But
> with partial reading there is no information about a transaction's position in a
> sorted context. I can imagine reading sort keys from the file when the list of
> offsets is generated, or using rasort and generating a new list of offsets, but
> both seem very time-consuming to me...
>
>
> Thanks for revealing that feature, this will help make a better GUI!
>
> But wait, didn't I promise to help with the documentation? Shame on me, that
> should now really be my next task...
>
>
> Philipp
>
>
> On Wed, Jan 31, 2007 at 09:36:05PM -0500, Carter Bullard wrote:
> > Gentle people,
> > All ra* programs have the ability to read argus files using starting
> > and ending byte offsets. If you have a list of offsets, this type
> > of feature can make processing large argus files very fast/efficient.
> >
> > The syntax for reading files using offsets has been/is/will be/could be:
> > "-r file::ostart:oend"
> >
> > (or at least that is how I've implemented it in the past)
> > where ostart is the starting offset, and oend is the ending offset.
> >
> > This is not a useful feature if you don't know where the record
> > boundaries are in the file, so I haven't 'exposed' this feature yet, but
> > I think that it is something that we can begin to work with, or at
> > least talk about how we could use it.
> >
> > Is anyone interested in this type of feature and would like to
> > talk about how we could use it?
> >
> > Carter
> >
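The sort-key idea mentioned above could also be prototyped entirely outside of ra: collect a sort key per record in the same pass that builds the offset table, sort the index, and then walk the sorted offset list. A rough sketch with toy data (keys.txt and offsets.txt are hypothetical line-aligned files, one record per line):

```shell
printf '%s\n' 10 300 20 > keys.txt      # toy per-record sort keys
printf '%s\n' 0 128 256 > offsets.txt   # matching record byte offsets
# Sort the index by key (numeric, descending) and keep only the offsets:
paste keys.txt offsets.txt | sort -rn | cut -f2 > sorted_offsets.txt
cat sorted_offsets.txt
```

Scrolling a sorted view then means walking sorted_offsets.txt instead of re-sorting the file itself.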
More information about the argus mailing list