rabins issue, maybe order related
Mark E. Mallett
mem at mv.mv.com
Wed Jun 20 12:57:38 EDT 2012
Hi,
I'm having an odd issue with rabins. I'm trying to process
multiple files and directories with a mix of "-r" and "-R"
options. If the response at this point is "don't do that,"
then that can be that and you can stop reading :)
However, other ra* tools seem happy with the mix.
I'm not sure I can construct a test with your sample data, so I'd just
like to present the alleged problem in narrative form first. I can
probably provide data if necessary.
Say I have an argus file at /usr/local/argus/summary/20120601 .
As you might imagine, it contains records generated on June 1, 2012.
Say I have a directory at /usr/local/argus/archive/2012/06/20 . There
are argus files in this directory (or even just one file) that
contain records generated on June 20, 2012.
Step 1.
I run rabins to look at simple daily data from those files. The actual
command line is more elaborate but I reduced it to this and the issue
is still present, so:
$ rabins -L-1 -u -m proto saddr -M time 1d \
-r /usr/local/argus/summary/20120601 \
-R /usr/local/argus/archive/2012/06/20 \
-s stime saddr bytes:12 trans:18 - ip |\
dtm-c1
where 'dtm-c1' is a simple script to convert the epoch time in the
first column to a date format that I am interested in.
This invocation only shows me records from June 20, i.e. from the -R
directory. There's nothing from June 1 (the -r file).
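For readers without the 'dtm-c1' script mentioned above, a filter of that kind might look roughly like the following. This is only a sketch and not the actual script: the function name epoch_to_date and the %Y-%m-%d output format are hypothetical, and it assumes GNU date (for the -d "@epoch" form) and whitespace-separated columns with the epoch time first.

```shell
# Hypothetical stand-in for 'dtm-c1': rewrite the epoch time in the
# first column as a UTC calendar date and leave the other columns alone.
# Assumes GNU date; the output format is an arbitrary choice.
epoch_to_date() {
    while read -r epoch rest; do
        case $epoch in
            ''|*[!0-9.]*)
                # First field is not a plain epoch value: pass the line through.
                printf '%s %s\n' "$epoch" "$rest" ;;
            *)
                # Drop any fractional seconds before handing the value to date.
                printf '%s %s\n' "$(date -u -d "@${epoch%%.*}" +%Y-%m-%d)" "$rest" ;;
        esac
    done
}
```

It would sit at the end of the pipeline in place of 'dtm-c1', reading the rabins output on standard input.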
Step 2.
I somehow discovered that if I make a symbolic link:
ln -s /usr/local/argus/summary/20120601 /tmp/20120601
and rerun it:
$ rabins -L-1 -u -m proto saddr -M time 1d \
-r /tmp/20120601 \
-R /usr/local/argus/archive/2012/06/20 \
-s stime saddr bytes:12 trans:18 - ip |\
dtm-c1
I get all the records. Likewise if I copy the file instead of link it.
The same is true if, instead of using -R for the June 20 directory, I
use one or more -r options for individual files in that directory. In
all of those cases I get all the records, as I hope to.
Step 3. Observation.
I ran strace against the various invocations and compared what was
happening. The only significant thing that I can see is that the order
in which the files are opened changes. I note that /usr/local/argus/archive/
sorts alphabetically before /usr/local/argus/summary/ and that
/tmp/20120601 sorts alphabetically before /usr/local/anything. I'm
guessing that there's some sorting going on: when I move the -r file
(or a link to it) to /tmp, it gets opened before the -R directory is,
whereas in the original command the two sort, and are opened, in the
opposite order.
The -r file is always being opened and read; it just seems to matter what
order it's opened in.
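One way to make that kind of comparison repeatable is to reduce each strace log to the ordered list of paths that were opened and then diff the lists. The helper below is a sketch under assumptions: the name opened_paths and the trace file locations are hypothetical, and the log is assumed to have been written with strace -o and without -f, so that lines carry no PID prefix.

```shell
# Hypothetical helper: print, in order, the paths passed to open()/openat()
# in an strace log. A log of this shape could be produced with, e.g.:
#   strace -o /tmp/run1.trace -e trace=file rabins ...
opened_paths() {
    # Keep only open/openat lines, then extract the first quoted string,
    # which is the path argument.
    grep -E '^(open|openat)\(' "$1" |
        sed -E 's/^[a-z]+\([^"]*"([^"]*)".*/\1/'
}
```

Running it on the log from each invocation and diffing the two outputs shows where the open order differs.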
I hope that this is reasonably clear and that it might even make sense.
I can probably just use '-r' everywhere (expand the -R directories) but
it makes for much longer command lines, and there are other boring reasons
that I want to use -R as well. But this does make me wonder if I need
to open files in date order (or in bin order, at least). I scoured the
rabins man page to see if I could find a mention of that but didn't see
one. I note that '-r' is documented as opening files in the order
specified, but I don't see any mention of order of multiple -R
directories or for a mix of -r and -R. The command line I build actually
does specify all files and directories in date order.
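The "use -r everywhere" workaround could be scripted rather than typed out. The sketch below expands a directory into one -r option per file, in sorted path order. The function name r_args_for_dir is hypothetical, it assumes no newlines in file names, and it only mirrors the workaround described above; whether rabins requires any particular file order is exactly the open question.

```shell
# Hypothetical helper: emit "-r" and each regular file under a directory
# on alternating lines, with the files in sorted (C locale) path order.
r_args_for_dir() {
    find "$1" -type f | LC_ALL=C sort | while IFS= read -r f; do
        printf '%s\n%s\n' -r "$f"
    done
}

# Usage sketch: load the pairs into the positional parameters, splitting
# on newlines only and with globbing off, then pass "$@" to rabins.
#   old_ifs=$IFS; IFS='
#   '; set -f
#   set -- $(r_args_for_dir /usr/local/argus/archive/2012/06/20)
#   set +f; IFS=$old_ifs
#   rabins -L-1 -u -m proto saddr -M time 1d \
#       -r /usr/local/argus/summary/20120601 "$@" \
#       -s stime saddr bytes:12 trans:18 - ip
```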
Yours,
-mm-