"version 4.1" files ? + file size remark

Stéphane Peters stephane.peters at forem.be
Tue Feb 13 18:39:55 EST 2007


Hello Carter,

Nice shot!
Good news for long-term runs!

Here is another view of my test directory,
showing the different files after running your command.
I have also compressed the files and run racount on them.
You will see that, once stripped, you could store nearly twice as many
files.

If we compare their sizes with the reference, the V2.0.6 file (75 MB), we
have:

    V2 plain: 100%
    V3 plain: 133%
    V3 stripped: 65%
    V2 stripped: 66% (with rastrip V2.0.6, equivalent args)
    V2 stripped,gzipped: 22%
    V2 gzipped: 37%
    V3 gzipped: 37%
    V3 stripped,gzipped: 21%
    V3 stripped,gzipped vs V2 gzipped : 57%
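
The percentages above can be re-derived from the byte counts in the
"ls -l" extract further down. What follows is only a minimal sketch in
Python (not a tool used in this thread); the names SIZES and percent_of
are illustrative, and it assumes the percentages were truncated rather
than rounded, which is what matches the figures listed.

```python
# File sizes in bytes, copied from the `ls -l` extract in this message.
SIZES = {
    "V2 plain": 79131992,
    "V3 plain": 105565848,
    "V3 stripped": 52109084,
    "V2 stripped": 52701864,
    "V2 stripped,gzipped": 18054642,
    "V2 gzipped": 29385568,
    "V3 gzipped": 29640207,
    "V3 stripped,gzipped": 16788510,
}


def percent_of(size, reference):
    """Return size as a whole percentage of reference, truncated (not rounded)."""
    return size * 100 // reference


reference = SIZES["V2 plain"]
for name, size in SIZES.items():
    print(f"{name}: {percent_of(size, reference)}%")

# The last entry of the list uses the gzipped V2 file as its reference instead.
ratio = percent_of(SIZES["V3 stripped,gzipped"], SIZES["V2 gzipped"])
print(f"V3 stripped,gzipped vs V2 gzipped: {ratio}%")
```

With rounding instead of truncation, the two stripped entries would come
out one point higher (66% and 67%).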

Since "-M -net" doesn't exist in V2.0.6, I have used this command:
     rastrip2 -r argus-2.0.6-data -M -tcp -M -esp -M -rtp -M -icmp -w argus-2.0.6-data-strip

So, if you don't need TCP state etc., stripping is another good way
to keep more files online.

-------------- Extract: ------------
[argus at argus-fedora t]$ ls -l | sed -e 's/-rw-r--r--  1 argus argusg/(...)/'
total 408852
(...)  79131992 Jan 29 12:54 argus-2.0.6-data
(...)  52701864 Feb 13 18:17 argus-2.0.6-data-strip
(...)  18054642 Feb 13 18:18 argus-2.0.6-data-strip.gz
(...)       235 Feb  9 22:40 argus-2.0.6-data-strip.racount
(...)  29385568 Jan 29 12:54 argus-2.0.6-data.gz
(...)       239 Feb  9 22:58 argus-2.0.6-data.gz.racount
(...)       235 Feb  9 22:47 argus-2.0.6-data.racount
(...) 105565848 Feb  9 22:48 converted-argus-3.0-data
(...)  29640207 Feb  9 22:48 converted-argus-3.0-data.gz
(...)       239 Feb  9 22:48 converted-argus-3.0-data.racount
(...)  52109084 Feb 13 16:58 converted-argus-3.0-strip
(...)  16788510 Feb 13 17:08 converted-argus-3.0-strip.gz
(...)       239 Feb 13 17:11 converted-argus-3.0-strip.racount
[argus at argus-fedora t]$  (head -1 argus-2.0.6-data.racount ; \
diff3 argus-2.0.6-data.racount converted-argus-3.0-data.racount \
converted-argus-3.0-strip.racount)\
| sed -e 's/  */ /g'
racount records total_pkts src_pkts dst_pkts total_bytes src_bytes dst_bytes
====
1:2c
 sum 506501 27657564 13448013 14209551 11291881828 2571549032 8720332796
2:2c
 sum 506501 27657564 13448013 14209551 11291881828 2571548170 8720333658
3:2c
 sum 506453 27657564 13448013 14209551 11291881828 2571548170 8720333658

[argus at argus-fedora t]$
-------------- End of extract ------------

The fields total_pkts, src_pkts, dst_pkts and total_bytes did not change
throughout the operation;
862 bytes changed direction (from src_bytes to dst_bytes) between V2 and V3,
and 48 records were lost or aggregated after stripping.
That is not significant for me, so it shouldn't be a problem.
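
The differences quoted above (862 bytes, 48 records) can be checked
directly from the three "sum" lines of the diff3 output. This is a small
sketch in Python, not part of the original test; the helper names parse
and changed are illustrative.

```python
# Column names from the racount header line in the extract.
FIELDS = ("records total_pkts src_pkts dst_pkts "
          "total_bytes src_bytes dst_bytes").split()

# "sum" lines copied from the diff3 output in the extract.
SUMS = {
    "V2": "506501 27657564 13448013 14209551 11291881828 2571549032 8720332796",
    "V3": "506501 27657564 13448013 14209551 11291881828 2571548170 8720333658",
    "V3 stripped": "506453 27657564 13448013 14209551 11291881828 2571548170 8720333658",
}


def parse(line):
    """Map each racount column name to its integer value."""
    return dict(zip(FIELDS, map(int, line.split())))


def changed(before, after):
    """Return {field: after - before} for the fields whose values differ."""
    return {f: after[f] - before[f] for f in FIELDS if before[f] != after[f]}


v2, v3, v3_stripped = (parse(SUMS[k]) for k in ("V2", "V3", "V3 stripped"))
print("V2 -> V3:", changed(v2, v3))
print("V3 -> V3 stripped:", changed(v3, v3_stripped))
```

Only src_bytes and dst_bytes differ between V2 and V3 (by -862 and +862),
and only the record count differs after stripping (by -48).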

As for the rest of your mail, I see that you already have ideas for the
next release!
You are right: it seems better to start by reducing redundancy in the files
and dropping as much as possible from each record, without compromising
performance,
before using the gzip library (for example) to generate
compressed files on the fly.

I agree with your idea of a stripped-down but expandable TCP data struct.
Perhaps another variable in the argus.conf file, with a line like this:
    ARGUS_FILE_INCLUDED_FIELDS="stime proto saddr sport dir daddr dport spkts dpkts sbytes dbytes state"
with 3 basic sets of fields included in the argus file structure:
- minimal : stime proto saddr sport daddr dport (just to have a baseline)
- standard : ratop displayed fields
- complete : all, to practice
But I know that must be too simple...
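
To make the suggestion concrete, here is a rough sketch in Python of how
such a line could be read and applied to a record. Everything in it is
hypothetical: ARGUS_FILE_INCLUDED_FIELDS is only the variable proposed
above, not an existing argus.conf option, and parse_included_fields,
project, FIELD_SETS and the sample record (including its "tcprtt" field)
are made up for illustration. Only the "minimal" set is spelled out,
since it is the only one listed field by field.

```python
# Hypothetical named field sets; "minimal" is the baseline set listed above.
FIELD_SETS = {
    "minimal": "stime proto saddr sport daddr dport".split(),
}


def parse_included_fields(conf_text):
    """Return the field list from the proposed config line, or None if absent."""
    for line in conf_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "ARGUS_FILE_INCLUDED_FIELDS":
            return value.strip().strip('"').split()
    return None


def project(record, fields):
    """Keep only the requested fields of a record, in the requested order."""
    return {name: record[name] for name in fields if name in record}


conf = ('ARGUS_FILE_INCLUDED_FIELDS="stime proto saddr sport dir daddr dport '
        'spkts dpkts sbytes dbytes state"')
fields = parse_included_fields(conf)

# Made-up record values, using documentation-range addresses.
record = {
    "stime": "18:39:55", "proto": "tcp",
    "saddr": "192.0.2.1", "sport": 40000, "dir": "->",
    "daddr": "192.0.2.2", "dport": 80,
    "spkts": 10, "dpkts": 12, "sbytes": 900, "dbytes": 15000,
    "state": "FIN", "srcid": "probe-a", "tcprtt": 0.01,
}

print(project(record, fields))
print(project(record, FIELD_SETS["minimal"]))
```

A real implementation would have to work on the binary record
structures rather than on dictionaries; the sketch only shows the
selection logic.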

The srcid field could be turned into a separate srcid record, with some
metrics in it for example.
That should fit the needs until you start merging streams from
different probes,
at which point you could revert to the previous data struct that includes
srcid.
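
As an illustration of the srcid idea, the sketch below (Python, purely
hypothetical, not how argus stores data) writes the source id once as
its own record and omits it from the flow records that follow, then
restores it when reading. The names encode and decode, the ("srcid", ...)
and ("flow", ...) tags and the sample records are assumptions made for
the example.

```python
def encode(records):
    """Emit ("srcid", id) whenever the source changes, then each record
    as ("flow", record_without_srcid)."""
    current = None
    for rec in records:
        if rec["srcid"] != current:
            current = rec["srcid"]
            yield ("srcid", current)
        yield ("flow", {k: v for k, v in rec.items() if k != "srcid"})


def decode(stream):
    """Rebuild full records by re-attaching the most recent srcid."""
    current = None
    for kind, payload in stream:
        if kind == "srcid":
            current = payload
        else:
            yield {"srcid": current, **payload}


# Made-up records from two probes.
records = [
    {"srcid": "probe-a", "saddr": "192.0.2.1", "spkts": 3},
    {"srcid": "probe-a", "saddr": "192.0.2.7", "spkts": 5},
    {"srcid": "probe-b", "saddr": "192.0.2.9", "spkts": 1},
]

stream = list(encode(records))
srcid_records = [payload for kind, payload in stream if kind == "srcid"]
print(len(stream), "records written,", len(srcid_records), "of them srcid records")
restored = list(decode(stream))
```

With a single probe the srcid is written once per file. When records
from several probes are interleaved, a srcid record is needed at every
switch, which is where keeping srcid inside each record becomes the
simpler choice again.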




Carter Bullard wrote :

> Hey Stephane,
> No the version should revert to 3.0.  The 4.1 is an accidental carryover
> from gargoyle, which is argus-4.0.  I'll change that right now!!!
>
> The size change is predominately a result of additional data being stored
> for TCP and additional timestamping.  This data is probably not
> interesting for everyone, so we may want to consider another default
> set of metrics for TCP.  Do an experiment with rastrip(), and take out the
> tcp/net structure:
>
>    rastrip -r converted-argus-3.0-data -M -net -w converted-argus-3.0-strip
>
> And see if you get significant reduction.  By throwing away this data, 
> what you lose is TCP state, which is important, performance data, like
> round trip times, the window stats, such as window sizes, and ack'd bytes,
> base sequence numbers.  This structure will have the OS fingerprinting
> data, which will be in the next round.
>
> I can make a default TCP data struct so that argus reports less 
> information,
> and have it report TCP performance if you want it to.
>
> Also the adaptive compression for data can be improved.  We currently
> report time as 16 bytes: start time secs and usecs as unsigned ints
> (8 bytes) and the stop times (another 8 bytes).  We could do it in 12
> bytes for most records, and that would be helpful.
>
> In the next round, argus-3.1, we'll start to put in file context 
> compression,
> which will remove objects that are repetitive, such as the source id, and
> we'll be able to reduce timestamps by a huge amount.  I'd like to put this
> stuff in in the next round.
>
> What do you think?
>
>
> Carter
>
>
> On Feb 12, 2007, at 2:36 PM, Stéphane Peters wrote:
>
>> Hello Carter,
>>
>> may I add my 5 cents before the last day...
>>
>> Will you keep version numbering "4.1" for files generated by an
>> argus-3.0 client for the public release?
>> I was surprised not to find V3 files when working on both types of files.
>>
>> Here are some commands in my test directory:
>>
>>> [argus at argus-fedora t]% ra3 -w converted-argus-3.0-data -nr argus-2.0.6-data -
>>> [argus at argus-fedora t]% file *data
>>> argus-2.0.6-data:         Argus data - version 2.0
>>> converted-argus-3.0-data: Argus data - version 4.1
>>> [argus at argus-fedora t]% ls -l
>>> total 238288
>>> -rw-r--r--  1 argus argusg  79131992 jan 29 12:54 argus-2.0.6-data
>>> -rw-r--r--  1 argus argusg  29385568 jan 29 12:54 argus-2.0.6-data.gz
>>> -rw-r--r--  1 argus argusg       239 fév  9 22:58 argus-2.0.6-data.gz.racount
>>> -rw-r--r--  1 argus argusg       235 fév  9 22:47 argus-2.0.6-data.racount
>>> -rw-r--r--  1 argus argusg 105565848 fév  9 22:48 converted-argus-3.0-data
>>> -rw-r--r--  1 argus argusg  29640207 fév  9 22:48 converted-argus-3.0-data.gz
>>> -rw-r--r--  1 argus argusg       239 fév  9 22:48 converted-argus-3.0-data.racount
>>> [argus at argus-fedora t]%
>>
>>
>> I was also surprised to see the growth of my files (33%).
>> But after compression, the difference in size is barely noticeable
>> (<1%).
>>
>>
>> Regards,
>>
>> -- 
>> Stephane.Peters at forem.be
>>
>>
>
> Carter Bullard
> CEO/President
> QoSient, LLC
> 150 E. 57th Street Suite 12D
> New York, New York 10022
>
> +1 212 588-9133 Phone
> +1 212 588-9134 Fax
>
>

Regards,

-- 
Stephane.Peters at forem.be
