"version 4.1" files ? + file size remark
Carter Bullard
carter at qosient.com
Tue Feb 13 23:52:46 EST 2007
Hey Stephane,
I have added a new variable to the argus.conf configuration file;
"ARGUS_GENERATE_TCP_PERF_METRIC", which allows you
to specify whether you want the big TCP buffer (set variable to "yes"),
or a very small one that still provides the flags and status (variable
set to "no"). This will help to keep your records smaller.
I will add a way in rastrip() to convert the large TCP info buffer
to the smaller one, in case you want to retain the flags data but get rid
of the base sequence numbers, etc.
Glad to see that things are better. Do review what is lost when
you strip data, so that you don't throw away something that you
really wanted to keep around, and of course, send mail often.
Carter
On Feb 13, 2007, at 6:39 PM, Stéphane Peters wrote:
> Hello Carter,
>
> nice shot!
> Good news for long-term runs !
>
> Here is another view of my test directory,
> showing the different files after running your command.
> I have also compressed the files and run racount on them.
> You will see that, once stripped, you can store nearly twice as
> many files.
>
> If we compare their size with the reference, the V2.0.6 file
> (75MB), we have:
> V2 plain: 100%
> V3 plain: 133%
> V3 stripped: 65%
> V2 stripped: 66% (with rastrip V2.0.6, equivalent args)
> V2 stripped,gzipped: 22%
> V2 gzipped: 37%
> V3 gzipped: 37%
> V3 stripped,gzipped: 21%
> V3 stripped,gzipped vs V2 gzipped : 57%
> Since "-M -net" doesn't exist in V2.0.6, I have used this command:
> rastrip2 -r argus-2.0.6-data -M -tcp -M -esp -M -rtp -M -icmp -w argus-2.0.6-data-strip
>
> So, if you don't need TCP state etc, stripping is another good choice
> to keep more files online.
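As a quick check of the percentages listed above, here is a minimal Python sketch that recomputes them from the byte counts in the directory listing quoted in this message. The names `SIZES` and `percent_of` are made up for the illustration, and rounding down to a whole percent is an assumption; it happens to reproduce every figure in the list.

```python
# File sizes in bytes, copied from the `ls -l` extract quoted in this message.
SIZES = {
    "V2 plain": 79131992,             # argus-2.0.6-data (the reference)
    "V3 plain": 105565848,            # converted-argus-3.0-data
    "V3 stripped": 52109084,          # converted-argus-3.0-strip
    "V2 stripped": 52701864,          # argus-2.0.6-data-strip
    "V2 stripped,gzipped": 18054642,  # argus-2.0.6-data-strip.gz
    "V2 gzipped": 29385568,           # argus-2.0.6-data.gz
    "V3 gzipped": 29640207,           # converted-argus-3.0-data.gz
    "V3 stripped,gzipped": 16788510,  # converted-argus-3.0-strip.gz
}


def percent_of(size: int, reference: int) -> int:
    """Return size as a whole percentage of reference, rounded down."""
    return 100 * size // reference


REFERENCE = SIZES["V2 plain"]
PERCENTAGES = {name: percent_of(size, REFERENCE) for name, size in SIZES.items()}
# Stripped and gzipped V3 file compared with the gzipped V2 file.
V3_STRIP_GZ_VS_V2_GZ = percent_of(SIZES["V3 stripped,gzipped"], SIZES["V2 gzipped"])

if __name__ == "__main__":
    for name, pct in PERCENTAGES.items():
        print(f"{name}: {pct}%")
    print(f"V3 stripped,gzipped vs V2 gzipped: {V3_STRIP_GZ_VS_V2_GZ}%")
```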
>
> -------------- Extract: ------------
> [argus at argus-fedora t]$ ls -l | sed -e 's/-rw-r--r-- 1 argus argusg/(...)/'
> total 408852
> (...) 79131992 Jan 29 12:54 argus-2.0.6-data
> (...) 52701864 Feb 13 18:17 argus-2.0.6-data-strip
> (...) 18054642 Feb 13 18:18 argus-2.0.6-data-strip.gz
> (...) 235 Feb 9 22:40 argus-2.0.6-data-strip.racount
> (...) 29385568 Jan 29 12:54 argus-2.0.6-data.gz
> (...) 239 Feb 9 22:58 argus-2.0.6-data.gz.racount
> (...) 235 Feb 9 22:47 argus-2.0.6-data.racount
> (...) 105565848 Feb 9 22:48 converted-argus-3.0-data
> (...) 29640207 Feb 9 22:48 converted-argus-3.0-data.gz
> (...) 239 Feb 9 22:48 converted-argus-3.0-data.racount
> (...) 52109084 Feb 13 16:58 converted-argus-3.0-strip
> (...) 16788510 Feb 13 17:08 converted-argus-3.0-strip.gz
> (...) 239 Feb 13 17:11 converted-argus-3.0-strip.racount
> [argus at argus-fedora t]$ (head -1 argus-2.0.6-data.racount ; \
> diff3 argus-2.0.6-data.racount converted-argus-3.0-data.racount converted-argus-3.0-strip.racount) \
> | sed -e 's/  */ /g'
> racount records total_pkts src_pkts dst_pkts total_bytes src_bytes dst_bytes
> ====
> 1:2c
> sum 506501 27657564 13448013 14209551 11291881828 2571549032 8720332796
> 2:2c
> sum 506501 27657564 13448013 14209551 11291881828 2571548170 8720333658
> 3:2c
> sum 506453 27657564 13448013 14209551 11291881828 2571548170 8720333658
>
> [argus at argus-fedora t]$
> -------------- End of extract ------------
>
> The fields total_pkts, src_pkts, dst_pkts and total_bytes did not change
> throughout the operation;
> 862 bytes changed direction between V2 and V3,
> and 48 records were lost or aggregated after stripping.
> That is not significant for me, so it shouldn't be a problem.
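The differences described above can be rechecked from the three `sum` lines in the extract. The following is a minimal Python sketch; `RacountSum`, `parse_sum` and `changed_fields` are hypothetical helper names for this illustration and are not part of the Argus clients.

```python
from typing import NamedTuple


class RacountSum(NamedTuple):
    """One `sum` line of racount output, in the column order of the extract."""
    records: int
    total_pkts: int
    src_pkts: int
    dst_pkts: int
    total_bytes: int
    src_bytes: int
    dst_bytes: int


def parse_sum(line: str) -> RacountSum:
    """Parse a line such as 'sum 506501 27657564 ...' into a RacountSum."""
    label, *values = line.split()
    if label != "sum" or len(values) != len(RacountSum._fields):
        raise ValueError(f"not a racount sum line: {line!r}")
    return RacountSum(*map(int, values))


def changed_fields(a: RacountSum, b: RacountSum) -> dict:
    """Return {field: b - a} for every field whose value differs."""
    return {f: y - x for f, x, y in zip(RacountSum._fields, a, b) if x != y}


v2 = parse_sum("sum 506501 27657564 13448013 14209551 11291881828 2571549032 8720332796")
v3 = parse_sum("sum 506501 27657564 13448013 14209551 11291881828 2571548170 8720333658")
v3_strip = parse_sum("sum 506453 27657564 13448013 14209551 11291881828 2571548170 8720333658")

V2_TO_V3 = changed_fields(v2, v3)          # bytes that switched direction
V3_TO_STRIP = changed_fields(v3, v3_strip)  # records lost or aggregated
print(V2_TO_V3)
print(V3_TO_STRIP)
```

The packet counts and total_bytes are identical in all three lines, so only src_bytes, dst_bytes and the record count appear in the results.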
>
> As for the rest of your mail, I see that you already have ideas for
> the next release !
> You are right, it seems better to start by reducing redundancy in
> the files,
> and dropping as much as possible on each record, without
> compromising performance,
> before the gzip library (for example) could be used for generating
> compressed files on the fly.
>
> I agree with your idea of a stripped-down but expandable TCP data
> struct.
> Perhaps another variable in the argus.conf file? with a line like
> this:
> ARGUS_FILE_INCLUDED_FIELDS="stime proto saddr sport dir daddr
> dport spkts dpkts sbytes dbytes state"
> with 3 basic sets of fields included in the argus file structure:
> - minimal : stime proto saddr sport daddr dport (just to have a
> baseline)
> - standard : ratop displayed fields
> - complete : all, to practice
> But I know that must be too simple...
>
> The srcid field could be transformed to a srcid record, with some
> metrics in it for example.
> That should fit the needs until you start to merge streams from
> different probes,
> where you could revert to the previous data struct that includes
> srcid.
>
>
>
>
> Carter Bullard wrote :
>> Hey Stephane,
>> No, the version should revert to 3.0. The 4.1 is an accidental carryover
>> from gargoyle, which is argus-4.0. I'll change that right now!!!
>>
>> The size change is predominantly a result of additional data being stored
>> for TCP and additional timestamping. This data is probably not
>> interesting to everyone, so we may want to consider another default
>> set of metrics for TCP. Do an experiment with rastrip(), and take out the
>> tcp/net structure:
>>
>> rastrip -r converted-argus-3.0-data -M -net -w converted-argus-3.0-strip
>>
>> And see if you get a significant reduction. By throwing away this data,
>> what you lose is TCP state, which is important; performance data, like
>> round trip times; the window stats, such as window sizes; ack'd bytes;
>> and base sequence numbers. This structure will also hold the OS
>> fingerprinting data, which will be in the next round.
>>
>> I can make a default TCP data struct so that argus reports less
>> information,
>> and have it report TCP performance if you want it to.
>>
>> Also the adaptive compression for data can be improved. We currently
>> report time as 16 bytes: start time secs and usecs as unsigned ints
>> (8 bytes) and the stop time likewise (another 8 bytes). We could do it
>> in 12 bytes for most records, and that would be helpful.
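To make the 16-byte versus 12-byte comparison concrete, here is a minimal Python sketch. The 16-byte layout follows the description above (four unsigned 32-bit ints). The 12-byte layout is purely an assumption for illustration, since the message does not say how the saving would be achieved: it stores the start time plus a 32-bit duration in microseconds, which covers records shorter than about 71.6 minutes. The little-endian byte order and all names are likewise hypothetical, not the Argus record format.

```python
import struct

# Layout as described in the message: four unsigned 32-bit ints (16 bytes).
FULL = struct.Struct("<IIII")     # start secs, start usecs, stop secs, stop usecs
# Hypothetical compact layout (an assumption, not the Argus format):
# start secs, start usecs, then the record duration in microseconds.
COMPACT = struct.Struct("<III")
MAX_DURATION_USEC = 2**32 - 1     # about 71.6 minutes


def pack_times(start_sec, start_usec, stop_sec, stop_usec) -> bytes:
    """Use the 12-byte form when the duration fits in 32 bits, else 16 bytes."""
    duration = (stop_sec - start_sec) * 1_000_000 + (stop_usec - start_usec)
    if 0 <= duration <= MAX_DURATION_USEC:
        return COMPACT.pack(start_sec, start_usec, duration)
    return FULL.pack(start_sec, start_usec, stop_sec, stop_usec)


def unpack_times(buf: bytes):
    """Recover (start_sec, start_usec, stop_sec, stop_usec) from either form."""
    if len(buf) == COMPACT.size:
        start_sec, start_usec, duration = COMPACT.unpack(buf)
        stop_total = start_sec * 1_000_000 + start_usec + duration
        return start_sec, start_usec, stop_total // 1_000_000, stop_total % 1_000_000
    return FULL.unpack(buf)


short = pack_times(1171400000, 500000, 1171400012, 250000)  # about 12 seconds
long_ = pack_times(1171400000, 0, 1171405000, 0)            # 5000 seconds
print(len(short), len(long_))
```

In this sketch the buffer length tells the two forms apart; a real record format would more likely carry a flag in a header for that purpose.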
>>
>> In the next round, argus-3.1, we'll start to put in file context
>> compression,
>> which will remove objects that are repetitive, such as the source
>> id, and
>> we'll be able to reduce timestamps by a huge amount. I'd like to
>> put this
>> stuff in in the next round.
>>
>> What do you think?
>>
>>
>> Carter
>>
>>
>> On Feb 12, 2007, at 2:36 PM, Stéphane Peters wrote:
>>
>>> Hello Carter,
>>>
>>> may I add my 5 cents before the last day...
>>>
>>> Will you keep version numbering "4.1" for files generated by an
>>> argus-3.0 client for the public release?
>>> I was surprised not to find V3 files when working on both types
>>> of files.
>>>
>>> Here are some commands in my test directory:
>>>
>>>> [argus at argus-fedora t]% ra3 -w converted-argus-3.0-data -nr argus-2.0.6-data -
>>>> [argus at argus-fedora t]% file *data
>>>> argus-2.0.6-data: Argus data - version 2.0
>>>> converted-argus-3.0-data: Argus data - version 4.1
>>>> [argus at argus-fedora t]% ls -l
>>>> total 238288
>>>> -rw-r--r-- 1 argus argusg 79131992 jan 29 12:54 argus-2.0.6-data
>>>> -rw-r--r-- 1 argus argusg 29385568 jan 29 12:54 argus-2.0.6-data.gz
>>>> -rw-r--r-- 1 argus argusg 239 fév 9 22:58 argus-2.0.6-data.gz.racount
>>>> -rw-r--r-- 1 argus argusg 235 fév 9 22:47 argus-2.0.6-data.racount
>>>> -rw-r--r-- 1 argus argusg 105565848 fév 9 22:48 converted-argus-3.0-data
>>>> -rw-r--r-- 1 argus argusg 29640207 fév 9 22:48 converted-argus-3.0-data.gz
>>>> -rw-r--r-- 1 argus argusg 239 fév 9 22:48 converted-argus-3.0-data.racount
>>>> [argus at argus-fedora t]%
>>>
>>> I was also surprised to see the growth of my files (33%).
>>> But after compression, the difference in size is barely
>>> noticeable (<1%).
>>>
>>>
>>> Regards,
>>>
>>> --
>>> Stephane.Peters at forem.be
>>>
>>>
>>
>>
>
> Regards,
> --
> Stephane.Peters at forem.be