argus aggregation and CIDR addresses

Tue Aug 3 13:58:29 EDT 2010

Gentle people,
I've added "CIDR address format" printing for IPv4 addresses and will have it for IPv6
addresses later in the week.  I would like to know if anyone has an opinion as to whether
it should be the default printing mode for ra* programs.  

Currently we do not print aggregated IP addresses using CIDR formats.  While the CIDR
mask length has been available in aggregated argus data, it was not uniformly preserved
in all operations.  That has been resolved, and for data aggregated using argus-clients-3.0.4,
the address mask length should be considered to be reliable. 

In order to maintain legacy behavior, there are three modes for printing CIDR addresses,
and they are configured using the RA_CIDR_ADDRESS_FORMAT  variable in the ~/.rarc file. 

   1) Printing disabled, "no", where we will not report the mask. (legacy mode)
   2) Printing enabled, "yes", where we will print "/masklen" only when the mask is < full address bits.
   3) Printing enabled, "strict", where we will always print the "/masklen".

The idea behind "yes" and "strict" is that unless you aggregate the data, all IP addresses
in the flow data have full address bit CIDR masklens, so there is no need to print the "/32" or "/128".
On the other hand, some people don't like variability in their output formats, so forcing the "/%d"
to be at the end of every IP address could be a desired feature for some.
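
To make the three modes concrete, here is a minimal Python sketch of the printing rule as described above. It is not the ra* implementation; the function name format_cidr and its arguments are hypothetical, and only the mode names "no", "yes" and "strict" come from this message.

```python
import ipaddress

def format_cidr(addr: str, masklen: int, mode: str = "no") -> str:
    """Render an address per the three modes described above (hypothetical helper)."""
    ip = ipaddress.ip_address(addr)
    full = ip.max_prefixlen  # 32 for IPv4, 128 for IPv6
    if mode == "no":
        # Legacy behavior: never report the mask.
        return str(ip)
    if mode == "yes":
        # Print "/masklen" only when the mask is shorter than the full address.
        return f"{ip}/{masklen}" if masklen < full else str(ip)
    if mode == "strict":
        # Always print "/masklen", even for /32 and /128.
        return f"{ip}/{masklen}"
    raise ValueError(f"unknown mode: {mode!r}")

if __name__ == "__main__":
    for mode in ("no", "yes", "strict"):
        print(mode, format_cidr("10.1.0.0", 16, mode), format_cidr("10.1.2.3", 32, mode))
```

With "yes", an unaggregated address prints exactly as it does in legacy mode, so only aggregated output changes.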

I will leave the default to "no" unless we can come to consensus that "yes" is appropriate.

This is important, as we start to work on IP address indexing, as we have for time indexing.
This effort will be very interesting, but before we start, being able to print the CIDR masklen is
going to be really important.

Hope all is most excellent, and if you do have an opinion, don't hold it back.

Carter

Here is a simple description of how we could do IP address indexing.  I'm going down this path:

One simple strategy for IP address indexing is to have a MySQL table, for each day, that has
entries for the occurrences of all the /16 CIDR address aggregates.  This fixed address strategy
has some advantages; primarily, it limits the database to a maximum of 64K entries per index
(there are only 2^16 possible /16 networks), which is a good thing.   Using our existing database
tool, rasqlinsert(), we can formulate the address aggregates, and poke the aggregate argus records
into a single table, and get some really good information.

   rasqlinsert -M rmon -m srcid smac saddr/16 -s stime dur srcid smac saddr -M cache time 1d -w mysql://user@host/db/ipIndex_%Y_%m_%d 

At any time, we can search the table for the occurrence of the /16 network for an address, and if it's
a relatively unique address, we'll be able to find it very quickly:

   rasql -M time 1d -r mysql://user@host/db/ipIndex_%Y_%m_%d -t -30d -M sql="saddr='network in question/16'"
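
As an illustration of the fixed /16 strategy, the following Python sketch models the per-day index in memory rather than in MySQL. The helpers net16, build_daily_index and days_with are hypothetical names, and the (day, saddr, stime) input tuples are an assumed simplification of the argus records; none of this is argus-clients code.

```python
import ipaddress
from collections import defaultdict

def net16(addr: str) -> str:
    """Return the /16 aggregate containing an IPv4 address, e.g. '10.1.0.0/16'."""
    return str(ipaddress.ip_network(f"{addr}/16", strict=False))

def build_daily_index(flows):
    """flows: iterable of (day, saddr, stime).

    Returns {day: {net16: [first_seen, last_seen]}}. Each day's table can hold
    at most 2**16 = 65536 keys, which is the 64K bound mentioned above.
    """
    index = defaultdict(dict)
    for day, saddr, stime in flows:
        key = net16(saddr)
        entry = index[day].get(key)
        if entry is None:
            index[day][key] = [stime, stime]
        else:
            entry[0] = min(entry[0], stime)
            entry[1] = max(entry[1], stime)
    return index

def days_with(index, addr):
    """List the days whose table contains the /16 network of addr."""
    key = net16(addr)
    return sorted(day for day, nets in index.items() if key in nets)

if __name__ == "__main__":
    flows = [
        ("2010_08_01", "192.168.5.9", 100),
        ("2010_08_01", "192.168.77.1", 250),
        ("2010_08_02", "10.4.4.4", 50),
    ]
    idx = build_daily_index(flows)
    print(days_with(idx, "192.168.200.200"))
```

The lookup only says which days (and what time range) are worth searching; the actual flow records would still be read from the primary data for those ranges.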

A more elegant solution would allow us to have different CIDR mask lengths, depending on how
many addresses are represented by the aggregate, and the duration of the aggregate.  If using
a "/8" entry keeps the range of the aggregate to, say, 30 seconds in a day, then that is a good index
representation.  If the "/8" covers the whole day, but a "/9" generates two ranges, a short one in the
morning and a short one in the evening, then using "/9" for the index would be the right thing to do.
We'll be developing IP address indexing strategies that will try to minimize the number of entries,
but also minimize the time range covered by the entries.  
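
One possible way to sketch the variable mask length idea in Python is a greedy split: start from a short prefix and halve a network only while its entry would cover too long a time range. This is an assumption about how such a strategy could work, not the planned argus design; the function adaptive_index and the parameters max_span, start_len and max_len are hypothetical.

```python
import ipaddress

def adaptive_index(observations, max_span, start_len=8, max_len=16):
    """observations: list of (ipv4_string, timestamp_seconds).

    Returns a list of (network, first_seen, last_seen). A network is split
    into its two halves while its time range exceeds max_span, up to max_len.
    """
    points = [(int(ipaddress.IPv4Address(a)), t) for a, t in observations]
    entries = []

    def visit(net, members):
        if not members:
            return  # no traffic in this half, so no index entry
        first = min(t for _, t in members)
        last = max(t for _, t in members)
        if last - first <= max_span or net.prefixlen >= max_len:
            entries.append((str(net), first, last))
            return
        for half in net.subnets(prefixlen_diff=1):
            lo, hi = int(half.network_address), int(half.broadcast_address)
            visit(half, [(a, t) for a, t in members if lo <= a <= hi])

    # Group the observations under their start_len networks (e.g. each /8).
    shift = 32 - start_len
    roots = {}
    for a, t in points:
        net = ipaddress.IPv4Network(((a >> shift) << shift, start_len))
        roots.setdefault(net, []).append((a, t))
    for net in sorted(roots):
        visit(net, roots[net])
    return entries

if __name__ == "__main__":
    obs = [
        ("10.1.1.1", 32400), ("10.2.2.2", 32410),      # around 09:00
        ("10.200.1.1", 72000), ("10.201.3.3", 72020),  # around 20:00
    ]
    for entry in adaptive_index(obs, max_span=60):
        print(entry)
```

For the sample data, the "/8" would span most of the day, so the sketch emits two "/9" entries instead, one for the morning and one for the evening, which mirrors the example in the paragraph above.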
