Fwd: updates for argus-2.x compatbility and database support

Mark Bartlett mabartle at gmail.com
Thu Feb 26 11:03:26 EST 2009


Didn't copy the distro...


---------- Forwarded message ----------
From: Mark Bartlett <mabartle at gmail.com>
Date: Thu, Feb 26, 2009 at 11:01 AM
Subject: Re: [ARGUS] updates for argus-2.x compatbility and database support
To: Carter Bullard <carter at qosient.com>


Glad to hear that I'm not completely off :^P

We hold 'all' argus files on the LOADER, so we were thinking about
'rerunning' all files through the ra tools to 'combine' the streams
sometime during the day and update the db, etc...

As for 'building/dropping/adding' Partitions in mysql, it can be done
on the fly like so:

ALTER TABLE t1 ADD PARTITION (PARTITION p3 VALUES LESS THAN (2002));

You can also drop or archive a Partition, which is nice...

Another benefit of using Partitions is that you can set the 'file
location' of the partition... So with my Daily Part. Schema above I
could have today on one Hard Drive and yesterdays partition on
another, to save my read/write time on the disk when performing
queries, etc... That was the idea behind our Daily and Hourly
Databases, I would have the Daily DB running on a separate server than
the Hourly.. Then if we were seeing a HUGE data increase into the db
we would start using the separating partitions to their own hard drive
strategy...  But we're not there yet... We have 2 'probes' collecting
and see 2 million events a day (130K/HR)...

Cheers, and thanks for the helpful info..

mark

On Thu, Feb 26, 2009 at 10:37 AM, Carter Bullard <carter at qosient.com> wrote:
> Hey Mark,
> Best to keep these discussions on the mailing list.
>
> I think there are a number of different strategies that will need to be
> thought about, and using the database to its fullest is really the key.
> Do we use mysql partitioning?  I say YES!!!!! especially to support
> federated distributed database tables.  I think that is very important.
> How we do it, though, will take some discussion.
>
> So putting the data into separate tables is A form of partitioning
> (i'm starting to read about db partitioning, and it may take a little
> time for me to come to speed).  This mysql specific partitioning
> strategy, seems to have its own limits in terms of configuration.
> How do I extend the database partition description for another
> month, as time is not going to wait for me ;o)   I'm not sure that I
> can do that on the fly?  You seem to have monthly tables, that are
> partitioned by day?
>
> The best question on your archive and collection strategy is,
> "does it work?" and  if so, then you're doing great!!!!   The tool
> specifically
> designed to generate your compressed periodic files is rastream().
> I would hold off on shifting to that, though until this new release of
> client software, so you're doing fine.
>
> You are not, however aggregating any records before you put
> them in the database, so you are storing the "primitive" data into
> mysql.  That is fine, but there is other processing you can do to
> make the data a bit easier to chew, so to speak.  BUT you need
> to keep the primitive data around somewhere for a while, as that
> is eventually what is needed to investigate the really bad cases.
>
> rasqlinsert(), only creates the database or table if they are
> needed.  If the table already exists, regardless of how it was
> generated, rasqlinsert() just pokes data into it.
>
>
> Carter
>
> On Feb 26, 2009, at 8:44 AM, Mark Bartlett wrote:
>
>> Looks good.. Only problem I would have is this:
>>
>> You create a different table for 'everyday'...  I cannot 'query' the
>> database without hitting every single argus.%Y.%m.%d table
>> individually..  If we us the "Partitioning option then it takes care
>> of the 'date' stuff for us..  So I can write a query that says
>> something like, select saddr, daddr, sum(bytes) from argus_tbl where
>> (saddr = 'a source IP') group by saddr, daddr; and it would return
>> that data, under your 'schema' I would have to write it like this:
>> select saddr, daddr, sum(bytes) from argus_tbl.(DATE) where (saddr =
>> 'a source IP') group by saddr, daddr; And I would have to repeat that
>> query for 'every' date table that is created... (argus_tbl.2009-02-24,
>> argus_tbl.2009-02-23)...  Which would make it difficult for me to
>> query over the last week of data to 'see' what a specific SADDR did.
>> (who they 'talk to', what protocols 'they' use, etc)..  Essentially I
>> would like to be able to 'baseline' my network using the Argus data,
>> and have my Analysts be able to 'see' patterns from graphs using the
>> argus data..
>>
>> Thoughts??
>>
>> There is def. nothing wrong with your rasqlinsert tool, I could just
>> write my database 'schema' using a partition and my problem is solved.
>> On that note, is there any 'flag' that can be set that will NOT create
>> the table, just insert?
>>
>> [snip]
>>
>>
>> On Wed, Feb 25, 2009 at 9:33 PM, Carter Bullard <carter at qosient.com>
>> wrote:
>>>
>>> Hey Mark,
>>> There is no fixed schema, you generate whatever schema you wish.
>>> So to do something like your tables, you would use something like:
>>>   rasqlinsert -r file -w mysql://localhost/db/argus -s srcid stime dur \
>>>        saddr daddr proto ......
>>> because there is no key, rasqlinsert will just append to the table.
>>> If you want an auto incrementing identifier for the row, you can
>>> add "autoid" to the -s list and rasqlinsert will create an autoid column.
>>> I just ran this here:
>>>    ../bin/rasqlinsert -S localhost -w
>>> mysql://root@localhost/ratop/etherHosts \
>>>                 -M rmon nodrop cache -m srcid smac \
>>>                 -s stime dur srcid smac spkts dpkts sbytes dbytes state
>>> and it created this table in the ratop database (which it also created):
>>> mysql> desc etherHosts;
>>> +--------+-----------------------+------+-----+---------+-------+
>>> | Field  | Type                  | Null | Key | Default | Extra |
>>> +--------+-----------------------+------+-----+---------+-------+
>>> | stime  | double(18,6) unsigned | NO   |     | NULL    |       |
>>> | dur    | double(18,6)          | NO   |     | NULL    |       |
>>> | srcid  | varchar(64)           | NO   | PRI |         |       |
>>> | smac   | varchar(24)           | NO   | PRI |         |       |
>>> | spkts  | bigint(20)            | YES  |     | NULL    |       |
>>> | dpkts  | bigint(20)            | YES  |     | NULL    |       |
>>> | sbytes | bigint(20)            | YES  |     | NULL    |       |
>>> | dbytes | bigint(20)            | YES  |     | NULL    |       |
>>> | state  | varchar(32)           | YES  |     | NULL    |       |
>>> | record | blob                  | YES  |     | NULL    |       |
>>> +--------+-----------------------+------+-----+---------+-------+
>>> 10 rows in set (0.00 sec)
>>> the record is the binary argus record that holds the actual data.
>>>
>>> Not sure about your partitioning,  so I'm thinking something like
>>> this could work:
>>>   rasqlinsert -r file -w mysql://localhost/db/argus.%Y.%m.%d
>>> which would automatically create tablenames  using the timestamp
>>> in the argus data.  As the data comes in, rasqlinsert would create
>>> the table needed to hold the data, based on the strftime() format
>>> you provide (just like rasplit).
>>> Does that buy us anything>?
>>> Carter
>>>
>>>
>
>
>
>
>



More information about the argus mailing list