Fwd: updates for argus-2.x compatbility and database support

Thu Feb 26 11:27:22 EST 2009

Hey Mark,
Well, it looks pretty simple to add a partition directive at table  
creation,
but since its easy to do it on the fly, we may not even need to do
it in rasqlinsert() ;o)

If we come up with a strategy, I can add any kind of string to the table
creation command, so....., its just a matter of finding out what will be
a good thing to do.

Carter

On Feb 26, 2009, at 11:03 AM, Mark Bartlett wrote:

> Didn't copy the distro...
>
>
> ---------- Forwarded message ----------
> From: Mark Bartlett <mabartle at gmail.com>
> Date: Thu, Feb 26, 2009 at 11:01 AM
> Subject: Re: [ARGUS] updates for argus-2.x compatbility and database  
> support
> To: Carter Bullard <carter at qosient.com>
>
>
> Glad to hear that I'm not completely off :^P
>
> We hold 'all' argus files on the LOADER, so we were thinking about
> 'rerunning' all files through the ra tools to 'combine' the streams
> sometime during the day and update the db, etc...
>
> As for 'building/dropping/adding' Partitions in mysql, it can be done
> on the fly like so:
>
> ALTER TABLE t1 ADD PARTITION (PARTITION p3 VALUES LESS THAN (2002));
>
> You can also drop or archive a Partition, which is nice...
>
> Another benefit of using Partitions is that you can set the 'file
> location' of the partition... So with my Daily Part. Schema above I
> could have today on one Hard Drive and yesterdays partition on
> another, to save my read/write time on the disk when performing
> queries, etc... That was the idea behind our Daily and Hourly
> Databases, I would have the Daily DB running on a separate server than
> the Hourly.. Then if we were seeing a HUGE data increase into the db
> we would start using the separating partitions to their own hard drive
> strategy...  But we're not there yet... We have 2 'probes' collecting
> and see 2 million events a day (130K/HR)...
>
> Cheers, and thanks for the helpful info..
>
> mark
>
> On Thu, Feb 26, 2009 at 10:37 AM, Carter Bullard  
> <carter at qosient.com> wrote:
>> Hey Mark,
>> Best to keep these discussions on the mailing list.
>>
>> I think there are a number of different strategies that will need  
>> to be
>> thought about, and using the database to its fullest is really the  
>> key.
>> Do we use mysql partitioning?  I say YES!!!!! especially to support
>> federated distributed database tables.  I think that is very  
>> important.
>> How we do it, though, will take some discussion.
>>
>> So putting the data into separate tables is A form of partitioning
>> (i'm starting to read about db partitioning, and it may take a little
>> time for me to come to speed).  This mysql specific partitioning
>> strategy, seems to have its own limits in terms of configuration.
>> How do I extend the database partition description for another
>> month, as time is not going to wait for me ;o)   I'm not sure that I
>> can do that on the fly?  You seem to have monthly tables, that are
>> partitioned by day?
>>
>> The best question on your archive and collection strategy is,
>> "does it work?" and  if so, then you're doing great!!!!   The tool
>> specifically
>> designed to generate your compressed periodic files is rastream().
>> I would hold off on shifting to that, though until this new release  
>> of
>> client software, so you're doing fine.
>>
>> You are not, however aggregating any records before you put
>> them in the database, so you are storing the "primitive" data into
>> mysql.  That is fine, but there is other processing you can do to
>> make the data a bit easier to chew, so to speak.  BUT you need
>> to keep the primitive data around somewhere for a while, as that
>> is eventually what is needed to investigate the really bad cases.
>>
>> rasqlinsert(), only creates the database or table if they are
>> needed.  If the table already exists, regardless of how it was
>> generated, rasqlinsert() just pokes data into it.
>>
>>
>> Carter
>>
>> On Feb 26, 2009, at 8:44 AM, Mark Bartlett wrote:
>>
>>> Looks good.. Only problem I would have is this:
>>>
>>> You create a different table for 'everyday'...  I cannot 'query' the
>>> database without hitting every single argus.%Y.%m.%d table
>>> individually..  If we us the "Partitioning option then it takes care
>>> of the 'date' stuff for us..  So I can write a query that says
>>> something like, select saddr, daddr, sum(bytes) from argus_tbl where
>>> (saddr = 'a source IP') group by saddr, daddr; and it would return
>>> that data, under your 'schema' I would have to write it like this:
>>> select saddr, daddr, sum(bytes) from argus_tbl.(DATE) where (saddr =
>>> 'a source IP') group by saddr, daddr; And I would have to repeat  
>>> that
>>> query for 'every' date table that is created... (argus_tbl. 
>>> 2009-02-24,
>>> argus_tbl.2009-02-23)...  Which would make it difficult for me to
>>> query over the last week of data to 'see' what a specific SADDR did.
>>> (who they 'talk to', what protocols 'they' use, etc)..   
>>> Essentially I
>>> would like to be able to 'baseline' my network using the Argus data,
>>> and have my Analysts be able to 'see' patterns from graphs using the
>>> argus data..
>>>
>>> Thoughts??
>>>
>>> There is def. nothing wrong with your rasqlinsert tool, I could just
>>> write my database 'schema' using a partition and my problem is  
>>> solved.
>>> On that note, is there any 'flag' that can be set that will NOT  
>>> create
>>> the table, just insert?
>>>
>>> [snip]
>>>
>>>
>>> On Wed, Feb 25, 2009 at 9:33 PM, Carter Bullard <carter at qosient.com>
>>> wrote:
>>>>
>>>> Hey Mark,
>>>> There is no fixed schema, you generate whatever schema you wish.
>>>> So to do something like your tables, you would use something like:
>>>>   rasqlinsert -r file -w mysql://localhost/db/argus -s srcid  
>>>> stime dur \
>>>>        saddr daddr proto ......
>>>> because there is no key, rasqlinsert will just append to the table.
>>>> If you want an auto incrementing identifier for the row, you can
>>>> add "autoid" to the -s list and rasqlinsert will create an autoid  
>>>> column.
>>>> I just ran this here:
>>>>    ../bin/rasqlinsert -S localhost -w
>>>> mysql://root@localhost/ratop/etherHosts \
>>>>                 -M rmon nodrop cache -m srcid smac \
>>>>                 -s stime dur srcid smac spkts dpkts sbytes dbytes  
>>>> state
>>>> and it created this table in the ratop database (which it also  
>>>> created):
>>>> mysql> desc etherHosts;
>>>> +--------+-----------------------+------+-----+---------+-------+
>>>> | Field  | Type                  | Null | Key | Default | Extra |
>>>> +--------+-----------------------+------+-----+---------+-------+
>>>> | stime  | double(18,6) unsigned | NO   |     | NULL    |       |
>>>> | dur    | double(18,6)          | NO   |     | NULL    |       |
>>>> | srcid  | varchar(64)           | NO   | PRI |         |       |
>>>> | smac   | varchar(24)           | NO   | PRI |         |       |
>>>> | spkts  | bigint(20)            | YES  |     | NULL    |       |
>>>> | dpkts  | bigint(20)            | YES  |     | NULL    |       |
>>>> | sbytes | bigint(20)            | YES  |     | NULL    |       |
>>>> | dbytes | bigint(20)            | YES  |     | NULL    |       |
>>>> | state  | varchar(32)           | YES  |     | NULL    |       |
>>>> | record | blob                  | YES  |     | NULL    |       |
>>>> +--------+-----------------------+------+-----+---------+-------+
>>>> 10 rows in set (0.00 sec)
>>>> the record is the binary argus record that holds the actual data.
>>>>
>>>> Not sure about your partitioning,  so I'm thinking something like
>>>> this could work:
>>>>   rasqlinsert -r file -w mysql://localhost/db/argus.%Y.%m.%d
>>>> which would automatically create tablenames  using the timestamp
>>>> in the argus data.  As the data comes in, rasqlinsert would create
>>>> the table needed to hold the data, based on the strftime() format
>>>> you provide (just like rasplit).
>>>> Does that buy us anything>?
>>>> Carter
>>>>
>>>>
>>
>>
>>
>>
>>
>

Carter Bullard
CEO/President
QoSient, LLC
150 E 57th Street Suite 12D
New York, New York  10022

+1 212 588-9133 Phone
+1 212 588-9134 Fax