Fwd: updates for argus-2.x compatibility and database support

Thu Feb 26 11:30:30 EST 2009

I think the safest bet would be to use your tool for the inserts, and
if someone wants to use Partitioning, we create a quick 'how to' doc,
which I'd be happy to draw up when the time is right....
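
A rough sketch of what such a 'how to' could cover, offered only as an
illustration: a small Python helper that prints the ALTER TABLE statements
for adding the next daily partition and dropping an old one. Nothing here
is part of argus or rasqlinsert. It assumes a table named argus_tbl that is
already RANGE partitioned on an integer column holding Unix seconds, with
partitions named pYYYYMMDD and no MAXVALUE partition; all of those names
and layout choices are hypothetical. MySQL only lets you add RANGE
partitions at the high end, so the helper is meant to be run ahead of time
for upcoming days.

```python
from datetime import date, datetime, timedelta, timezone


def day_start_epoch(day: date) -> int:
    """Unix time in seconds (UTC) at 00:00:00 on the given day."""
    midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return int(midnight.timestamp())


def partition_name(day: date) -> str:
    """Hypothetical naming scheme: p20090227 for 2009-02-27."""
    return day.strftime("p%Y%m%d")


def add_partition_sql(table: str, day: date) -> str:
    """Statement adding a partition that holds rows for one UTC day.

    The upper bound is the first second of the following day, so the
    partition covers everything below that boundary that earlier
    partitions do not already cover.
    """
    upper = day_start_epoch(day + timedelta(days=1))
    return (
        f"ALTER TABLE {table} ADD PARTITION "
        f"(PARTITION {partition_name(day)} VALUES LESS THAN ({upper}));"
    )


def drop_partition_sql(table: str, day: date) -> str:
    """Statement dropping the partition (and its rows) for one day."""
    return f"ALTER TABLE {table} DROP PARTITION {partition_name(day)};"


if __name__ == "__main__":
    today = date(2009, 2, 26)
    # Create tomorrow's partition before any of its data arrives.
    print(add_partition_sql("argus_tbl", today + timedelta(days=1)))
    # Retire the partition from 30 days ago (assumed retention period).
    print(drop_partition_sql("argus_tbl", today - timedelta(days=30)))
```

The printed statements could be fed to the mysql client from a daily cron
job; the 30 day retention is just a placeholder.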

Cheers..

On Thu, Feb 26, 2009 at 11:27 AM, Carter Bullard <carter at qosient.com> wrote:
> Hey Mark,
> Well, it looks pretty simple to add a partition directive at table creation,
> but since it's easy to do it on the fly, we may not even need to do
> it in rasqlinsert() ;o)
>
> If we come up with a strategy, I can add any kind of string to the table
> creation command, so it's just a matter of finding out what will be
> a good thing to do.
>
> Carter
>
> On Feb 26, 2009, at 11:03 AM, Mark Bartlett wrote:
>
>> Didn't copy the distro...
>>
>>
>> ---------- Forwarded message ----------
>> From: Mark Bartlett <mabartle at gmail.com>
>> Date: Thu, Feb 26, 2009 at 11:01 AM
>> Subject: Re: [ARGUS] updates for argus-2.x compatibility and database
>> support
>> To: Carter Bullard <carter at qosient.com>
>>
>>
>> Glad to hear that I'm not completely off :^P
>>
>> We hold 'all' argus files on the LOADER, so we were thinking about
>> 'rerunning' all files through the ra tools to 'combine' the streams
>> sometime during the day and update the db, etc...
>>
>> As for 'building/dropping/adding' Partitions in mysql, it can be done
>> on the fly like so:
>>
>> ALTER TABLE t1 ADD PARTITION (PARTITION p3 VALUES LESS THAN (2002));
>>
>> You can also drop or archive a Partition, which is nice...
>>
>> Another benefit of using Partitions is that you can set the 'file
>> location' of the partition... So with my Daily Part. Schema above I
>> could have today's partition on one Hard Drive and yesterday's on
>> another, to save read/write time on the disk when performing
>> queries, etc... That was the idea behind our Daily and Hourly
>> Databases: I would have the Daily DB running on a different server than
>> the Hourly.. Then if we were seeing a HUGE data increase into the db
>> we would start using the strategy of moving partitions onto their own
>> hard drives...  But we're not there yet... We have 2 'probes' collecting
>> and see 2 million events a day (130K/HR)...
>>
>> Cheers, and thanks for the helpful info..
>>
>> mark
>>
>> On Thu, Feb 26, 2009 at 10:37 AM, Carter Bullard <carter at qosient.com>
>> wrote:
>>>
>>> Hey Mark,
>>> Best to keep these discussions on the mailing list.
>>>
>>> I think there are a number of different strategies that will need to be
>>> thought about, and using the database to its fullest is really the key.
>>> Do we use mysql partitioning?  I say YES!!!!! especially to support
>>> federated distributed database tables.  I think that is very important.
>>> How we do it, though, will take some discussion.
>>>
>>> So putting the data into separate tables is A form of partitioning
>>> (I'm starting to read about db partitioning, and it may take a little
>>> time for me to come up to speed).  This mysql-specific partitioning
>>> strategy seems to have its own limits in terms of configuration.
>>> How do I extend the database partition description for another
>>> month, as time is not going to wait for me ;o)   I'm not sure that I
>>> can do that on the fly.  You seem to have monthly tables that are
>>> partitioned by day?
>>>
>>> The best question on your archive and collection strategy is,
>>> "does it work?" and if so, then you're doing great!!!!   The tool
>>> specifically designed to generate your compressed periodic files is
>>> rastream().  I would hold off on shifting to that, though, until this
>>> new release of client software, so you're doing fine.
>>>
>>> You are not, however, aggregating any records before you put
>>> them in the database, so you are storing the "primitive" data in
>>> mysql.  That is fine, but there is other processing you can do to
>>> make the data a bit easier to chew, so to speak.  BUT you need
>>> to keep the primitive data around somewhere for a while, as that
>>> is eventually what is needed to investigate the really bad cases.
>>>
>>> rasqlinsert() only creates the database or table if they are
>>> needed.  If the table already exists, regardless of how it was
>>> generated, rasqlinsert() just pokes data into it.
>>>
>>>
>>> Carter
>>>
>>> On Feb 26, 2009, at 8:44 AM, Mark Bartlett wrote:
>>>
>>>> Looks good.. Only problem I would have is this:
>>>>
>>>> You create a different table for 'every day'...  I cannot 'query' the
>>>> database without hitting every single argus.%Y.%m.%d table
>>>> individually..  If we use the 'Partitioning' option then it takes care
>>>> of the 'date' stuff for us..  So I can write a query that says
>>>> something like, select saddr, daddr, sum(bytes) from argus_tbl where
>>>> (saddr = 'a source IP') group by saddr, daddr; and it would return
>>>> that data. Under your 'schema' I would have to write it like this:
>>>> select saddr, daddr, sum(bytes) from argus_tbl.(DATE) where (saddr =
>>>> 'a source IP') group by saddr, daddr; And I would have to repeat that
>>>> query for 'every' date table that is created... (argus_tbl.2009-02-24,
>>>> argus_tbl.2009-02-23)...  Which would make it difficult for me to
>>>> query over the last week of data to 'see' what a specific SADDR did
>>>> (who they 'talk to', what protocols 'they' use, etc)..  Essentially I
>>>> would like to be able to 'baseline' my network using the Argus data,
>>>> and have my Analysts be able to 'see' patterns from graphs using the
>>>> argus data..
>>>>
>>>> Thoughts??
>>>>
>>>> There is def. nothing wrong with your rasqlinsert tool, I could just
>>>> write my database 'schema' using a partition and my problem is solved.
>>>> On that note, is there any 'flag' that can be set that will NOT create
>>>> the table, just insert?
>>>>
>>>> [snip]
>>>>
>>>>
>>>> On Wed, Feb 25, 2009 at 9:33 PM, Carter Bullard <carter at qosient.com>
>>>> wrote:
>>>>>
>>>>> Hey Mark,
>>>>> There is no fixed schema, you generate whatever schema you wish.
>>>>> So to do something like your tables, you would use something like:
>>>>>  rasqlinsert -r file -w mysql://localhost/db/argus -s srcid stime dur \
>>>>>       saddr daddr proto ......
>>>>> Because there is no key, rasqlinsert will just append to the table.
>>>>> If you want an auto incrementing identifier for the row, you can
>>>>> add "autoid" to the -s list and rasqlinsert will create an autoid
>>>>> column.
>>>>> I just ran this here:
>>>>>   ../bin/rasqlinsert -S localhost -w mysql://root@localhost/ratop/etherHosts \
>>>>>                -M rmon nodrop cache -m srcid smac \
>>>>>                -s stime dur srcid smac spkts dpkts sbytes dbytes state
>>>>> and it created this table in the ratop database (which it also
>>>>> created):
>>>>> mysql> desc etherHosts;
>>>>> +--------+-----------------------+------+-----+---------+-------+
>>>>> | Field  | Type                  | Null | Key | Default | Extra |
>>>>> +--------+-----------------------+------+-----+---------+-------+
>>>>> | stime  | double(18,6) unsigned | NO   |     | NULL    |       |
>>>>> | dur    | double(18,6)          | NO   |     | NULL    |       |
>>>>> | srcid  | varchar(64)           | NO   | PRI |         |       |
>>>>> | smac   | varchar(24)           | NO   | PRI |         |       |
>>>>> | spkts  | bigint(20)            | YES  |     | NULL    |       |
>>>>> | dpkts  | bigint(20)            | YES  |     | NULL    |       |
>>>>> | sbytes | bigint(20)            | YES  |     | NULL    |       |
>>>>> | dbytes | bigint(20)            | YES  |     | NULL    |       |
>>>>> | state  | varchar(32)           | YES  |     | NULL    |       |
>>>>> | record | blob                  | YES  |     | NULL    |       |
>>>>> +--------+-----------------------+------+-----+---------+-------+
>>>>> 10 rows in set (0.00 sec)
>>>>> The record column is the binary argus record that holds the actual data.
>>>>>
>>>>> Not sure about your partitioning, so I'm thinking something like
>>>>> this could work:
>>>>>  rasqlinsert -r file -w mysql://localhost/db/argus.%Y.%m.%d
>>>>> which would automatically create table names using the timestamp
>>>>> in the argus data.  As the data comes in, rasqlinsert would create
>>>>> the table needed to hold the data, based on the strftime() format
>>>>> you provide (just like rasplit).
>>>>> Does that buy us anything?
>>>>> Carter
>>>>>
>>>>>
>>>
>>
>
> Carter Bullard
> CEO/President
> QoSient, LLC
> 150 E 57th Street Suite 12D
> New York, New York  10022
>
> +1 212 588-9133 Phone
> +1 212 588-9134 Fax
>