Help with rasql queries

Carter Bullard carter at qosient.com
Tue Sep 10 13:10:18 EDT 2013


Hey Jesse,
Looks like mysql() takes over 1 minute do a "count(*) where daddr=" query, against 20M entries.  That looks like a terrible number to me.
Is there anything in your mysql logs that indicate issues / errors ????

Carter


On Sep 10, 2013, at 10:26 AM, Jesse Bowling <jessebowling at gmail.com> wrote:

> Bringing this one back after a long hiatus...
> 
> I've moved to using argus-clients-3.0.7.15 and am testing out using rasqlinsert/rasql again. I'm still seeing it take a pretty long time to insert and query, and am looking for advice from the list on where to check for performance bottlenecks.
> 
> I inserted a day's data with:
> 
> /usr/local/src/argus-clients-3.0.7.15/bin/rasqlinsert -m none -s saddr daddr stime ltime -R /usr/local/logs/argus/2013/08/13/ -w mysql://argus@localhost/argus/aip_%Y_%m_%d -M time 1d drop
> 
> This took a little over 1 hour to insert (seems long?).
> 
> Then the queries work fine, but take longer than I'd expect (although much faster than using argus to comb the data manually):
> 
> # /usr/bin/time rasql -D 3 -r mysql://argus@localhost/argus/aip_%Y_%m_%d -M time 1d sql="daddr='8.8.8.8'" -t 2013/08/13 -w - |ra -r - |wc -l
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.374263 ArgusParseTime (0x7f6c9d771010, 0x7f6c9d771158, 0x7f6c9d771190, "2013/08/13", ' ', 0.000000) retn day(1376366400)
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.374341 ArgusCheckTimeFormat (0x7f6c9d7711c8, 2013/08/13) 1376366400.000000-1376452800.000000
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.374391 ArgusParseInit(0x7f6c9d771010, NULL)
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.406431 RaMySQLInit () RaSource (null) RaArchive (null) RaFormat (null)
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.408899 SQL Query SELECT record from aip_2013_08_13 WHERE daddr='8.8.8.8'
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:19:05.262465 ArgusShutDown (2)
> 9.40user 0.77system 2:25.95elapsed 6%CPU (0avgtext+0avgdata 701504maxresident)k
> 1992inputs+0outputs (14major+43908minor)pagefaults 0swaps
> 1728691
> 
> mysql> describe aip_2013_08_13;
> +--------+-----------------------+------+-----+---------+-------+
> | Field  | Type                  | Null | Key | Default | Extra |
> +--------+-----------------------+------+-----+---------+-------+
> | saddr  | varchar(64)           | NO   |     | NULL    |       |
> | daddr  | varchar(64)           | NO   |     | NULL    |       |
> | stime  | double(18,6) unsigned | NO   |     | NULL    |       |
> | ltime  | double(18,6) unsigned | NO   |     | NULL    |       |
> | record | blob                  | YES  |     | NULL    |       |
> +--------+-----------------------+------+-----+---------+-------+
> 
> mysql> select count(*) from aip_2013_08_13;
> +----------+
> | count(*) |
> +----------+
> | 27896728 |
> +----------+
> 1 row in set (0.00 sec)
> 
> mysql> select count(*) from aip_2013_08_13 where daddr='8.8.8.8';
> +----------+
> | count(*) |
> +----------+
> |  1728690 |
> +----------+
> 1 row in set (1 min 5.82 sec)
> 
> This is a 4-core VM with FC disk and 8 GB of RAM...Sar says:
> 08:30:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
> 10:00:01 AM     all     25.68      0.00      6.12      2.40      0.00     65.81
> 10:10:01 AM     all     27.06      0.00      6.35      2.01      0.00     64.58
> 10:20:01 AM     all     30.50      0.00      8.62      3.56      0.00     57.32
> Average:        all     30.34      0.01      6.04      2.77      0.00     60.84
> 
> So there's a bit of iowait while this is running, but not too bad...Any hints on where I should look?
> 
> Cheers,
> 
> Jesse
> 
> 
> On Sat, May 4, 2013 at 10:14 AM, Carter Bullard <carter at qosient.com> wrote:
>> Hey Jesse,
>> Excellent, however 2m:38s is too long for me !!!
>> So is that the rasql() query time, or is that the total time for insertion, and query?
>> Hope all is most excellent !!!!
>> Carter
>> 
>> Carter Bullard, QoSient, LLC
>> 150 E. 57th Street Suite 12D
>> New York, New York 10022
>> +1 212 588-9133 Phone
>> +1 212 588-9134 Fax
>> 
>> On May 3, 2013, at 10:05 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>> 
>>> Hi Carter,
>>> 
>>> Well that's just sillly. I copy pasted from example text and didn't notice that it was using backticks rather than single quotes. After correcting the quotes it's all good...
>>> 
>>> 2m:38s to check 2.5 million IP's over three days...Not bad! :)
>>> 
>>> Cheers,
>>> 
>>> Jesse
>>> 
>>> 
>>> On Fri, May 3, 2013 at 9:02 PM, Carter Bullard <carter at qosient.com> wrote:
>>>> Hey Jesse,
>>>> Use mysql() to try the same queries rasql() is using, to see why they aren't giving you a return.  
>>>> 
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.029297 SQL Query SELECT record from ipAddrs_2013_04_29 WHERE saddr=’66.249.76.186’
>>>> 
>>>> Try that literally, but change record to something like saddr.
>>>> 
>>>> Carter
>>>> 
>>>> 
>>>> On May 3, 2013, at 7:52 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>>>> 
>>>>> Thanks Carter; I missed the configure step after making .debug...
>>>>> 
>>>>> # rasql -t -7d+7d -M time 1d -M sql="saddr=’66.249.76.186’" -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d -D3
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:30.971502 ArgusParseInit(0x120ee010, NULL)
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.009019 RaMySQLInit () RaSource (null) RaArchive (null) RaFormat (null)
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.019209 SQL Query SELECT record from ipAddrs_2013_04_27 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.023303 SQL Query SELECT record from ipAddrs_2013_04_28 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.029297 SQL Query SELECT record from ipAddrs_2013_04_29 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.032263 SQL Query SELECT record from ipAddrs_2013_04_30 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.036089 SQL Query SELECT record from ipAddrs_2013_05_01 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.039221 SQL Query SELECT record from ipAddrs_2013_05_02 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.043198 SQL Query SELECT record from ipAddrs_2013_05_03 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.048043 ArgusShutDown (2)
>>>>> 
>>>>> What next? I see that 3.0.7.7 is available; should I upgrade to those?
>>>>> 
>>>>> Thanks for the assistance,
>>>>> 
>>>>> Jesse
>>>>> 
>>>>> 
>>>>> On Fri, May 3, 2013 at 7:35 PM, Carter Bullard <carter at qosient.com> wrote:
>>>>>> Hey Jesse,
>>>>>> You have to compile in the debug support....in the clients root directory.
>>>>>>   % touch .debug
>>>>>>   % ./configure
>>>>>>   % make clean; make
>>>>>> 
>>>>>> Carter
>>>>>> 
>>>>>> Carter Bullard, QoSient, LLC
>>>>>> 150 E. 57th Street Suite 12D
>>>>>> New York, New York 10022
>>>>>> +1 212 588-9133 Phone
>>>>>> +1 212 588-9134 Fax
>>>>>> 
>>>>>> On May 3, 2013, at 4:03 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>>>>>> 
>>>>>>> On Fri, May 3, 2013 at 2:23 PM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>> Hey Jesse,
>>>>>>>> Your tables took too long to build, What version are you using ???
>>>>>>> 
>>>>>>> RaSqlInsert Version 3.0.7.3
>>>>>>> RaSql Version 3.0.7.3
>>>>>>>  
>>>>>>>> I suspect, because you used the -R option, that you tickled a bug
>>>>>>>> where we end up using huge amounts of memory, and you started
>>>>>>>> to page.  The bug was the -R option seemed to not want to deallocate
>>>>>>>> the primitive flow records between files, but the -r worked fine. That
>>>>>>>> bug should have been fixed.
>>>>>>>> 
>>>>>>>> If you had used:
>>>>>>>> > rasqlinsert -M time 1d -M rmon -m saddr -r 2013/04/29/* -r 2013/04/30/* -w mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d - ip
>>>>>>>> 
>>>>>>>> I believe it would have gone much quicker.  Could you give that a try ?
>>>>>>>> I removed the " -M cache " option, as that is really intended for real time data
>>>>>>>> processing, using the " -S argus.source " rather than files.
>>>>>>> I'll give that a shot with some new data and let you know how that goes...It should be noted that this is an over-the-wire insert; thus network comes into play... :)
>>>>>>>  
>>>>>>>> OK, to best understand what a particular query is doing, use the " -D3 "
>>>>>>>> debug option.  It should printout the actual mysql calls, so you can figure out
>>>>>>>> what is up.
>>>>>>>> 
>>>>>>>> Try your query:
>>>>>>>>    rasql -t -7d+7d -M time 1d -M sql="saddr=’66.249.76.186’" -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d -D3
>>>>>>>> 
>>>>>>>> What did you get?
>>>>>>> 
>>>>>>> Unfortunately I get....nothing. The prompt simply returns with no messages. I also tried with -D9 with the same results...
>>>>>>>  
>>>>>>> I ensured that there is a .debug file in the source directory and re-comiled, but it's still not giving a response...
>>>>>>>  
>>>>>>>> Carter
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On May 3, 2013, at 12:16 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>>>>>>>> 
>>>>>>>> > Hi,
>>>>>>>> >
>>>>>>>> > Again I'm trying to follow through some of the APT examples. I'm very interested in the IP address SQL backing tables to speed searches, as this is a common search for my purposes.
>>>>>>>> >
>>>>>>>> > I've built tables with two days of data like:
>>>>>>>> >
>>>>>>>> > rasqlinsert -M time 1d -M rmon -m saddr -R 2013/04/29 -R 2013/04/30 -w mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d -M cache - ip
>>>>>>>> >
>>>>>>>> > This took a while (29 hours) but did build tables:
>>>>>>>> >
>>>>>>>> > mysql> use argus_ip;
>>>>>>>> > Database changed
>>>>>>>> > mysql> show tables;
>>>>>>>> > +--------------------+
>>>>>>>> > | Tables_in_argus_ip |
>>>>>>>> > +--------------------+
>>>>>>>> > | ipAddrs_2013_04_29 |
>>>>>>>> > | ipAddrs_2013_04_30 |
>>>>>>>> > +--------------------+
>>>>>>>> > 2 rows in set (0.00 sec)
>>>>>>>> >
>>>>>>>> > mysql> describe ipAddrs_2013_04_29;
>>>>>>>> > +--------+-----------------------+------+-----+---------+-------+
>>>>>>>> > | Field  | Type                  | Null | Key | Default | Extra |
>>>>>>>> > +--------+-----------------------+------+-----+---------+-------+
>>>>>>>> > | stime  | double(18,6) unsigned | NO   |     | NULL    |       |
>>>>>>>> > | flgs   | varchar(32)           | YES  |     | NULL    |       |
>>>>>>>> > | proto  | varchar(16)           | NO   |     | NULL    |       |
>>>>>>>> > | saddr  | varchar(64)           | NO   | PRI | NULL    |       |
>>>>>>>> > | sport  | varchar(10)           | NO   |     | NULL    |       |
>>>>>>>> > | dir    | varchar(3)            | YES  |     | NULL    |       |
>>>>>>>> > | daddr  | varchar(64)           | NO   |     | NULL    |       |
>>>>>>>> > | dport  | varchar(10)           | NO   |     | NULL    |       |
>>>>>>>> > | pkts   | bigint(20)            | YES  |     | NULL    |       |
>>>>>>>> > | bytes  | bigint(20)            | YES  |     | NULL    |       |
>>>>>>>> > | state  | varchar(32)           | YES  |     | NULL    |       |
>>>>>>>> > | record | blob                  | YES  |     | NULL    |       |
>>>>>>>> > +--------+-----------------------+------+-----+---------+-------+
>>>>>>>> > 12 rows in set (0.00 sec)
>>>>>>>> >
>>>>>>>> > mysql> select count(*) from ipAddrs_2013_04_29;
>>>>>>>> > +----------+
>>>>>>>> > | count(*) |
>>>>>>>> > +----------+
>>>>>>>> > |   715059 |
>>>>>>>> > +----------+
>>>>>>>> > 1 row in set (0.02 sec)
>>>>>>>> >
>>>>>>>> > mysql> select count(*) from ipAddrs_2013_04_30;
>>>>>>>> > +----------+
>>>>>>>> > | count(*) |
>>>>>>>> > +----------+
>>>>>>>> > |  1362479 |
>>>>>>>> > +----------+
>>>>>>>> > 1 row in set (0.00 sec)
>>>>>>>> >
>>>>>>>> > mysql> select stime,flgs,proto,saddr,sport,dir,daddr,dport,pkts,bytes,state from ipAddrs_2013_04_29 limit 1;
>>>>>>>> > +-------------------+------+-------+---------------+-------+------+---------+-------+------+---------+-------+
>>>>>>>> > | stime             | flgs | proto | saddr         | sport | dir  | daddr   | dport | pkts | bytes   | state |
>>>>>>>> > +-------------------+------+-------+---------------+-------+------+---------+-------+------+---------+-------+
>>>>>>>> > | 1367208000.000000 |      | ip    | 66.249.76.186 |       | <->  | 0.0.0.0 |       | 2465 | 1964858 | CON   |
>>>>>>>> > +-------------------+------+-------+---------------+-------+------+---------+-------+------+---------+-------+
>>>>>>>> > 1 row in set (0.00 sec)
>>>>>>>> >
>>>>>>>> > ...but now I can't seem to get any data back out using the incantations I've seen:
>>>>>>>> >
>>>>>>>> > # time rasql -t -7d+7d -M time 1d -M sql="saddr=’66.249.76.186’" -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d
>>>>>>>> >
>>>>>>>> > real    0m0.073s
>>>>>>>> > user    0m0.008s
>>>>>>>> > sys     0m0.021s
>>>>>>>> >
>>>>>>>> > # time rasql -t 2013/04/29 -M time 1d -M sql="saddr=’66.249.76.186’" -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d
>>>>>>>> >
>>>>>>>> > real    0m0.045s
>>>>>>>> > user    0m0.010s
>>>>>>>> > sys     0m0.017s
>>>>>>>> >
>>>>>>>> > This does work, but much more slowly that I expected:
>>>>>>>> >
>>>>>>>> > # time rasql -t 2013/04/29 -M time 1d -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d - host 66.249.76.186
>>>>>>>> >                      StartTime      Flgs  Proto            SrcAddr  Sport   Dir            DstAddr  Dport  TotPkts   TotBytes State
>>>>>>>> >       04/29/13 00:00:00.000000                0      66.249.76.186          <->            0.0.0.0            2465    1964858   CON
>>>>>>>> >
>>>>>>>> > real    0m46.419s
>>>>>>>> > user    0m4.245s
>>>>>>>> > sys     0m0.991s
>>>>>>>> >
>>>>>>>> > # time rasql -t 2013/04/29+1d -M time 1d -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d - host 66.249.76.186
>>>>>>>> > ^C
>>>>>>>> > real    26m18.117s
>>>>>>>> > user    0m0.037s
>>>>>>>> > sys     0m0.165s
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > I'm guessing that the second form (using a filter to rasql) is pulling back ALL records from mySQL and filtering that, whereas the first format should be using mySQL to do the filtering (the preferred method). So I suppose my questions are "Is this last statement correct?" and "What am I doing wrong with the first form of query?"
>>>>>>>> >
>>>>>>>> > Is the query speed for the working format inline with expectations? Seems a bit slow...
>>>>>>>> >
>>>>>>>> > Cheers,
>>>>>>>> >
>>>>>>>> > Jesse
>>>>>>>> > --
>>>>>>>> > Jesse Bowling
>>>>>>>> >
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> -- 
>>>>>>> Jesse Bowling
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Jesse Bowling
>>> 
>>> 
>>> 
>>> -- 
>>> Jesse Bowling
> 
> 
> 
> -- 
> Jesse Bowling
> 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20130910/14526402/attachment.html>


More information about the argus mailing list