Help with rasql queries
Carter Bullard
carter at qosient.com
Tue Sep 10 13:10:18 EDT 2013
Hey Jesse,
Looks like mysql() takes over 1 minute do a "count(*) where daddr=" query, against 20M entries. That looks like a terrible number to me.
Is there anything in your mysql logs that indicate issues / errors ????
Carter
On Sep 10, 2013, at 10:26 AM, Jesse Bowling <jessebowling at gmail.com> wrote:
> Bringing this one back after a long hiatus...
>
> I've moved to using argus-clients-3.0.7.15 and am testing out using rasqlinsert/rasql again. I'm still seeing it take a pretty long time to insert and query, and am looking for advice from the list on where to check for performance bottlenecks.
>
> I inserted a day's data with:
>
> /usr/local/src/argus-clients-3.0.7.15/bin/rasqlinsert -m none -s saddr daddr stime ltime -R /usr/local/logs/argus/2013/08/13/ -w mysql://argus@localhost/argus/aip_%Y_%m_%d -M time 1d drop
>
> This took a little over 1 hour to insert (seems long?).
>
> Then the queries work fine, but take longer than I'd expect (although much faster than using argus to comb the data manually):
>
> # /usr/bin/time rasql -D 3 -r mysql://argus@localhost/argus/aip_%Y_%m_%d -M time 1d sql="daddr='8.8.8.8'" -t 2013/08/13 -w - |ra -r - |wc -l
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.374263 ArgusParseTime (0x7f6c9d771010, 0x7f6c9d771158, 0x7f6c9d771190, "2013/08/13", ' ', 0.000000) retn day(1376366400)
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.374341 ArgusCheckTimeFormat (0x7f6c9d7711c8, 2013/08/13) 1376366400.000000-1376452800.000000
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.374391 ArgusParseInit(0x7f6c9d771010, NULL)
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.406431 RaMySQLInit () RaSource (null) RaArchive (null) RaFormat (null)
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:16:39.408899 SQL Query SELECT record from aip_2013_08_13 WHERE daddr='8.8.8.8'
> rasql[6473.c057819d6c7f0000]: 09/10/13 10:19:05.262465 ArgusShutDown (2)
> 9.40user 0.77system 2:25.95elapsed 6%CPU (0avgtext+0avgdata 701504maxresident)k
> 1992inputs+0outputs (14major+43908minor)pagefaults 0swaps
> 1728691
>
> mysql> describe aip_2013_08_13;
> +--------+-----------------------+------+-----+---------+-------+
> | Field | Type | Null | Key | Default | Extra |
> +--------+-----------------------+------+-----+---------+-------+
> | saddr | varchar(64) | NO | | NULL | |
> | daddr | varchar(64) | NO | | NULL | |
> | stime | double(18,6) unsigned | NO | | NULL | |
> | ltime | double(18,6) unsigned | NO | | NULL | |
> | record | blob | YES | | NULL | |
> +--------+-----------------------+------+-----+---------+-------+
>
> mysql> select count(*) from aip_2013_08_13;
> +----------+
> | count(*) |
> +----------+
> | 27896728 |
> +----------+
> 1 row in set (0.00 sec)
>
> mysql> select count(*) from aip_2013_08_13 where daddr='8.8.8.8';
> +----------+
> | count(*) |
> +----------+
> | 1728690 |
> +----------+
> 1 row in set (1 min 5.82 sec)
>
> This is a 4-core VM with FC disk and 8 GB of RAM...Sar says:
> 08:30:01 AM CPU %user %nice %system %iowait %steal %idle
> 10:00:01 AM all 25.68 0.00 6.12 2.40 0.00 65.81
> 10:10:01 AM all 27.06 0.00 6.35 2.01 0.00 64.58
> 10:20:01 AM all 30.50 0.00 8.62 3.56 0.00 57.32
> Average: all 30.34 0.01 6.04 2.77 0.00 60.84
>
> So there's a bit of iowait while this is running, but not too bad...Any hints on where I should look?
>
> Cheers,
>
> Jesse
>
>
> On Sat, May 4, 2013 at 10:14 AM, Carter Bullard <carter at qosient.com> wrote:
>> Hey Jesse,
>> Excellent, however 2m:38s is too long for me !!!
>> So is that the rasql() query time, or is that the total time for insertion, and query?
>> Hope all is most excellent !!!!
>> Carter
>>
>> Carter Bullard, QoSient, LLC
>> 150 E. 57th Street Suite 12D
>> New York, New York 10022
>> +1 212 588-9133 Phone
>> +1 212 588-9134 Fax
>>
>> On May 3, 2013, at 10:05 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>>
>>> Hi Carter,
>>>
>>> Well that's just sillly. I copy pasted from example text and didn't notice that it was using backticks rather than single quotes. After correcting the quotes it's all good...
>>>
>>> 2m:38s to check 2.5 million IP's over three days...Not bad! :)
>>>
>>> Cheers,
>>>
>>> Jesse
>>>
>>>
>>> On Fri, May 3, 2013 at 9:02 PM, Carter Bullard <carter at qosient.com> wrote:
>>>> Hey Jesse,
>>>> Use mysql() to try the same queries rasql() is using, to see why they aren't giving you a return.
>>>>
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.029297 SQL Query SELECT record from ipAddrs_2013_04_29 WHERE saddr=’66.249.76.186’
>>>>
>>>> Try that literally, but change record to something like saddr.
>>>>
>>>> Carter
>>>>
>>>>
>>>> On May 3, 2013, at 7:52 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>>>>
>>>>> Thanks Carter; I missed the configure step after making .debug...
>>>>>
>>>>> # rasql -t -7d+7d -M time 1d -M sql="saddr=’66.249.76.186’" -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d -D3
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:30.971502 ArgusParseInit(0x120ee010, NULL)
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.009019 RaMySQLInit () RaSource (null) RaArchive (null) RaFormat (null)
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.019209 SQL Query SELECT record from ipAddrs_2013_04_27 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.023303 SQL Query SELECT record from ipAddrs_2013_04_28 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.029297 SQL Query SELECT record from ipAddrs_2013_04_29 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.032263 SQL Query SELECT record from ipAddrs_2013_04_30 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.036089 SQL Query SELECT record from ipAddrs_2013_05_01 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.039221 SQL Query SELECT record from ipAddrs_2013_05_02 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.043198 SQL Query SELECT record from ipAddrs_2013_05_03 WHERE saddr=’66.249.76.186’
>>>>> rasql[19804.c0272312b07f0000]: 05/03/13 19:49:31.048043 ArgusShutDown (2)
>>>>>
>>>>> What next? I see that 3.0.7.7 is available; should I upgrade to those?
>>>>>
>>>>> Thanks for the assistance,
>>>>>
>>>>> Jesse
>>>>>
>>>>>
>>>>> On Fri, May 3, 2013 at 7:35 PM, Carter Bullard <carter at qosient.com> wrote:
>>>>>> Hey Jesse,
>>>>>> You have to compile in the debug support....in the clients root directory.
>>>>>> % touch .debug
>>>>>> % ./configure
>>>>>> % make clean; make
>>>>>>
>>>>>> Carter
>>>>>>
>>>>>> Carter Bullard, QoSient, LLC
>>>>>> 150 E. 57th Street Suite 12D
>>>>>> New York, New York 10022
>>>>>> +1 212 588-9133 Phone
>>>>>> +1 212 588-9134 Fax
>>>>>>
>>>>>> On May 3, 2013, at 4:03 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>>>>>>
>>>>>>> On Fri, May 3, 2013 at 2:23 PM, Carter Bullard <carter at qosient.com> wrote:
>>>>>>>> Hey Jesse,
>>>>>>>> Your tables took too long to build, What version are you using ???
>>>>>>>
>>>>>>> RaSqlInsert Version 3.0.7.3
>>>>>>> RaSql Version 3.0.7.3
>>>>>>>
>>>>>>>> I suspect, because you used the -R option, that you tickled a bug
>>>>>>>> where we end up using huge amounts of memory, and you started
>>>>>>>> to page. The bug was the -R option seemed to not want to deallocate
>>>>>>>> the primitive flow records between files, but the -r worked fine. That
>>>>>>>> bug should have been fixed.
>>>>>>>>
>>>>>>>> If you had used:
>>>>>>>> > rasqlinsert -M time 1d -M rmon -m saddr -r 2013/04/29/* -r 2013/04/30/* -w mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d - ip
>>>>>>>>
>>>>>>>> I believe it would have gone much quicker. Could you give that a try ?
>>>>>>>> I removed the " -M cache " option, as that is really intended for real time data
>>>>>>>> processing, using the " -S argus.source " rather than files.
>>>>>>> I'll give that a shot with some new data and let you know how that goes...It should be noted that this is an over-the-wire insert; thus network comes into play... :)
>>>>>>>
>>>>>>>> OK, to best understand what a particular query is doing, use the " -D3 "
>>>>>>>> debug option. It should printout the actual mysql calls, so you can figure out
>>>>>>>> what is up.
>>>>>>>>
>>>>>>>> Try your query:
>>>>>>>> rasql -t -7d+7d -M time 1d -M sql="saddr=’66.249.76.186’" -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d -D3
>>>>>>>>
>>>>>>>> What did you get?
>>>>>>>
>>>>>>> Unfortunately I get....nothing. The prompt simply returns with no messages. I also tried with -D9 with the same results...
>>>>>>>
>>>>>>> I ensured that there is a .debug file in the source directory and re-comiled, but it's still not giving a response...
>>>>>>>
>>>>>>>> Carter
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On May 3, 2013, at 12:16 PM, Jesse Bowling <jessebowling at gmail.com> wrote:
>>>>>>>>
>>>>>>>> > Hi,
>>>>>>>> >
>>>>>>>> > Again I'm trying to follow through some of the APT examples. I'm very interested in the IP address SQL backing tables to speed searches, as this is a common search for my purposes.
>>>>>>>> >
>>>>>>>> > I've built tables with two days of data like:
>>>>>>>> >
>>>>>>>> > rasqlinsert -M time 1d -M rmon -m saddr -R 2013/04/29 -R 2013/04/30 -w mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d -M cache - ip
>>>>>>>> >
>>>>>>>> > This took a while (29 hours) but did build tables:
>>>>>>>> >
>>>>>>>> > mysql> use argus_ip;
>>>>>>>> > Database changed
>>>>>>>> > mysql> show tables;
>>>>>>>> > +--------------------+
>>>>>>>> > | Tables_in_argus_ip |
>>>>>>>> > +--------------------+
>>>>>>>> > | ipAddrs_2013_04_29 |
>>>>>>>> > | ipAddrs_2013_04_30 |
>>>>>>>> > +--------------------+
>>>>>>>> > 2 rows in set (0.00 sec)
>>>>>>>> >
>>>>>>>> > mysql> describe ipAddrs_2013_04_29;
>>>>>>>> > +--------+-----------------------+------+-----+---------+-------+
>>>>>>>> > | Field | Type | Null | Key | Default | Extra |
>>>>>>>> > +--------+-----------------------+------+-----+---------+-------+
>>>>>>>> > | stime | double(18,6) unsigned | NO | | NULL | |
>>>>>>>> > | flgs | varchar(32) | YES | | NULL | |
>>>>>>>> > | proto | varchar(16) | NO | | NULL | |
>>>>>>>> > | saddr | varchar(64) | NO | PRI | NULL | |
>>>>>>>> > | sport | varchar(10) | NO | | NULL | |
>>>>>>>> > | dir | varchar(3) | YES | | NULL | |
>>>>>>>> > | daddr | varchar(64) | NO | | NULL | |
>>>>>>>> > | dport | varchar(10) | NO | | NULL | |
>>>>>>>> > | pkts | bigint(20) | YES | | NULL | |
>>>>>>>> > | bytes | bigint(20) | YES | | NULL | |
>>>>>>>> > | state | varchar(32) | YES | | NULL | |
>>>>>>>> > | record | blob | YES | | NULL | |
>>>>>>>> > +--------+-----------------------+------+-----+---------+-------+
>>>>>>>> > 12 rows in set (0.00 sec)
>>>>>>>> >
>>>>>>>> > mysql> select count(*) from ipAddrs_2013_04_29;
>>>>>>>> > +----------+
>>>>>>>> > | count(*) |
>>>>>>>> > +----------+
>>>>>>>> > | 715059 |
>>>>>>>> > +----------+
>>>>>>>> > 1 row in set (0.02 sec)
>>>>>>>> >
>>>>>>>> > mysql> select count(*) from ipAddrs_2013_04_30;
>>>>>>>> > +----------+
>>>>>>>> > | count(*) |
>>>>>>>> > +----------+
>>>>>>>> > | 1362479 |
>>>>>>>> > +----------+
>>>>>>>> > 1 row in set (0.00 sec)
>>>>>>>> >
>>>>>>>> > mysql> select stime,flgs,proto,saddr,sport,dir,daddr,dport,pkts,bytes,state from ipAddrs_2013_04_29 limit 1;
>>>>>>>> > +-------------------+------+-------+---------------+-------+------+---------+-------+------+---------+-------+
>>>>>>>> > | stime | flgs | proto | saddr | sport | dir | daddr | dport | pkts | bytes | state |
>>>>>>>> > +-------------------+------+-------+---------------+-------+------+---------+-------+------+---------+-------+
>>>>>>>> > | 1367208000.000000 | | ip | 66.249.76.186 | | <-> | 0.0.0.0 | | 2465 | 1964858 | CON |
>>>>>>>> > +-------------------+------+-------+---------------+-------+------+---------+-------+------+---------+-------+
>>>>>>>> > 1 row in set (0.00 sec)
>>>>>>>> >
>>>>>>>> > ...but now I can't seem to get any data back out using the incantations I've seen:
>>>>>>>> >
>>>>>>>> > # time rasql -t -7d+7d -M time 1d -M sql="saddr=’66.249.76.186’" -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d
>>>>>>>> >
>>>>>>>> > real 0m0.073s
>>>>>>>> > user 0m0.008s
>>>>>>>> > sys 0m0.021s
>>>>>>>> >
>>>>>>>> > # time rasql -t 2013/04/29 -M time 1d -M sql="saddr=’66.249.76.186’" -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d
>>>>>>>> >
>>>>>>>> > real 0m0.045s
>>>>>>>> > user 0m0.010s
>>>>>>>> > sys 0m0.017s
>>>>>>>> >
>>>>>>>> > This does work, but much more slowly that I expected:
>>>>>>>> >
>>>>>>>> > # time rasql -t 2013/04/29 -M time 1d -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d - host 66.249.76.186
>>>>>>>> > StartTime Flgs Proto SrcAddr Sport Dir DstAddr Dport TotPkts TotBytes State
>>>>>>>> > 04/29/13 00:00:00.000000 0 66.249.76.186 <-> 0.0.0.0 2465 1964858 CON
>>>>>>>> >
>>>>>>>> > real 0m46.419s
>>>>>>>> > user 0m4.245s
>>>>>>>> > sys 0m0.991s
>>>>>>>> >
>>>>>>>> > # time rasql -t 2013/04/29+1d -M time 1d -r mysql://user@host/argus_ip/ipAddrs_%Y_%m_%d - host 66.249.76.186
>>>>>>>> > ^C
>>>>>>>> > real 26m18.117s
>>>>>>>> > user 0m0.037s
>>>>>>>> > sys 0m0.165s
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > I'm guessing that the second form (using a filter to rasql) is pulling back ALL records from mySQL and filtering that, whereas the first format should be using mySQL to do the filtering (the preferred method). So I suppose my questions are "Is this last statement correct?" and "What am I doing wrong with the first form of query?"
>>>>>>>> >
>>>>>>>> > Is the query speed for the working format inline with expectations? Seems a bit slow...
>>>>>>>> >
>>>>>>>> > Cheers,
>>>>>>>> >
>>>>>>>> > Jesse
>>>>>>>> > --
>>>>>>>> > Jesse Bowling
>>>>>>>> >
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jesse Bowling
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jesse Bowling
>>>
>>>
>>>
>>> --
>>> Jesse Bowling
>
>
>
> --
> Jesse Bowling
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20130910/14526402/attachment.html>
More information about the argus
mailing list