rasqlinsert disconnect issues
David Edelman
dedelman at iname.com
Mon Oct 6 17:19:48 EDT 2014
(gdb) thread 1
[Switching to thread 1 (Thread 0xb7fe68d0 (LWP 21287))]#0 0x00d73424 in __kernel_vsyscall ()
(gdb) print queue
No symbol "queue" in current context.
(gdb) thread 3
[Switching to thread 3 (Thread 0xb720bb90 (LWP 21297))]#0 0x00d73424 in __kernel_vsyscall ()
(gdb) print queue
No symbol "queue" in current context.
(gdb) thread 4
[Switching to thread 4 (Thread 0xb6709b90 (LWP 21298))]#0 0x00d73424 in __kernel_vsyscall ()
(gdb) print queue
No symbol "queue" in current context.
(gdb) thread 5
[Switching to thread 5 (Thread 0xb5d08b90 (LWP 21299))]#0 0x00d73424 in __kernel_vsyscall ()
(gdb) print queue
No symbol "queue" in current context.
(gdb) thread 6
[Switching to thread 6 (Thread 0xb5307b90 (LWP 21300))]#0 0x08052827 in RaClientSortQueue (sorter=0x977e8b8, queue=0x977e690, type=2) at ./raclient.c:2523
2523 qhdr = qhdr->nxt;
(gdb) print queue
$15 = (struct ArgusQueueStruct *) 0x977e690
(gdb) print *queue
$16 = {count = 55, status = 268435456, arraylen = 0, lock = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers = 4294967237, {__spins = 0, __list = {__next = 0x0}}},
__size = '\0' <repeats 16 times>, "\305\377\377\377\000\000\000", __align = 0}, start = 0xb2522750, end = 0xb25078c0, array = 0x9883d20}
(gdb) thread 7
[Switching to thread 7 (Thread 0xb4906b90 (LWP 21301))]#0 0x0052f315 in free () from /lib/libc.so.6
(gdb) print queue
No symbol "queue" in current context.
(gdb) thread 8
[Switching to thread 8 (Thread 0xb3f05b90 (LWP 21302))]#0 0x00d73424 in __kernel_vsyscall ()
(gdb) print queue
No symbol "queue" in current context.
(gdb)
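A note on the `print *queue` output for thread 6: `__nusers = 4294967237` is what a small negative number looks like when gdb prints it as an unsigned 32-bit value. The sketch below does that conversion; it assumes the field is a 32-bit unsigned int, as on this i386 build, and the helper name `as_signed32` is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: read an unsigned 32-bit counter as a signed value.
 * The arithmetic is done in 64 bits so that no out-of-range conversion
 * is involved. */
static int32_t as_signed32(uint32_t v)
{
    if (v <= (uint32_t)INT32_MAX)
        return (int32_t)v;
    /* Values above INT32_MAX wrap around: subtract 2^32. */
    return (int32_t)((int64_t)v - 4294967296LL);
}
```

For the value in the dump, `as_signed32(4294967237u)` gives -59. A negative user count would be consistent with the mutex having been unlocked more often than it was locked, or with the structure having been overwritten or freed. The dump alone does not say which.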
From: Carter Bullard [mailto:carter at qosient.com]
Sent: Monday, October 06, 2014 4:56 PM
To: David Edelman
Cc: Argus
Subject: Re: [ARGUS] rasqlinsert disconnect issues
Hey Dave,
Sorry about that !!!
(gdb) thread 1
(gdb) print queue
(gdb) thread 2
...
Carter
On Oct 6, 2014, at 4:41 PM, David Edelman <dedelman at iname.com> wrote:
Before I screw it up, what is the syntax to do that?
Dave Edelman
On Oct 6, 2014, at 16:31, Carter Bullard <carter at qosient.com> wrote:
Looks like every thread is in the same routine, trying to deal with a queue ... hopefully not the same queue, but it's possible, and that would be a bug. Sorry if this is inconvenient, but could you print out the queue in each thread? Looking at this now.
Carter
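To illustrate the concern about several threads walking the same queue, here is a minimal sketch of copying a linked queue into an array while holding the queue's mutex. It is not the Argus code: `struct node`, `struct queue` and `queue_snapshot` are hypothetical stand-ins, and the walk is bounded by both the stored count and the caller's array size so that a stale link cannot run it off the end.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the queue types; not the Argus definitions. */
struct node {
    struct node *nxt;
    int value;
};

struct queue {
    pthread_mutex_t lock;
    struct node *start;
    size_t count;
};

/* Copy up to `cap` node pointers into `out` while holding the queue lock.
 * The walk stops at a NULL link, after `count` nodes, or when `out` is
 * full, whichever comes first. Returns the number of pointers copied. */
static size_t queue_snapshot(struct queue *q, struct node **out, size_t cap)
{
    size_t n = 0;

    pthread_mutex_lock(&q->lock);
    for (struct node *p = q->start; p != NULL && n < q->count && n < cap; p = p->nxt)
        out[n++] = p;
    pthread_mutex_unlock(&q->lock);

    return n;
}
```

The snapshot is only safe if every thread that adds or removes nodes takes the same lock. If one code path skips the lock, a reader can follow a `nxt` pointer into memory that has already been freed.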
On Oct 6, 2014, at 3:08 PM, David Edelman <dedelman at iname.com> wrote:
(gdb) where
#0 0x08052827 in RaClientSortQueue (sorter=0x977e8b8, queue=0x977e690, type=2) at ./raclient.c:2523
#1 0x08059b62 in ArgusDrawWindow (ws=0x98502e0) at ./rasqlinsert.c:3006
#2 0x080530c1 in ArgusOutputProcess (arg=0x0) at ./rasqlinsert.c:458
#3 0x0066c51f in start_thread () from /lib/libpthread.so.0
#4 0x005a204e in clone () from /lib/libc.so.6
(gdb) info threads
8 Thread 0xb3f05b90 (LWP 21302) 0x00d73424 in __kernel_vsyscall ()
7 Thread 0xb4906b90 (LWP 21301) 0x0052f315 in free () from /lib/libc.so.6
6 Thread 0xb5307b90 (LWP 21300) 0x08052827 in RaClientSortQueue (sorter=0x977e8b8, queue=0x977e690, type=2) at ./raclient.c:2523
5 Thread 0xb5d08b90 (LWP 21299) 0x00d73424 in __kernel_vsyscall ()
4 Thread 0xb6709b90 (LWP 21298) 0x00d73424 in __kernel_vsyscall ()
3 Thread 0xb720bb90 (LWP 21297) 0x00d73424 in __kernel_vsyscall ()
* 1 Thread 0xb7fe68d0 (LWP 21287) 0x00d73424 in __kernel_vsyscall ()
(gdb) thread 8
[Switching to thread 8 (Thread 0xb3f05b90 (LWP 21302))]#0 0x00d73424 in __kernel_vsyscall ()
(gdb) list
2523 qhdr = qhdr->nxt;
2524 }
2525
2526 queue->array[i] = NULL;
2527
2528 if (!(type & ARGUS_NOSORT)) {
2529 qsort ((char *) queue->array, x, sizeof (struct ArgusQueueHeader *), ArgusSortRoutine);
2530
2531 for (i = 0; i < x; i++) {
2532 struct ArgusRecordStruct *ns = (struct ArgusRecordStruct *) queue->array[i];
(gdb) thread 7
[Switching to thread 7 (Thread 0xb4906b90 (LWP 21301))]#0 0x0052f315 in free () from /lib/libc.so.6
(gdb) list
2543
2544 RaSortItems = x;
2545 bzero (&ArgusParser->ArgusStartTimeVal, sizeof(ArgusParser->ArgusStartTimeVal));
2546
2547 #if defined(ARGUS_THREADS)
2548 if (type & ARGUS_LOCK)
2549 pthread_mutex_unlock(&queue->lock);
2550 #endif
2551
2552 #ifdef ARGUSDEBUG
(gdb) thread 6
[Switching to thread 6 (Thread 0xb5307b90 (LWP 21300))]#0 0x08052827 in RaClientSortQueue (sorter=0x977e8b8, queue=0x977e690, type=2) at ./raclient.c:2523
2523 qhdr = qhdr->nxt;
(gdb) list
2518 keep = 0;
2519 }
2520
2521 if (keep)
2522 queue->array[x++] = qhdr;
2523 qhdr = qhdr->nxt;
2524 }
2525
2526 queue->array[i] = NULL;
2527
(gdb) thread 5
[Switching to thread 5 (Thread 0xb5d08b90 (LWP 21299))]#0 0x00d73424 in __kernel_vsyscall ()
(gdb) list
2528 if (!(type & ARGUS_NOSORT)) {
2529 qsort ((char *) queue->array, x, sizeof (struct ArgusQueueHeader *), ArgusSortRoutine);
2530
2531 for (i = 0; i < x; i++) {
2532 struct ArgusRecordStruct *ns = (struct ArgusRecordStruct *) queue->array[i];
2533 if (ns->rank != (i + 1)) {
2534 ns->rank = i + 1;
2535 ns->status |= ARGUS_RECORD_MODIFIED;
2536 }
2537 }
(gdb)
(gdb) thread 4
[Switching to thread 4 (Thread 0xb6709b90 (LWP 21298))]#0 0x00d73424 in __kernel_vsyscall ()
(gdb) list
2528 if (!(type & ARGUS_NOSORT)) {
2529 qsort ((char *) queue->array, x, sizeof (struct ArgusQueueHeader *), ArgusSortRoutine);
2530
2531 for (i = 0; i < x; i++) {
2532 struct ArgusRecordStruct *ns = (struct ArgusRecordStruct *) queue->array[i];
2533 if (ns->rank != (i + 1)) {
2534 ns->rank = i + 1;
2535 ns->status |= ARGUS_RECORD_MODIFIED;
2536 }
2537 }
(gdb) thread 3
[Switching to thread 3 (Thread 0xb720bb90 (LWP 21297))]#0 0x00d73424 in __kernel_vsyscall ()
(gdb) list
2538 }
2539
2540 } else
2541 ArgusLog (LOG_ERR, "RaClientSortQueue: ArgusMalloc(%d) %s\n", sizeof(struct ArgusRecord *), cnt, strerror(errno));
2542 }
2543
2544 RaSortItems = x;
2545 bzero (&ArgusParser->ArgusStartTimeVal, sizeof(ArgusParser->ArgusStartTimeVal));
2546
2547 #if defined(ARGUS_THREADS)
(gdb) thread 1
(gdb) list
2548 if (type & ARGUS_LOCK)
2549 pthread_mutex_unlock(&queue->lock);
2550 #endif
2551
2552 #ifdef ARGUSDEBUG
2553 ArgusDebug (5, "RaClientSortQueue(0x%x, 0x%x, %d) returned\n", sorter, queue, type);
2554 #endif
2555 }
2556
(gdb)
From: Carter Bullard [mailto:carter at qosient.com]
Sent: Monday, October 06, 2014 12:02 PM
To: David Edelman
Cc: Argus
Subject: Re: [ARGUS] rasqlinsert disconnect issues
so thread 21287 complains about the mysql error, and thread 21300 is trying to sort the queue. I suspect that 21287 is exiting or has exited, and 21300 is unaware that the sql 'backend' has raised the done flag.
can you type 'where' so we can see which thread is dumping ??
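The hypothesis above, that one thread shuts down while another keeps working on shared data, can be sketched as a done flag that is set and checked under the same mutex. This is only an illustration under assumed names (`struct shared`, `backend_mark_done`, `worker_pass`); it does not reproduce how rasqlinsert signals completion.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state between a backend thread and a worker thread. */
struct shared {
    pthread_mutex_t lock;
    bool done;   /* set by the backend when it is shutting down */
    int items;   /* stand-in for the queue contents */
};

/* Backend side: announce shutdown before tearing anything down. */
static void backend_mark_done(struct shared *s)
{
    pthread_mutex_lock(&s->lock);
    s->done = true;
    pthread_mutex_unlock(&s->lock);
}

/* Worker side: one pass over the shared data. Returns false, without
 * touching the data, once the backend has raised the done flag. */
static bool worker_pass(struct shared *s, int *seen)
{
    bool ok;

    pthread_mutex_lock(&s->lock);
    ok = !s->done;
    if (ok)
        *seen = s->items;
    pthread_mutex_unlock(&s->lock);

    return ok;
}
```

Because the flag and the data share one lock, the worker either sees the data intact or sees the flag. It never reads the data after the backend has started to release it, provided the backend frees nothing until after `backend_mark_done()` returns.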
On Oct 4, 2014, at 8:27 PM, David Edelman <dedelman at iname.com> wrote:
GDB to the rescue. I have the session hanging out in a screen instance; if you need anything specific, just let me know.
—Dave
(gdb) run -M time 1d -M cache -S localhost:561 -w mysql://argus@localhost/argus/test2macAddrs_%Y_%m_%d -m srcid saddr smac -s stime ltime srcid saddr smac -M rmon - ipv4
Starting program: /layered_products/argus-clients-3.0.8/bin/rasqlinsert -M time 1d -M cache -S localhost:561 -w mysql://argus@localhost/argus/test2macAddrs_%Y_%m_%d -m srcid saddr smac -s stime ltime srcid saddr smac -M rmon - ipv4
[Thread debugging using libthread_db enabled]
[New Thread 0xb7fe68d0 (LWP 21287)]
Detaching after fork from child process 21288.
[New Thread 0xb720bb90 (LWP 21289)]
[Thread 0xb720bb90 (LWP 21289) exited]
[New Thread 0xb720bb90 (LWP 21297)]
[New Thread 0xb6709b90 (LWP 21298)]
[New Thread 0xb5d08b90 (LWP 21299)]
[New Thread 0xb5307b90 (LWP 21300)]
[New Thread 0xb4906b90 (LWP 21301)]
[New Thread 0xb3f05b90 (LWP 21302)]
[New Thread 0xb3504b90 (LWP 21303)]
[Thread 0xb3504b90 (LWP 21303) exited]
rasqlinsert[21287]: Sun 2014-10-05 00:00:14.081 mysql_real_query error Query was empty
rasqlinsert[21287]: Sun 2014-10-05 00:00:50.374 mysql_real_query error Query was empty
rasqlinsert[21287]: Sun 2014-10-05 00:00:53.109 mysql_real_query error Query was empty
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb5307b90 (LWP 21300)]
0x08052827 in RaClientSortQueue (sorter=0x977e8b8, queue=0x977e690, type=2) at ./raclient.c:2523
2523 qhdr = qhdr->nxt;
Missing separate debuginfos, use: debuginfo-install libgcc-4.3.2-7.i386
(gdb) list
2518 keep = 0;
2519 }
2520
2521 if (keep)
2522 queue->array[x++] = qhdr;
2523 qhdr = qhdr->nxt;
2524 }
2525
2526 queue->array[i] = NULL;
2527
(gdb)
From: David Edelman <dedelman at iname.com>
Date: Thursday, October 2, 2014 at 4:50 PM
To: Carter Bullard <carter at qosient.com>
Cc: Argus <argus-info at lists.andrew.cmu.edu>
Subject: Re: [ARGUS] rasqlinsert disconnect issues
Okay, I'll try that.
On Oct 2, 2014, at 00:34, Carter Bullard <carter at qosient.com> wrote:
D3 should print the sql calls, which may be all that you need to see.
Carter
On Oct 1, 2014, at 9:46 PM, David Edelman <dedelman at iname.com> wrote:
Carter,
I’m running rasqlinsert using -S localhost:561 to process the output of radium, which is doing labeling. I’ll fire up an instance of the release code with the same parameters but a different table name and see if there is anything obvious. I’ve built the release code with both .debug and .devel. Do you have a recommendation for a debug value?
—Dave
From: Carter Bullard <carter at qosient.com>
Date: Monday, September 29, 2014 at 1:51 PM
To: David Edelman <dedelman at iname.com>
Cc: "John T. Myers" <myersj0 at gmail.com>, Argus <argus-info at lists.andrew.cmu.edu>
Subject: Re: [ARGUS] rasqlinsert disconnect issues
Hey Dave,
Well, that’s not what we’re striving for, so
if we can capture what that is all about, I’ll
fix as soon as I can.
Carter
On Sep 28, 2014, at 5:47 PM, David Edelman <dedelman at iname.com> wrote:
The "running but not updating the database" behavior was a problem that I reported and you fixed in one of the very last release candidates. As best I can tell, the current 3.0.8 does not have that problem, but it does stop after somewhere between 8 and 20 hours with no log messages that I have been able to find.
--Dave
From: Carter Bullard [mailto:carter at qosient.com]
Sent: Sunday, September 28, 2014 11:34 AM
To: David Edelman
Cc: John T. Myers; Argus
Subject: Re: [ARGUS] rasqlinsert disconnect issues
Well the earlier rasqlinsert.1 code does have a problem that the new one tried
to fix. I’ll try to see what may be up with mysql_real_query() error messages,
and see if we missed something there.
But in the OP, the problem is that rasqlinsert.1 is running but not updating
the database; in this second report, rasqlinsert.1 is failing ??
Carter
On Sep 25, 2014, at 8:11 PM, David Edelman <dedelman at iname.com> wrote:
Carter,
I am seeing something similar with Netflow data processed by Radium (labels added) and rasqlinsert reading the processed radium data from port 561. In my case the rasqlinsert process dies without any error messages (it is built with .debug and .devel). It isn’t practical for me to enable debugger output since the failure can be many hours into the run.
The one clue that I do have is that the release candidate set (3.0.8-rc1) worked fine.
I haven’t had time to do much more than go back to the working release.
--Dave
From: argus-info-bounces+dedelman=iname.com at lists.andrew.cmu.edu [mailto:argus-info-bounces+dedelman=iname.com at lists.andrew.cmu.edu] On Behalf Of Carter Bullard
Sent: Thursday, September 25, 2014 1:34 PM
To: John T. Myers
Cc: Argus
Subject: Re: [ARGUS] rasqlinsert disconnect issues
Hey John,
rasqlinsert() is multi-threaded, and it's possible that
the cache concurrency thread, the one that is managing the
database cache, exited if the database failed.
rasqlinsert() should close, if that thread is done.
So you are seeing rasqlinsert() is still running, but
not updating the database ?? Is rasqlinsert() getting
bigger ??? (should be generating INSERT and UPDATE
requests, but the DB thread is not processing them ???)
Carter
On Sep 25, 2014, at 10:35 AM, John T. Myers <myersj0 at gmail.com> wrote:
Hello,
I was wondering if it’s normal behavior for rasqlinsert to cease inserting netflow into the database if the connection becomes interrupted? It seems if the mysql database is restarted or any part of the connection between rasqlinsert and the db is broken, it will not attempt to re-connect and continue flow insertion.
Thanks!
John
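For reference, the reconnect behavior asked about here usually takes the form of a bounded retry loop with a growing delay. The sketch below shows that general pattern only and says nothing about what rasqlinsert does. All names are hypothetical, and the connect and wait steps are passed in as callbacks so the loop does not depend on any database library.

```c
#include <stdbool.h>

/* Hypothetical callback types: the caller supplies the real work. */
typedef bool (*connect_fn)(void *ctx);       /* true when connected */
typedef void (*wait_fn)(unsigned seconds);   /* pause between attempts */

/* Try to connect up to max_attempts times, doubling the delay after each
 * failure up to max_delay_s. Returns the attempt number that succeeded,
 * or 0 if every attempt failed. */
static int reconnect_with_backoff(connect_fn try_connect, void *ctx,
                                  int max_attempts, unsigned delay_s,
                                  unsigned max_delay_s, wait_fn wait)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_connect(ctx))
            return attempt;
        if (attempt < max_attempts) {
            wait(delay_s);
            /* Double the delay without overflowing past the cap. */
            delay_s = (delay_s > max_delay_s / 2) ? max_delay_s : delay_s * 2;
        }
    }
    return 0;
}

/* Demo stubs: a connection that succeeds on the third call, and a wait
 * that returns immediately. */
static bool succeed_on_third(void *ctx)
{
    int *calls = ctx;
    return ++*calls >= 3;
}

static void no_wait(unsigned seconds)
{
    (void)seconds;
}
```

In a real client, `try_connect` would wrap the database library's connect call and `wait` would wrap `sleep()`. Rows generated while the connection is down still have to be buffered or dropped, and this sketch does not address that.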