segfault at 000000000311c000 rip 000000000040fb46 rsp 0000007fbffff830 error 4
Peter Van Epp
vanepp at sfu.ca
Mon May 11 23:07:04 EDT 2009
I'm not sure we shouldn't wait for a core from the recompiled argus
with the -g flag. If (as I think is true) the core is from an argus built
without -g and is being examined in gdb against code recompiled with -g,
things are likely too screwed up to draw conclusions about what's wrong yet.
I think the -g flag shuts off a bunch of optimizations and thus substantially
changes the code, which will make gdb's interpretation of the stack suspect
at this point.
Peter Van Epp
On Mon, May 11, 2009 at 07:24:46PM -0400, Carter Bullard wrote:
> Hey Gunnar,
> The way you turn on the "-g" option is to do this in the main
> distribution directory:
> % touch .devel
> % ./configure;make clean;make
>
> That will compile everything with the appropriate flags.
>
> Well, l2 really looks screwed up. What kind of machine is this 64-bit
> thing? I think we're having alignment problems, possibly. Does it run
> for any amount of time before it blows up?
>
> Carter
>
> On May 11, 2009, at 5:10 PM, Gunnar Lindberg wrote:
>
>> First of all, I re-ran make with "-g" and used that with the existing
>> core file; I can't tell whether that should be OK, but as far as I can
>> see it still makes some kind of sense.
>>
>>
>> [lindberg at argv ~]$ gdb argus-g core.14369
>> (gdb) where
>> #0 0x0000000000410bc2 in ArgusLoadList (l1=0x651460, l2=0x6540a0)
>>    at ArgusUtil.c:260
>> #1 0x000000000041557b in ArgusOutputProcess (arg=Variable "arg" is not available.
>> ) at ArgusOutput.c:477
>> #2 0x000000000040bb6c in ArgusProcessPacket (src=Variable "src" is not available.
>> ) at ArgusModeler.c:1324
>> #3 0x000000000040d006 in ArgusEtherPacket (user=0x2a95786010 "", h=Variable "h" is not available.
>> ) at ArgusSource.c:716
>> #4 0x00000034e2f04bff in ?? () from /usr/lib64/libpcap.so.0.8.3
>> #5 0x0000000000410759 in ArgusGetPackets (src=0x2a95786010)
>>    at ArgusSource.c:2093
>> #6 0x0000000000404f83 in main (argc=1, argv=0x7fbffffe08) at argus.c:535
>>
>>
>> (gdb) print *l1
>> $3 = {start = 0x18c21f0, end = 0x17e98b0, count = 589, pushed = 3044164,
>> popped = 0, loaded = 3043575, outputTime = {tv_sec = 0, tv_usec = 0},
>> reportTime = {tv_sec = 0, tv_usec = 0}}
>>
>> (gdb) print *l2
>> $5 = {start = 0x85d278d99c8b81d1, end = 0x63caa47a16f1492e,
>> count = 1579875320, pushed = 1880390013, popped = 2415426777,
>> loaded = 3722138485, outputTime = {tv_sec = 144115210689264557,
>> tv_usec = 14930315638210660}, reportTime = {tv_sec = -7084847654803835648,
>> tv_usec = -5817086215248780719}}
>>
>> argus/ArgusUtil.c
>> 246 void
>> 247 ArgusLoadList(struct ArgusListStruct *l1, struct ArgusListStruct *l2)
>> 248 {
>> 249    if (l1 && l2) {
>> 250       int count;
>> 251 #if defined(ARGUS_THREADS)
>> 252       pthread_mutex_lock(&l1->lock);
>> 253       pthread_mutex_lock(&l2->lock);
>> 254 #endif
>> 255       count = l1->count;
>> 256
>> 257       if (l2->start == NULL)
>> 258          l2->start = l1->start;
>> 259       else
>> 260          l2->end->nxt = l1->start;
>> 261
>> 262       l2->end = l1->end;
>> 263       l2->count += count;
>> 264
>> 265       l1->start = NULL;
>> 266       l1->end = NULL;
>> 267       l1->loaded += count;
>> 268       l1->count = 0;
>> 269
>> 270 #if defined(ARGUS_THREADS)
>> 271       pthread_mutex_unlock(&l2->lock);
>> 272       pthread_mutex_unlock(&l1->lock);
>> 273 #endif
>> 274
>> 275 #ifdef ARGUSDEBUG
>> 276       ArgusDebug (5, "ArgusLoadList (0x%x, 0x%x) load %d objects\n", l1, l2 , count);
>> 277 #endif
>> 278    }
>> 279 }
>>
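A note on the listing above, as a minimal sketch rather than the Argus code: the structs and function names below (RecordSketch, ListSketch, list_load, list_is_consistent) are hypothetical stand-ins, and the assumption that "nxt" is the first member of the record comes only from the faulting store in the disassembly later in this thread. The splice follows lines 249-268 with the locking and debug output left out. It shows that the test on l2->start does nothing to protect the dereference of l2->end, so a stale or overwritten l2->end faults exactly at line 260.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type; only the leading "nxt" link matters here. */
struct RecordSketch {
    struct RecordSketch *nxt;
    int value;
};

/* Hypothetical list header, reduced to the fields the listing touches. */
struct ListSketch {
    struct RecordSketch *start;
    struct RecordSketch *end;
    unsigned int count;
    unsigned int loaded;
};

/* Cheap invariant check (an illustration, not part of Argus):
 * an empty list has no end pointer and a zero count; a non-empty list
 * has an end pointer whose record terminates the chain. */
int list_is_consistent(const struct ListSketch *l)
{
    if (l->start == NULL)
        return l->end == NULL && l->count == 0;
    return l->end != NULL && l->end->nxt == NULL && l->count > 0;
}

/* Same steps as the quoted listing, minus locking and debug output. */
void list_load(struct ListSketch *l1, struct ListSketch *l2)
{
    if (l1 && l2) {
        unsigned int count = l1->count;

        if (l2->start == NULL)
            l2->start = l1->start;
        else
            l2->end->nxt = l1->start;   /* trusts l2->end blindly */

        l2->end = l1->end;              /* becomes NULL if l1 was empty */
        l2->count += count;

        l1->start = NULL;
        l1->end = NULL;
        l1->loaded += count;
        l1->count = 0;
    }
}
```

In this sketch, loading an empty l1 into a non-empty l2 leaves l2->start set but l2->end NULL, which list_is_consistent() reports. Whether the real callers can ever pass an empty l1 is not something this thread shows, and the fully garbled *l2 in the gdb output looks more like a bad pointer or overwritten memory than this case.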
>>
>> Gunnar Lindberg
>>
>>
>>> From SRS0=BzD3OK=BH=qosient.com=carter at srs.bis.na.blackberry.com
>>> Mon May 11 13:18:08 2009
>>> Message-ID:
>>> <2044323243-1242040666-cardhu_decombobulator_blackberry.rim.net-2042564372- at bxe1165.bisx.prod.on.blackberry
>>> >
>>> Reply-To: carter at qosient.com
>>> References:
>>> <E5F8710F-522D-4579-8569-A9DD5E130A06 at qosient.com><200905110551.n4B5pV62007936 at grunert.cdg.chalmers.se
>>> >
>>> In-Reply-To: <200905110551.n4B5pV62007936 at grunert.cdg.chalmers.se>
>>> Subject: Re: [ARGUS] segfault at 000000000311c000 rip
>>> 000000000040fb46 rsp 0000007fbffff830 error 4
>>> To: "Gunnar Lindberg" <Gunnar.Lindberg at chalmers.se>,
>>> argus-info-bounces+carter=qosient.com at lists.andrew.cmu.edu,
>>> "Argus" <argus-info at lists.andrew.cmu.edu>
>>> From: carter at qosient.com
>>> Date: Mon, 11 May 2009 11:19:44 +0000
>>
>>> Hey Gunnar,
>>> The C level debugging in gdb() is very good, and gives you quick
>>> access to the symbols and stack info.
>>>
>>> I have never seen problems with ArgusLoadList(), so if you have a
>>> core file, could you load it into gdb() and type:
>>>
>>> (gdb) where
>>> (gdb) print *l1    (assuming it's in ArgusLoadList)
>>> (gdb) print *l2
>>>
>>> If not, could you run it under gdb() until it stops, and type the
>>> same? That would give me a good start.
>>>
>>> Carter
>>>
>>> -----Original Message-----
>>> From: Gunnar Lindberg <Gunnar.Lindberg at chalmers.se>
>>>
>>> Date: Mon, 11 May 2009 07:51:31
>>> To: <argus-info at lists.andrew.cmu.edu>
>>> Subject: Re: [ARGUS] segfault at 000000000311c000 rip
>>> 000000000040fb46
>>> rsp 0000007fbffff830 error 4
>>>
>>>
>>> No .threads in argus-3.0.1.beta.3
>>>
>>> My gdb knowledge is limited, but I did quite a lot of C/machine code
>>> debugging in my early days (25 years ago, on the MC68000, I'd probably
>>> have been able to write the C code from the optimized assembler :-).
>>> But this is *86 - "same, same, but different"...
>>>
>>> Based on that I did the "disass" trick, and <<<=== indicates the
>>> machine code where the crash occurred. What beats me on *86 is
>>> which register is used for which C variable, but there seems to
>>> have been an offset "0x8(%rsi),%r9" involved just before - that
>>> was variables in a C struct on MC68000 and I guess it still is.
>>>
>>> So we picked up something 8 bytes into a C struct and then tried
>>> to use it as a pointer "%r10,(%r9)" - and pooof.
>>>
>>> The most probable thing is that data/pointers got screwed up minutes
>>> ago and then the bomb goes off now because we just got to that data.
>>> However, before going through the linked list of data I'd like to ask
>>> about a line of C code:
>>>
>>> argus/ArgusUtil.c:
>>>
>>> void
>>> ArgusLoadList(struct ArgusListStruct *l1, struct ArgusListStruct *l2)
>>> {
>>>    ...
>>>    if (l2->start == NULL)
>>>       l2->start = l1->start;
>>>    else
>>>       l2->end->nxt = l1->start;   <=
>>>    ...
>>> }
>>>
>>> The only "nxt" I find is within a "struct ArgusListRecord",
>>> but "l2" and "l2->end" point at a "struct ArgusListStruct".
>>> Could this be it?
>>>
>>> Or is there some condition where l2->end is not correctly set?
>>>
>>> Gunnar Lindberg
>>>
>>> May 7 16:33:30 argv kernel: argus[14369] general protection
>>> rip:410bc2 rsp:7fbffff308 error:0
>>>
>>> gdb argus.14369 /core.14369
>>> ...
>>> #0 0x0000000000410bc2 in ArgusLoadList ()
>>> (gdb) where
>>> #0 0x0000000000410bc2 in ArgusLoadList ()
>>> #1 0x000000000041557b in ArgusOutputProcess ()
>>> #2 0x000000000040bb6c in ArgusProcessPacket ()
>>> #3 0x000000000040d006 in ArgusEtherPacket ()
>>> #4 0x00000034e2f04bff in ?? () from /usr/lib64/libpcap.so.0.8.3
>>> #5 0x0000000000410759 in ArgusGetPackets ()
>>> #6 0x0000000000404f83 in main ()
>>> (gdb) disass 0x0000000000410bc2
>>> Dump of assembler code for function ArgusLoadList:
>>> 0x0000000000410ba0 <ArgusLoadList+0>: test %rdi,%rdi
>>> 0x0000000000410ba3 <ArgusLoadList+3>: setne %dl
>>> 0x0000000000410ba6 <ArgusLoadList+6>: xor %eax,%eax
>>> 0x0000000000410ba8 <ArgusLoadList+8>: test %rsi,%rsi
>>> 0x0000000000410bab <ArgusLoadList+11>: setne %al
>>> 0x0000000000410bae <ArgusLoadList+14>: test %eax,%edx
>>> 0x0000000000410bb0 <ArgusLoadList+16>: je 0x410be9 <ArgusLoadList+73>
>>> 0x0000000000410bb2 <ArgusLoadList+18>: cmpq $0x0,(%rsi)
>>> 0x0000000000410bb6 <ArgusLoadList+22>: mov 0x10(%rdi),%ecx
>>> 0x0000000000410bb9 <ArgusLoadList+25>: je 0x410bf0 <ArgusLoadList+80>
>>> 0x0000000000410bbb <ArgusLoadList+27>: mov 0x8(%rsi),%r9
>>> 0x0000000000410bbf <ArgusLoadList+31>: mov (%rdi),%r10
>>> 0x0000000000410bc2 <ArgusLoadList+34>: mov %r10,(%r9)   <<<===
>>> 0x0000000000410bc5 <ArgusLoadList+37>: mov 0x8(%rdi),%r11
>>> 0x0000000000410bc9 <ArgusLoadList+41>: add %ecx,0x1c(%rdi)
>>> 0x0000000000410bcc <ArgusLoadList+44>: add %ecx,0x10(%rsi)
>>> 0x0000000000410bcf <ArgusLoadList+47>: movq $0x0,(%rdi)
>>> 0x0000000000410bd6 <ArgusLoadList+54>: movl $0x0,0x10(%rdi)
>>> 0x0000000000410bdd <ArgusLoadList+61>: mov %r11,0x8(%rsi)
>>> 0x0000000000410be1 <ArgusLoadList+65>: movq $0x0,0x8(%rdi)
>>> 0x0000000000410be9 <ArgusLoadList+73>: repz retq
>>> 0x0000000000410beb <ArgusLoadList+75>: data16
>>> 0x0000000000410bec <ArgusLoadList+76>: data16
>>> 0x0000000000410bed <ArgusLoadList+77>: nop
>>> 0x0000000000410bee <ArgusLoadList+78>: data16
>>> 0x0000000000410bef <ArgusLoadList+79>: nop
>>> 0x0000000000410bf0 <ArgusLoadList+80>: mov (%rdi),%r8
>>> 0x0000000000410bf3 <ArgusLoadList+83>: mov %r8,(%rsi)
>>> 0x0000000000410bf6 <ArgusLoadList+86>: jmp 0x410bc5 <ArgusLoadList+37>
>>> 0x0000000000410bf8 <ArgusLoadList+88>: data16
>>> 0x0000000000410bf9 <ArgusLoadList+89>: data16
>>> 0x0000000000410bfa <ArgusLoadList+90>: data16
>>> 0x0000000000410bfb <ArgusLoadList+91>: nop
>>> 0x0000000000410bfc <ArgusLoadList+92>: data16
>>> 0x0000000000410bfd <ArgusLoadList+93>: data16
>>> 0x0000000000410bfe <ArgusLoadList+94>: data16
>>> 0x0000000000410bff <ArgusLoadList+95>: nop
>>> End of assembler dump.
>>> (gdb) info registers
>>> rax 0x1 1
>>> rbx 0x174f450 24441936
>>> rcx 0x24d 589
>>> rdx 0x4a02f101 1241706753
>>> rsi 0x6540a0 6635680
>>> rdi 0x651460 6624352
>>> rbp 0x6516c0 0x6516c0
>>> rsp 0x7fbffff308 0x7fbffff308
>>> r8 0x69c6d 433261
>>> r9 0x63caa47a16f1492e 7190740599328295214
>>> r10 0x18c21f0 25960944
>>> r11 0x41a1320 68817696
>>> r12 0x3 3
>>> r13 0x651738 6625080
>>> r14 0x0 0
>>> r15 0x7fbffff510 548682069264
>>> rip 0x410bc2 0x410bc2 <ArgusLoadList+34>
>>> eflags 0x10286 66182
>>> cs 0x33 51
>>> ss 0x2b 43
>>> ds 0x0 0
>>> es 0x0 0
>>> fs 0x0 0
>>> gs 0x0 0
>>>
>>>
>>>
>>>> From carter at qosient.com Thu May 7 19:00:53 2009
>>>> Cc: argus-info at lists.andrew.cmu.edu
>>>> Message-Id: <E5F8710F-522D-4579-8569-A9DD5E130A06 at qosient.com>
>>>> From: Carter Bullard <carter at qosient.com>
>>>> To: Gunnar Lindberg <Gunnar.Lindberg at chalmers.se>
>>>> In-Reply-To: <200905071507.n47F7xeB026201 at grunert.cdg.chalmers.se>
>>>> Subject: Re: [ARGUS] segfault at 000000000311c000 rip
>>>> 000000000040fb46 rsp 0000007fbffff830 error 4
>>>> Date: Thu, 7 May 2009 13:00:42 -0400
>>>> References: <200905071507.n47F7xeB026201 at grunert.cdg.chalmers.se>
>>>
>>>> Hey Gunnar,
>>>> The gdb() commands of interest are:
>>>
>>>> (gdb) where
>>>
>>>> ArgusLoadList() is the routine that passes flow record status reports
>>>> from the packet processing engine to the output processor. This
>>>> definitely shouldn't have a problem, so it will be interesting to
>>>> figure out what the problem may be.
>>>
>>>> Are you running with threads enabled? (Is there a ./.threads file in
>>>> your root directory?)
>>>
>>>> Carter
>>>
>>>
>>
>