segfault at 000000000311c000 rip 000000000040fb46rsp 0000007fbffff830 error 4

Peter Van Epp vanepp at sfu.ca
Mon May 11 23:07:04 EDT 2009


	I'm not sure we shouldn't wait for a core from the recompiled argus
with the -g flag. If (as I think is true) the core is from an argus without
-g being run under gdb with code recompiled with -g things are likely too 
screwed up to draw conclusions on whats wrong yet. I think the -g flag shuts 
off a bunch of optimizations and thus substantially changes the code which 
will make gdb's interpratation of the stack suspect at this point. 

Peter Van Epp

On Mon, May 11, 2009 at 07:24:46PM -0400, Carter Bullard wrote:
> Hey Gunnar,
> The way you turn on the "-g" option is to do this in the main  
> distribution directory:
>    % touch .devel
>    % ./configure;make clean;make
>
> That will compile everything with the appropriate flags.
>
> Well, l2 really looks screwed up.  What kind of machine is this 64-bit  
> thing?
> I think we're having alignment problems, possibly.  Does it run for any 
> amount
> of time before it blows up?
>
> Carter
>
> On May 11, 2009, at 5:10 PM, Gunnar Lindberg wrote:
>
>> First of all, I re-run make with "-g" and used that with the existing
>> core file; I can't tell whether that should be OK but as far as I can
>> see it still makes some kind of sense.
>>
>>
>> [lindberg at argv ~]$ gdb argus-g core.14369
>> (gdb) where
>> #0  0x0000000000410bc2 in ArgusLoadList (l1=0x651460, l2=0x6540a0)
>>    at ArgusUtil.c:260
>> #1  0x000000000041557b in ArgusOutputProcess (arg=Variable "arg" is  
>> not available.
>> ) at ArgusOutput.c:477
>> #2  0x000000000040bb6c in ArgusProcessPacket (src=Variable "src" is  
>> not available.
>> ) at ArgusModeler.c:1324
>> #3  0x000000000040d006 in ArgusEtherPacket (user=0x2a95786010 "",  
>> h=Variable "h" is not available.
>> )
>>    at ArgusSource.c:716
>> #4  0x00000034e2f04bff in ?? () from /usr/lib64/libpcap.so.0.8.3
>> #5  0x0000000000410759 in ArgusGetPackets (src=0x2a95786010)
>>    at ArgusSource.c:2093
>> #6  0x0000000000404f83 in main (argc=1, argv=0x7fbffffe08) at  
>> argus.c:535
>>
>>
>> (gdb) print *l1
>> $3 = {start = 0x18c21f0, end = 0x17e98b0, count = 589, pushed =  
>> 3044164,
>>  popped = 0, loaded = 3043575, outputTime = {tv_sec = 0, tv_usec = 0},
>>  reportTime = {tv_sec = 0, tv_usec = 0}}
>>
>> (gdb) print *l2
>> $5 = {start = 0x85d278d99c8b81d1, end = 0x63caa47a16f1492e,
>>  count = 1579875320, pushed = 1880390013, popped = 2415426777,
>>  loaded = 3722138485, outputTime = {tv_sec = 144115210689264557,
>>    tv_usec = 14930315638210660}, reportTime = {tv_sec =  
>> -7084847654803835648,
>>    tv_usec = -5817086215248780719}}
>>
>> argus/ArgusUtil.c
>>    246 void
>>    247 ArgusLoadList(struct ArgusListStruct *l1, struct  
>> ArgusListStruct *l2)
>>    248 {
>>    249    if (l1 && l2) {
>>    250       int count;
>>    251 #if defined(ARGUS_THREADS)
>>    252       pthread_mutex_lock(&l1->lock);
>>    253       pthread_mutex_lock(&l2->lock);
>>    254 #endif
>>    255       count = l1->count;
>>    256
>>    257       if (l2->start == NULL)
>>    258          l2->start = l1->start;
>>    259       else
>>    260          l2->end->nxt = l1->start;
>>    261
>>    262       l2->end = l1->end;
>>    263       l2->count += count;
>>    264
>>    265       l1->start = NULL;
>>    266       l1->end = NULL;
>>    267       l1->loaded += count;
>>    268       l1->count = 0;
>>    269
>>    270 #if defined(ARGUS_THREADS)
>>    271       pthread_mutex_unlock(&l2->lock);
>>    272       pthread_mutex_unlock(&l1->lock);
>>    273 #endif
>>    274
>>    275 #ifdef ARGUSDEBUG
>>    276    ArgusDebug (5, "ArgusLoadList (0x%x, 0x%x) load %d objects 
>> \n", l1, l2        , count);
>>    277 #endif
>>    278    }
>>    279 }
>>
>>
>> 		Gunnar Lindberg
>>
>>
>>> From SRS0=BzD3OK=BH=qosient.com=carter at srs.bis.na.blackberry.com   
>>> Mon May 11 13:18:08 2009
>>> Message-ID: 
>>> <2044323243-1242040666-cardhu_decombobulator_blackberry.rim.net-2042564372- at bxe1165.bisx.prod.on.blackberry 
>>> >
>>> Reply-To: carter at qosient.com
>>> References: 
>>> <E5F8710F-522D-4579-8569-A9DD5E130A06 at qosient.com><200905110551.n4B5pV62007936 at grunert.cdg.chalmers.se 
>>> >
>>> In-Reply-To: <200905110551.n4B5pV62007936 at grunert.cdg.chalmers.se>
>>> Subject: Re: [ARGUS] segfault at 000000000311c000 rip  
>>> 000000000040fb46rsp	0000007fbffff830 error 4
>>> To: "Gunnar Lindberg" <Gunnar.Lindberg at chalmers.se>,
>>>       argus-info-bounces+carter=qosient.com at lists.andrew.cmu.edu,
>>>       "Argus" <argus-info at lists.andrew.cmu.edu>
>>> From: carter at qosient.com
>>> Date: Mon, 11 May 2009 11:19:44 +0000
>>
>>> Hey Gunnar,
>>> The C level debugging in gdb() is very good, and gives you quick  
>>> access to the symbols and stack info.
>>>
>>> I have never seen problems with ArgusLoadList(), so if you have a  
>>> core file, if you could load it into gdb() and type:
>>>
>>> (gdb) where
>>> (gdb) print *l1. (assuming its in AtgusLoadList)
>>> (gdb) print *l2
>>>
>>> If not, if you could run it under gdb() until it stops, and type the 
>>> same, that would give me a good start.
>>>
>>> Carter
>>>
>>> Sent from my Verizon Wireless BlackBerry
>>>
>>> -----Original Message-----
>>> From: Gunnar Lindberg <Gunnar.Lindberg at chalmers.se>
>>>
>>> Date: Mon, 11 May 2009 07:51:31
>>> To: <argus-info at lists.andrew.cmu.edu>
>>> Subject: Re: [ARGUS] segfault at 000000000311c000 rip  
>>> 000000000040fb46
>>> 	rsp	0000007fbffff830 error 4
>>>
>>>
>>> No .threads in argus-3.0.1.beta.3
>>>
>>> My gdb knowledge is limited but I've done quite some amount of
>>> C/machine code debugging in my early days (25 years ago and MC68000
>>> I'd probably been able to write the C code from the optimized
>>> assembler :-). But, this is *86 - "same, same, but different"...
>>>
>>> Based on that I did the "disass" trick and <<<=== indicates the
>>> machine code where the crash occured. What beats me on *86 is
>>> which register is used for which C variable, but there seems to
>>> have been an offset "0x8(%rsi),%r9" involved just before - that
>>> was variables in a C struct on MC68000 and I guess it still is.
>>>
>>> So we picked up something 8 bytes into a C struct and than tried
>>> to us it as a pointer "%r10,(%r9)" - and pooof.
>>>
>>> The most probable thing is that data/pointers got screwed up minutes
>>> ago and then the bomb goes off now because we just got to that data.
>>> However, before going through the linked list of data I'd like to ask
>>> about a line of C code:
>>>
>>> argus/ArgusUtil.c:
>>>
>>> void
>>> ArgusLoadList(struct ArgusListStruct *l1, struct ArgusListStruct *l2)
>>> {
>>> ...
>>>     if (l2->start == NULL)
>>>        l2->start = l1->start;
>>>     else
>>>        l2->end->nxt = l1->start;		<=
>>> ...
>>> }
>>>
>>> The only "nxt" I find is within a "struct ArgusListRecord",
>>> but "l2" and "l2->end" points at a "struct ArgusListStruct".
>>> Could this be it?
>>>
>>> Or is there some condition where l2->end is not correctly set?
>>>
>>> 	Gunnar Lindberg
>>>
>>> May  7 16:33:30 argv kernel: argus[14369] general protection
>>> rip:410bc2 rsp:7fbffff308 error:0
>>>
>>> gdb argus.14369 /core.14369
>>> ...
>>> #0  0x0000000000410bc2 in ArgusLoadList ()
>>> (gdb) where
>>> #0  0x0000000000410bc2 in ArgusLoadList ()
>>> #1  0x000000000041557b in ArgusOutputProcess ()
>>> #2  0x000000000040bb6c in ArgusProcessPacket ()
>>> #3  0x000000000040d006 in ArgusEtherPacket ()
>>> #4  0x00000034e2f04bff in ?? () from /usr/lib64/libpcap.so.0.8.3
>>> #5  0x0000000000410759 in ArgusGetPackets ()
>>> #6  0x0000000000404f83 in main ()
>>> (gdb) disass 0x0000000000410bc2
>>> Dump of assembler code for function ArgusLoadList:
>>> 0x0000000000410ba0 <ArgusLoadList+0>:   test   %rdi,%rdi
>>> 0x0000000000410ba3 <ArgusLoadList+3>:   setne  %dl
>>> 0x0000000000410ba6 <ArgusLoadList+6>:   xor    %eax,%eax
>>> 0x0000000000410ba8 <ArgusLoadList+8>:   test   %rsi,%rsi
>>> 0x0000000000410bab <ArgusLoadList+11>:  setne  %al
>>> 0x0000000000410bae <ArgusLoadList+14>:  test   %eax,%edx
>>> 0x0000000000410bb0 <ArgusLoadList+16>:  je     0x410be9  
>>> <ArgusLoadList+73>
>>> 0x0000000000410bb2 <ArgusLoadList+18>:  cmpq   $0x0,(%rsi)
>>> 0x0000000000410bb6 <ArgusLoadList+22>:  mov    0x10(%rdi),%ecx
>>> 0x0000000000410bb9 <ArgusLoadList+25>:  je     0x410bf0  
>>> <ArgusLoadList+80>
>>> 0x0000000000410bbb <ArgusLoadList+27>:  mov    0x8(%rsi),%r9
>>> 0x0000000000410bbf <ArgusLoadList+31>:  mov    (%rdi),%r10
>>> 0x0000000000410bc2 <ArgusLoadList+34>:  mov    %r10,(%r9)        
>>> <<<===
>>> 0x0000000000410bc5 <ArgusLoadList+37>:  mov    0x8(%rdi),%r11
>>> 0x0000000000410bc9 <ArgusLoadList+41>:  add    %ecx,0x1c(%rdi)
>>> 0x0000000000410bcc <ArgusLoadList+44>:  add    %ecx,0x10(%rsi)
>>> 0x0000000000410bcf <ArgusLoadList+47>:  movq   $0x0,(%rdi)
>>> 0x0000000000410bd6 <ArgusLoadList+54>:  movl   $0x0,0x10(%rdi)
>>> 0x0000000000410bdd <ArgusLoadList+61>:  mov    %r11,0x8(%rsi)
>>> 0x0000000000410be1 <ArgusLoadList+65>:  movq   $0x0,0x8(%rdi)
>>> 0x0000000000410be9 <ArgusLoadList+73>:  repz retq
>>> 0x0000000000410beb <ArgusLoadList+75>:  data16
>>> 0x0000000000410bec <ArgusLoadList+76>:  data16
>>> 0x0000000000410bed <ArgusLoadList+77>:  nop
>>> 0x0000000000410bee <ArgusLoadList+78>:  data16
>>> 0x0000000000410bef <ArgusLoadList+79>:  nop
>>> 0x0000000000410bf0 <ArgusLoadList+80>:  mov    (%rdi),%r8
>>> 0x0000000000410bf3 <ArgusLoadList+83>:  mov    %r8,(%rsi)
>>> 0x0000000000410bf6 <ArgusLoadList+86>:  jmp    0x410bc5  
>>> <ArgusLoadList+37>
>>> 0x0000000000410bf8 <ArgusLoadList+88>:  data16
>>> 0x0000000000410bf9 <ArgusLoadList+89>:  data16
>>> 0x0000000000410bfa <ArgusLoadList+90>:  data16
>>> 0x0000000000410bfb <ArgusLoadList+91>:  nop
>>> 0x0000000000410bfc <ArgusLoadList+92>:  data16
>>> 0x0000000000410bfd <ArgusLoadList+93>:  data16
>>> 0x0000000000410bfe <ArgusLoadList+94>:  data16
>>> 0x0000000000410bff <ArgusLoadList+95>:  nop
>>> End of assembler dump.
>>> (gdb) info registers
>>> rax            0x1      1
>>> rbx            0x174f450        24441936
>>> rcx            0x24d    589
>>> rdx            0x4a02f101       1241706753
>>> rsi            0x6540a0 6635680
>>> rdi            0x651460 6624352
>>> rbp            0x6516c0 0x6516c0
>>> rsp            0x7fbffff308     0x7fbffff308
>>> r8             0x69c6d  433261
>>> r9             0x63caa47a16f1492e       7190740599328295214
>>> r10            0x18c21f0        25960944
>>> r11            0x41a1320        68817696
>>> r12            0x3      3
>>> r13            0x651738 6625080
>>> r14            0x0      0
>>> r15            0x7fbffff510     548682069264
>>> rip            0x410bc2 0x410bc2 <ArgusLoadList+34>
>>> eflags         0x10286  66182
>>> cs             0x33     51
>>> ss             0x2b     43
>>> ds             0x0      0
>>> es             0x0      0
>>> fs             0x0      0
>>> gs             0x0      0
>>>
>>>
>>>
>>>> From carter at qosient.com  Thu May  7 19:00:53 2009
>>>> Cc: argus-info at lists.andrew.cmu.edu
>>>> Message-Id: <E5F8710F-522D-4579-8569-A9DD5E130A06 at qosient.com>
>>>> From: Carter Bullard <carter at qosient.com>
>>>> To: Gunnar Lindberg <Gunnar.Lindberg at chalmers.se>
>>>> In-Reply-To: <200905071507.n47F7xeB026201 at grunert.cdg.chalmers.se>
>>>> Subject: Re: [ARGUS] segfault at 000000000311c000 rip  
>>>> 000000000040fb46 rsp	0000007fbffff830 error 4
>>>> Date: Thu, 7 May 2009 13:00:42 -0400
>>>> References: <200905071507.n47F7xeB026201 at grunert.cdg.chalmers.se>
>>>
>>>> Hey Gunnar,
>>>> The gdb() commands of interest are:
>>>
>>>>   (gdb) where
>>>
>>>> ArgusLoadList() is the routine that passes flow record status  
>>>> reports
>>>> from the
>>>> packet processing engine to the output processor.  This definitely
>>>> shouldn't
>>>> have a problem, so it will be interesting to figure out what the
>>>> problem maybe.
>>>
>>>> Are you running with threads enabled?  (is there a ./.threads file  
>>>> in
>>>> your root directory?)
>>>
>>>> Carter
>>>
>>>
>>
>





More information about the argus mailing list