segfault at 000000000311c000 rip 000000000040fb46 rsp 0000007fbffff830 error 4

Gunnar Lindberg Gunnar.Lindberg at chalmers.se
Wed Jul 1 09:59:04 EDT 2009


The idea that this may be strings is interesting, so just like
Carter I took out my old ASCII chart - but it didn't tell me much.
And the next crash, which occurred yesterday afternoon, made me go
for the 8-bit 8859 chart - i.e. I think it's more random data.

{start = 0x2029706f742e656c, end = 0x702e73696874202b,
{start = 0xb9d2bcfac6fa3f4c, end = 0x3d078cfe490497a0, 
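
For anyone who wants to repeat the chart lookup mechanically, here is a
minimal sketch that unpacks each 64-bit value into the eight bytes it
would occupy in memory and prints the printable ASCII ones. It assumes a
little-endian x86-64 host (the /lib64 paths in the backtrace suggest
that, but it is an assumption), and the helper names are made up for
this illustration.

```python
import struct


def bytes_of(value: int) -> bytes:
    """Return the 8 bytes of a 64-bit value in little-endian memory order."""
    return struct.pack("<Q", value)


def as_text(value: int) -> str:
    """Show printable ASCII bytes as characters and everything else as '.'."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in bytes_of(value))


def printable_count(value: int) -> int:
    """Count how many of the 8 bytes are printable ASCII."""
    return sum(0x20 <= b < 0x7F for b in bytes_of(value))


if __name__ == "__main__":
    samples = [
        ("earlier start", 0x2029706F742E656C),
        ("earlier end", 0x702E73696874202B),
        ("latest start", 0xB9D2BCFAC6FA3F4C),
        ("latest end", 0x3D078CFE490497A0),
    ]
    for name, value in samples:
        print(f"{name}: {as_text(value)!r} printable={printable_count(value)}/8")
```

Read this way, the first pair comes out as "le.top) " followed by
"+ this.p", i.e. eight printable bytes each, while the second pair has
only two printable bytes per value. That would fit text having been
written over the list header in one crash and something less obviously
textual in the other, but it is only a hint, not a diagnosis.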


>Hey Gunnar, any chance you can use valgrind() to see if we're doing
>something wrong with memory?

# ll -o .devel .threads
/bin/ls: .threads: No such file or directory
-rw-rw-r--  1 root 0 May 15 07:13 .devel

I assume you mean something like
  /usr/bin/valgrind [-???] /usr/local/sbin/argus >& /var/log/xxx.log &
and then we see what's in xxx.log after a crash. Right? We have some
kind of cron-based watchdog and I guess we leave that as is, so that
we get xxx.log once and are back to ordinary business afterwards.

Mixed news, good and bad... :-).

1) Sweden has longer holidays than the US and I'm looking forward
   to my 5 weeks, starting on Mon Jul 6; back Mon Aug 10. And, for
   once I'm going to stay away completely, not even email :-), so
   I'll resist the tempting "let's just set it up, only once...".

2) Since I'm not at all familiar with valgrind I would appreciate
   some advice on "-???".

So, in mid-August I guess we can do a "valgrind thing".
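
As a starting point for the "-???" question, here is a hedged sketch of
a Memcheck invocation, written as an argument list so that a watchdog
script could launch it. The options shown (--tool, --leak-check,
--track-origins, --num-callers, --log-file) are standard Valgrind
options, but whether the installed version supports all of them is an
assumption (--track-origins needs Valgrind 3.4.0 or later). The function
name is hypothetical, and the paths are the ones from the command line
above.

```python
import shlex


def valgrind_command(program: str, log_file: str) -> list:
    """Build a Memcheck command line; nothing is executed here."""
    return [
        "valgrind",
        "--tool=memcheck",        # the memory error detector
        "--leak-check=full",      # report leaks in detail at exit
        "--track-origins=yes",    # say where uninitialised values came from
        "--num-callers=30",       # deeper stack traces in each report
        f"--log-file={log_file}", # write reports to a file, not stderr
        program,
    ]


if __name__ == "__main__":
    cmd = valgrind_command("/usr/local/sbin/argus", "/var/log/xxx.log")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Note that a program runs much slower under Memcheck, so on a busy link
argus may well drop packets while being watched this way.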

Possibly that could be combined with an idea we have to capture
that last batch of data - I'm reluctant to write raw capture
data to a file, but I think we can save what was actually written
up to just before the crash; it just needs some watchdog adjustment.

And, of course, we must stay open to the possibility that what we
have is just a chunk of bad memory. Since argus crashes on both
machines I consider a memory fault unlikely, but both are the same
age, so it's not entirely impossible. What would be the best memory test?

	Gunnar Lindberg

Latest
-rwxrwxr-x  1 root 829739 Jun  1 07:47 argus
-rw-r--r--  1 root 70807552 Jun 30 14:51 core.1584
argc# gdb argus.1584 core.1584
#0  0x0000003fabc705f2 in strcmp () from /lib64/tls/libc.so.6
#1  0x0000003fabc81d50 in __tzstring () from /lib64/tls/libc.so.6
#2  0x0000003fabc83b43 in __tzfile_compute () from /lib64/tls/libc.so.6
#3  0x0000003fabc82c8b in __tz_convert () from /lib64/tls/libc.so.6
#4  0x0000003fabcc5abe in vsyslog () from /lib64/tls/libc.so.6
#5  0x0000003fabcc6066 in syslog () from /lib64/tls/libc.so.6
#6  0x000000000041569c in ArgusLoadList (l1=0x659460, l2=0x65c0a0)
    at ArgusUtil.c:273
#7  0x000000000041a439 in ArgusOutputProcess (arg=0x6596c0)
    at ArgusOutput.c:477
#8  0x0000000000408339 in ArgusProcessPacket (src=0x2a95786010, p=0x65bb12 "", 
    length=105, tvp=0x7fbffff4b0, type=0) at ArgusModeler.c:1324
#9  0x00000000004107db in ArgusEtherPacket (user=0x2a95786010 "", 
    h=0x7fbffff530, p=0x65bb12 "") at ArgusSource.c:716
#10 0x0000003fac904bff in ?? () from /usr/lib64/libpcap.so.0.8.3
#11 0x0000000000413cd2 in ArgusGetPackets (src=0x2a95786010)
    at ArgusSource.c:2099
#12 0x0000000000404c77 in main (argc=1, argv=0x7fbffffe08) at argus.c:535

#6  0x000000000041569c in ArgusLoadList (l1=0x659460, l2=0x65c0a0)
    at ArgusUtil.c:273
273     ArgusUtil.c: No such file or directory.
        in ArgusUtil.c
(gdb) print *l1
$1 = {start = 0x1127f60, end = 0x112f9e0, count = 270, pushed = 348323494, 
  popped = 0, loaded = 348323224, outputTime = {tv_sec = 0, tv_usec = 0}, 
  reportTime = {tv_sec = 0, tv_usec = 0}}
(gdb) print *l2
$2 = {start = 0xb9d2bcfac6fa3f4c, end = 0x3d078cfe490497a0, 
  count = 1664177081, pushed = 1214938457, popped = 734219693, 
  loaded = 323085166, outputTime = {tv_sec = -3436253370747411246, 
    tv_usec = 7772834831553979568}, reportTime = {
    tv_sec = -3784555927640503799, tv_usec = -3225818299882675799}}
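
To make the difference between the two dumps explicit, here is a small
sketch that applies a few plausibility checks to the printed fields. Two
of the checks are generic: a user-space pointer on x86-64 should not
exceed 0x00007fffffffffff, and tv_usec in a valid timeval lies between 0
and 999999. The third, count == pushed - popped - loaded, is only an
assumption read off the l1 dump (348323494 - 0 - 348323224 = 270); it is
not taken from the argus source. The dictionary layout and function name
are made up for this illustration.

```python
USER_ADDR_MAX = 0x0000_7FFF_FFFF_FFFF  # top of x86-64 user-space addresses


def problems(hdr: dict) -> list:
    """Return the reasons why a dumped list header looks corrupted."""
    found = []
    for field in ("start", "end"):
        if hdr[field] > USER_ADDR_MAX:
            found.append(f"{field} is not a user-space address")
    for field in ("outputTime", "reportTime"):
        _sec, usec = hdr[field]
        if not 0 <= usec < 1_000_000:
            found.append(f"{field}.tv_usec out of range")
    # Assumed relationship, inferred from the l1 dump only.
    if hdr["count"] != hdr["pushed"] - hdr["popped"] - hdr["loaded"]:
        found.append("count != pushed - popped - loaded")
    return found


# Values copied from the gdb output above.
L1 = {"start": 0x1127F60, "end": 0x112F9E0, "count": 270,
      "pushed": 348323494, "popped": 0, "loaded": 348323224,
      "outputTime": (0, 0), "reportTime": (0, 0)}
L2 = {"start": 0xB9D2BCFAC6FA3F4C, "end": 0x3D078CFE490497A0,
      "count": 1664177081, "pushed": 1214938457, "popped": 734219693,
      "loaded": 323085166,
      "outputTime": (-3436253370747411246, 7772834831553979568),
      "reportTime": (-3784555927640503799, -3225818299882675799)}

if __name__ == "__main__":
    for name, hdr in (("l1", L1), ("l2", L2)):
        print(name, problems(hdr) or "looks plausible")
```

With these checks l1 passes and l2 fails every one of them, which agrees
with the reading that the whole l2 header has been overwritten rather
than a single field.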


>From carter at qosient.com Wed Jul  1 01:50:19 2009
>From: Carter Bullard <carter at qosient.com>
>To: Peter Van Epp <vanepp at sfu.ca>
>CC: "argus-info at lists.andrew.cmu.edu" <argus-info at lists.andrew.cmu.edu>
>Sender: "argus-info-bounces+gunnar.lindberg=chalmers.se at lists.andrew.cmu.edu"
>	<argus-info-bounces+gunnar.lindberg=chalmers.se at lists.andrew.cmu.edu>
>Date: Wed, 1 Jul 2009 01:49:27 +0200
>Subject: Re: [ARGUS] segfault at 000000000311c000 rip	000000000040fb46rsp
>	0000007fbffff830 error 4
>Message-ID: <F0888929-AF98-42D7-85EB-9FAF15AB082E at qosient.com>
>References: <78C956B9-F7C0-4E75-A37B-843A293386FF at qosient.com>
>	<200906290719.n5T7JvTF026686 at grunert.cdg.chalmers.se>
>	<20090630223342.GA27655 at sfu.ca>
>In-Reply-To: <20090630223342.GA27655 at sfu.ca>

>Hey Peter,
>Something is writing over something, just can't seem to find a handle.
>The ArgusLoadList() is passing ArgusListRecords from the Modeler to
>the Output processor, and it just takes the two linked lists and combines
>them.  If there is nothing in the receive list, it's just a "move the
>pointers" and there you go.  The receive list should be empty if the
>output processor is keeping ahead of the load.
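
For readers who have not looked at the code, the operation Carter
describes can be sketched roughly as follows. This is not the actual
ArgusLoadList() from ArgusUtil.c; the class and function names are
hypothetical, and only the field names start, end and count are borrowed
from the gdb dump.

```python
class Node:
    """One record on a singly linked list."""
    def __init__(self, value):
        self.value = value
        self.next = None


class RecordList:
    """List header holding head and tail pointers plus a count."""
    def __init__(self):
        self.start = None
        self.end = None
        self.count = 0

    def push(self, value):
        node = Node(value)
        if self.start is None:
            self.start = node
        else:
            self.end.next = node
        self.end = node
        self.count += 1

    def to_list(self):
        out, node = [], self.start
        while node is not None:
            out.append(node.value)
            node = node.next
        return out


def load_list(src: RecordList, dst: RecordList) -> None:
    """Move every record from src onto the tail of dst, leaving src empty."""
    if src.start is None:
        return
    if dst.start is None:
        # Receive list is empty: just move the pointers.
        dst.start, dst.end = src.start, src.end
    else:
        # Otherwise splice src behind the current tail of dst.
        dst.end.next = src.start
        dst.end = src.end
    dst.count += src.count
    src.start = src.end = None
    src.count = 0
```

The point of the sketch is that the splice trusts the start and end
pointers in both headers, so a header that has been overwritten, like l2
in the dump, leads straight to a bad dereference.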

>My guess is that we're getting the length of an output record wrong,
>which can happen if you're sloppy forming a DSR that you rarely use,
>so it could be a packet specific bug still, or we are using a buffer
>that has been deallocated/reallocated and we're stomping on the new
>user's buffer.

>This can happen in threaded applications, so turning off the .threads
>tag may be a good test.

>Hey Gunnar, any chance you can use valgrind() to see if we're doing
>something wrong with memory?

>Carter



