[ARGUS] Re: Re: 2.0.6.fixes.1 core from improper signal
Carter Bullard
carter at qosient.com
Thu Nov 11 12:02:54 EST 2004
Hmmmm, I meant that this particular signal handler should not call
routines that touch system resources, like calloc() and free(). We're
probably saying the same thing.
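
To make that concrete, here is a minimal sketch of a handler that does
nothing but record the event, leaving any calloc()/free() work to the
main loop. The names child_exited, on_sigchld, install_sigchld_handler
and consume_child_exit are made up for illustration and are not taken
from the argus source.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flag, not an argus variable. sig_atomic_t is the one
 * object type the C standard lets a handler store to safely. */
static volatile sig_atomic_t child_exited = 0;

/* The handler only sets the flag and returns: no malloc family,
 * no stdio, nothing that could re-enter the allocator. */
static void on_sigchld(int sig)
{
    (void)sig;
    child_exited = 1;
}

/* Install the handler. Returns 0 on success, -1 on error. */
int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Called from the main loop. Returns 1 if a child exit was noticed
 * since the last call (and clears the flag), otherwise 0. */
int consume_child_exit(void)
{
    if (!child_exited)
        return 0;
    child_exited = 0;
    return 1;
}
```

The main loop would call consume_child_exit() at a point where no
allocator call is in progress and do the queue cleanup there.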
Ok, so SIGCHLD is indicating that a child has exited, which is not
a good thing, regardless of where we are. What we're trying to do
in the handler is to find the child output process that has exited
and clear out its output queue and remove it from the output client
array. We could do this later, so to speak, and we do periodically
go through the clients array, killing around to see if anyone has
gone away. It looks like we check 10 times per second for lost children,
if there isn't a new process hitting the listen(), in order to catch zombies.
We do this in ArgusOutputProcess(). Bottom line is we may not
even need to set a global variable.
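
A rough sketch of that kind of periodic check, with no handler installed
at all, might look like the following. It assumes the clients are direct
children tracked by pid and uses waitpid() with WNOHANG instead of
kill(); client_pids, MAX_CLIENTS and reap_exited_clients are hypothetical
names, not the actual ArgusOutputProcess() code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CLIENTS 8               /* hypothetical table size */

/* Hypothetical client table: 0 marks an unused slot. */
static pid_t client_pids[MAX_CLIENTS];

/* Poll every tracked child without blocking. Slots whose child has
 * exited are cleared. Returns the number of children reaped. */
int reap_exited_clients(void)
{
    int reaped = 0;

    for (int i = 0; i < MAX_CLIENTS; i++) {
        int status;
        pid_t r;

        if (client_pids[i] <= 0)
            continue;

        r = waitpid(client_pids[i], &status, WNOHANG);
        if (r == client_pids[i]) {
            /* Child is gone. We are in normal (non-signal) context,
             * so freeing its output queue here would be safe. */
            client_pids[i] = 0;
            reaped++;
        } else if (r == -1 && errno == ECHILD) {
            /* Not our child, or already reaped: drop the slot. */
            client_pids[i] = 0;
        }
        /* r == 0 means the child is still running: leave it alone. */
    }
    return reaped;
}
```

Calling this from the output loop also collects the zombies, since
waitpid() reaps each exited child as it is found.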
I'd say let's ignore SIGCHLD?
Carter
> From: <slif at bellsouth.net>
> Date: Thu, 11 Nov 2004 11:11:55 -0500
> To: Carter Bullard <carter at qosient.com>
> Cc: Argus <argus-info at lists.andrew.cmu.edu>
> Subject: Re: Re: [ARGUS] Re: Re: 2.0.6.fixes.1 core from improper signal
>
>
>> From: Carter Bullard <carter at qosient.com>
>> I'd like to see if we can make the signal handlers
>> a bit more immune to resource problems, so that we can
>> have some probability of surviving the system error.
>
> This was my purpose in announcing the problem. I'm sorry
> if my writing did not make this clear.
>
>>
>> So what do you think happened in your case, did argus
>> run out of memory?
>>
>> Carter
>
> There was plenty of free memory. The library chain was indeterminate.
>
> I think calloc() was interrupted while it was
> adjusting the free memory list. This adjustment looked
> like a corrupt list to free(). If the signal handler
> just set a flag and returned, calloc() would have resumed,
> and would have completed its changes to the free list.
>
> Here is the evidence:
> Line 17 shows the initial signal (20 = SIGCHLD on FreeBSD).
> Note that ArgusChildExit is executing as if it was called
> from calloc. This is because the software interrupt is
> still active.
>
> Line 14 is where the problem is manifested. free() threw
> an abort, probably when it fell off of the incomplete
> free list that calloc() was processing.
>
>
>
>
> 1 GNU gdb 5.2.1 (FreeBSD) Copyright 2002 Free Software Foundation, Inc.
> 2 [Standard stuff and reading/loading symbols removed for brevity ...]
> 3 This GDB was configured as "i386-unknown-freebsd"...
> 4 Core was generated by `argus'.
> 5 Program terminated with signal 6, Aborted.
> 6 #0 0x2811edcf in kill () from /lib/libc.so.5
> 7 (gdb) bt
> 8 #0 0x2811edcf in kill () from /lib/libc.so.5
> 9 #1 0x28113878 in raise () from /lib/libc.so.5
> 10 #2 0x2818bf82 in abort () from /lib/libc.so.5
> 11 #3 0x2818a6fe in tcflow () from /lib/libc.so.5
> 12 #4 0x2818a72b in tcflow () from /lib/libc.so.5
> 13 #5 0x2818b459 in free () from /lib/libc.so.5
> 14 #6 0x0805e617 in ArgusFree (buf=0x2819a120) at argus_filter.c:5425
> 15 #7 0x080529ac in ArgusPopFrontList (list=0x8131160) at ArgusUtil.c:228
> 16 #8 0x08050a07 in ArgusCloseSocket (i=1) at ArgusOutput.c:1287
> 17 #9 0x0804ef47 in ArgusChildExit (sig=20) at ArgusOutput.c:359
> 18 #10 <signal handler called>
> 19 #11 0x2818b741 in realloc () from /lib/libc.so.5
> 20 #12 0x2818ae1e in tcflow () from /lib/libc.so.5
> 21 #13 0x2818af74 in tcflow () from /lib/libc.so.5
> 22 #14 0x2818b356 in malloc () from /lib/libc.so.5
> 23 #15 0x28187b61 in calloc () from /lib/libc.so.5
> 24 #16 0x0805e5f4 in ArgusCalloc (nitems=1, size=12) at argus_filter.c:5410
> 25 #17 0x0805395e in ArgusWriteSocket (asock=0x8157000, buf=0x8147780 "\001 ",
> 26 cnt=88) at ArgusUtil.c:977
> 27 #18 0x0804ffc3 in ArgusHandleData (asock=0x8146000, buf=0x8147780 "\001 ",
> 28 len=88, client=0x0) at ArgusOutput.c:857
> 29 #19 0x080535c6 in ArgusReadSocket (asock=0x8146000,
> 30 ArgusThisHandler=0x804fe88 <ArgusHandleData>, data=0x0) at ArgusUtil.c:847
> 31 #20 0x0804f3f3 in ArgusOutputProcess () at ArgusOutput.c:439
> 32 #21 0x0804e850 in ArgusInitOutput () at ArgusOutput.c:132
> 33 #22 0x0804ac7c in main (argc=1, argv=0xbfbfecb0) at argus.c:421
> 34 #23 0x0804a1a2 in _start ()
> 35 (gdb) q
>
>
>