ra looping problem still in Beta 8 on FreeBSD

Carter Bullard carter at qosient.com
Mon Mar 5 08:20:13 EST 2001


Hey Russell,
   I found the problem!! And, of course, it was my fault.
The fix is in the final release, and I've included the patch
below.

Carter

Carter Bullard
QoSient, LLC
300 E. 56th Street, Suite 18K
New York, New York  10022

carter at qosient.com
Phone +1 212 588-9133
Fax   +1 212 588-9134

Index: argus_parse.c
===================================================================
RCS file: /usr/local/cvsroot/argus/common/argus_parse.c,v
retrieving revision 1.123
diff -r1.123 argus_parse.c
1448,1449c1448,1452
<       if (!((errno == EAGAIN) || (errno == EINTR))) {
<          retn = 1;
---
>
>       retn = 1;
>
>       if ((errno == EAGAIN) || (errno == EINTR)) {
>          retn = 0;
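
For readers without the surrounding source, the logic on the new side of
the diff can be sketched in isolation. This is only an illustration: the
helper name classify_read_errno is hypothetical, and reading a non-zero
retn as "stop reading this input" is an assumption, since the diff does
not show how retn is used afterwards.

```c
#include <errno.h>

/* Hypothetical helper, not part of argus_parse.c.
 * Mirrors the new side of the diff: assume failure first, then
 * clear the flag only for the two transient errno values.
 * Assumption: a non-zero result means "give up on this input". */
static int classify_read_errno(int err)
{
    int retn = 1;                      /* default: treat as fatal */

    if ((err == EAGAIN) || (err == EINTR))
        retn = 0;                      /* transient: caller may retry */

    return retn;
}
```

Setting the default before the test means every path leaves retn with a
defined value, which the old form of the condition did not guarantee.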


-----Original Message-----
From: owner-argus at lists.andrew.cmu.edu
[mailto:owner-argus at lists.andrew.cmu.edu]On Behalf Of Carter Bullard
Sent: Monday, March 05, 2001 8:00 AM
To: 'Russell Fulton'; 'Argus (E-mail)'
Subject: RE: RE: ra looping problem still in Beta 8 on FreeBSD


Hey Russell,

I'll implement the exit and we'll see if it still runs, and
then you can see if it recovers in your situation.

Ragator() can eat up a lot of memory, so there will be situations
where parts of these programs will be swapped out.  I'll go
through all the clients to make sure that we exit gracefully
whenever a calloc() fails.

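
One common way to get a graceful exit on a failed calloc() is to route
every allocation through a small checking wrapper. The sketch below
shows the idea; the name calloc_or_exit and the wording of the message
are hypothetical and are not taken from the Argus clients.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper, not taken from the Argus sources.
 * Returns zeroed memory, or prints a message and terminates the
 * process, so callers never continue with a NULL pointer. */
static void *calloc_or_exit(size_t nmemb, size_t size, const char *what)
{
    void *ptr = calloc(nmemb, size);

    /* A zero-sized request may legitimately return NULL, so only
     * treat NULL as a failure when memory was actually requested. */
    if (ptr == NULL && nmemb != 0 && size != 0) {
        fprintf(stderr, "calloc(%zu, %zu) failed for %s\n",
                nmemb, size, what);
        exit(EXIT_FAILURE);
    }
    return ptr;
}
```

A call site would then read, for example,
buf = calloc_or_exit(count, sizeof(*buf), "record buffer");
with no separate NULL check needed afterwards.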

I wonder if I need to zero out errno before I make a call
to read().  The errno may be from a previous select() and
may not have anything to do with the read() return value.

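
On the stale errno question: POSIX only gives errno a meaning after a
call that reports failure, so an alternative to zeroing it is to look at
it only when read() has returned -1. The sketch below shows that
pattern; read_checked is a hypothetical helper, not code from
argus_parse.c.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper, not taken from argus_parse.c.
 * Returns the byte count from read(), 0 at end of file, or -1 on
 * any error other than EINTR.  errno is consulted only when read()
 * itself returned -1, so a value left behind by an earlier select()
 * cannot be mistaken for a read() failure. */
static ssize_t read_checked(int fd, void *buf, size_t len)
{
    for (;;) {
        ssize_t n = read(fd, buf, len);

        if (n >= 0)
            return n;      /* data or EOF: errno is not meaningful here */

        if (errno == EINTR)
            continue;      /* interrupted by a signal: retry the read */

        return -1;         /* errno (EAGAIN included) is left for the caller */
    }
}
```

With this shape, a return of 0 is always treated as end of file, even if
errno still holds EAGAIN from some earlier call, so the caller cannot
spin on a stale value.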
Carter
> -----Original Message-----
> From: owner-argus at lists.andrew.cmu.edu
> [mailto:owner-argus at lists.andrew.cmu.edu]On Behalf Of Russell Fulton
> Sent: Sunday, March 04, 2001 10:56 PM
> To: Argus (E-mail)
> Subject: Re: RE: ra looping problem still in Beta 8 on FreeBSD
>
>
> On Sat, 3 Mar 2001 19:30:43 -0500 Carter Bullard <carter at qosient.com>
> wrote:
>
> > Hey Russell,
> >    Hmmm, EAGAIN on a read() should mean that O_NONBLOCK
> > is set and there was no data to read.  Now we shouldn't
> > have gotten here, because we aren't using non-blocking
> > I/O.  Also, the select() should not have indicated that there
> > was anything there to read when there wasn't.  So I'm
> > thinking that there must be a really weird problem.
>
> i.e. the real problem isn't in ra, which is what I thought.
>
> >
> >    I would suspect that we should be able to exit if
> > we get an EAGAIN, as it's just not supposed to happen.
> > I'll have to test this.
> >
> >    Is your ra() the process using up a lot of memory?
> > If so we definitely need to fix that.
>
> Hmmm... top now shows plenty of free memory??
>
> last pid: 48573;  load averages:  1.48,  1.81,  1.57    up 22+21:26:44  10:13:04
> 32 processes:  2 running, 26 sleeping, 4 zombie
> CPU states: 50.4% user,  0.0% nice, 48.0% system,  1.6% interrupt,  0.0% idle
> Mem: 66M Active, 8636K Inact, 12M Wired, 2028K Cache, 21M Buf, 26M Free
> Swap: 244M Total, 1640K Used, 243M Free
>
>   PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
> 47987 argus     63   0  2212K  1480K RUN    505:40 78.81% 78.81% ra
>
> Ahh... Memory runs short every hour when I use ragator and gzip to
> compact the log files.  I'll try starting the slowscan job after
> they finish; it does not take an hour to run.
>
> hmmm... the ra process needs -9 to kill it.
>
> Here is a chunk of ps output.  ra itself isn't using that much memory;
> it is the perl script that is hogging it.
>
>   UID   PID  PPID CPU PRI NI   VSZ  RSS WCHAN  STAT  TT       TIME COMMAND
>  1001 47954 47950   0  10  0   616  224 wait   Is    ??    0:00.01 /bin/sh -c cd sw;scan_watch -q  -S -s history -l history  -d 2
>  1001 47958 47954 137  -6  0 26700 25960 piperd I     ??    1:32.79 /usr/bin/perl -w /home/argus/bin/scan_watch -q -S -s history -l histo
>  1001 47986 47958 235  10  0   620  228 wait   I     ??    0:00.00 sh -c /home/argus/bin/ra  -F /home/argus/lib/ra.conf -I -AZs -r /home
>  1001 47987 47986 259  60  0  2212 1480 -      R     ??  500:00.99 /home/argus/bin/ra -F /home/argus/lib/ra.conf -I -AZs -r /home/argus/
>
> I am pretty sure that I can work around this by splitting the job in
> two, one for TCP and one for UDP, thus cutting the memory needed by
> the script.
>
> If we should never get EAGAIN with read returning 0 then I suggest
> that ra should simply exit with an error message.  That would stop
> the looping and alert the user that something isn't quite as it should be.
>
> I have not yet worked out where it starts looping, i.e. whether it is
> at the end of the file or not.
>
> I guess this is a FreeBSD specific problem.
>
> Russell Fulton, Computer and Network Security Officer
> The University of Auckland,  New Zealand
>
>