new version of argus-2.0.0l

Carter Bullard carter at qosient.com
Mon Oct 2 21:12:19 EDT 2000


Hey Peter,
   Looks like we're getting closer.  Yes, the signal handling
does need some work.  I'll put more logic around the selects
and the writes (wherever we can be in the kernel when we
get the signal) to make sure that we do the right thing.
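
As a rough illustration of what "more logic around the writes" could look like, here is a minimal sketch of a write wrapper that retries when a signal interrupts the call. It is not taken from the argus source; `write_all` is a hypothetical helper name, and the choice to retry on EINTR (rather than abort) is an assumption about the desired behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the argus source): write all of buf to fd,
 * retrying when a signal interrupts write() before any data was moved.
 * Returns len on success, -1 on a real error (errno is left set). */
ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;          /* genuine failure */
        }
        p += n;                 /* partial write: advance and keep going */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

The same pattern applies to select(): a return of -1 with errno set to EINTR means "a signal arrived", not "the descriptor set is broken", so the caller can check whether a shutdown was requested and otherwise loop.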

   The strategy is that the topmost process gets the signals
and all the other processes ignore them.  When we send a SIGHUP,
as an example, the top process should be the only one to catch
the signal, and it immediately goes to ArgusShutDown(int).
This process immediately turns off the pcap interface,
deletes the modeler, which flushes all the cached Argus Records,
and then sends a Closing ARGUS_MAR.

The other processes are looking for the Closing ARGUS_MAR.
The Output processor will multicast the closing record to
all its children (one for the file and one for each remote
connection) and wait for them all to finish before it
exits.
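
The multicast-then-wait step can be sketched as below. This is a toy model, not the argus implementation: the children are plain forked readers on pipes, `CLOSING_RECORD` is a hypothetical stand-in for the Closing ARGUS_MAR, and all function names are made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the Closing ARGUS_MAR record. */
#define CLOSING_RECORD "CLOSE"

/* Child: read fixed-size records until the closing one arrives. */
static void child_loop(int fd)
{
    char rec[sizeof CLOSING_RECORD];

    for (;;) {
        ssize_t n = read(fd, rec, sizeof rec);
        if (n != (ssize_t)sizeof rec)
            _exit(1);           /* EOF or error before the closing record */
        if (memcmp(rec, CLOSING_RECORD, sizeof rec) == 0)
            _exit(0);           /* saw the closing record: clean exit */
    }
}

/* Fork nchildren readers, send each one the closing record, then wait
 * for all of them.  Returns how many exited cleanly, or -1 on error. */
int multicast_close_and_wait(int nchildren)
{
    enum { MAX_CHILDREN = 16 };
    int wfd[MAX_CHILDREN];
    pid_t pid[MAX_CHILDREN];
    int i, clean = 0;

    if (nchildren < 0 || nchildren > MAX_CHILDREN)
        return -1;

    for (i = 0; i < nchildren; i++) {
        int p[2];
        if (pipe(p) < 0)
            return -1;
        pid[i] = fork();
        if (pid[i] < 0)
            return -1;
        if (pid[i] == 0) {
            close(p[1]);
            for (int j = 0; j < i; j++)
                close(wfd[j]);  /* drop write ends inherited from siblings */
            child_loop(p[0]);   /* never returns */
        }
        close(p[0]);
        wfd[i] = p[1];
    }

    /* Multicast: every child gets its own copy of the closing record. */
    for (i = 0; i < nchildren; i++) {
        if (write(wfd[i], CLOSING_RECORD, sizeof CLOSING_RECORD) < 0) {
            /* child will see EOF on close and exit non-zero */
        }
        close(wfd[i]);
    }

    /* Do not exit until every child has finished. */
    for (i = 0; i < nchildren; i++) {
        int status;
        if (waitpid(pid[i], &status, 0) == pid[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            clean++;
    }
    return clean;
}
```

A child that never sees the closing record (or is stuck elsewhere) leaves the parent blocked in waitpid(), which is one way a design like this can hang even though it looks right on paper.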

On paper it looks good, but .......

I'll put in some more logic tomorrow to see if we can't
get it right.

Thanks for all the hard work!!!
What do you want to do about the FreeBSD problem?

Carter


-----Original Message-----
From: owner-argus at lists.andrew.cmu.edu
[mailto:owner-argus at lists.andrew.cmu.edu]On Behalf Of Peter Van Epp
Sent: Monday, October 02, 2000 7:45 PM
To: argus
Subject: Re: new version of argus-2.0.0l


	While the new 2.0.0l works fine on Solaris at 1 meg, at 100 megs it
hangs pretty consistently when HUPed. I only got one successful HUP in about
10 tries, and that one, worse luck, had a previous no-data HUP in the argus
log, so the size is a little big. Still, ra appears to have captured all the
packets successfully at 100 (or more correctly 95 or 96 megabits per second,
however fast tcpreplay could get the data there).
	There looks to be a loop in the first task on the HUP; note the CPU time
climbing:

skaha# !312
bin/argus_dlpi -i qfe0 -w argus.log &
[1] 5649
skaha# kill -HUP 5649
skaha#

skaha# !ps
ps -ef | grep argus
    root  5650  5649  0 16:34:08 pts/1    0:00 bin/argus_dlpi -i qfe0 -w argus.log
    root  5651  5650  0 16:34:08 pts/1    0:00 bin/argus_dlpi -i qfe0 -w argus.log
    root  5653   534  0 16:34:41 pts/1    0:00 grep argus
    root  5649   534 29 16:34:08 pts/1    0:18 bin/argus_dlpi -i qfe0 -w argus.log
skaha# !!
ps -ef | grep argus
    root  5655   534  0 16:34:48 pts/1    0:00 grep argus
    root  5650  5649  0 16:34:08 pts/1    0:00 bin/argus_dlpi -i qfe0 -w argus.log
    root  5651  5650  0 16:34:08 pts/1    0:00 bin/argus_dlpi -i qfe0 -w argus.log
    root  5649   534 35 16:34:08 pts/1    0:24 bin/argus_dlpi -i qfe0 -w argus.log

a while later:

skaha# !ps
ps -ef | grep argus
    root  5650  5649  0 16:34:08 pts/1    0:00 bin/argus_dlpi -i qfe0 -w argus.log
    root  5651  5650  0 16:34:08 pts/1    0:00 bin/argus_dlpi -i qfe0 -w argus.log
    root  5657   534  0 16:40:04 pts/1    0:00 grep argus
    root  5649   534 50 16:34:08 pts/1    5:40 bin/argus_dlpi -i qfe0 -w argus.log

	Then a HUP again:

skaha# !333
kill -HUP 5649
skaha#
[1]    Hangup               bin/argus_dlpi -i qfe0 -w argus.log
skaha# !ps
ps -ef | grep argus
    root  5650     1  0 16:34:08 pts/1    0:00 bin/argus_dlpi -i qfe0 -w argus.log
    root  5651  5650  0 16:34:08 pts/1    0:00 bin/argus_dlpi -i qfe0 -w argus.log
    root  5659   534  0 16:40:35 pts/1    0:00 grep argus
skaha#

	Now I have to kill -9 the last two tasks, and the argus.log file isn't
complete. If you have suggestions on what we can do to debug this, it looks
easily reproducible :-)


Peter Van Epp / Operations and Technical Support
Simon Fraser University, Burnaby, B.C. Canada

>
> Thanks to Peter finding a major gotcha in the Solaris
> port, there is now a new copy of argus at
> ftp://qosient.com/dev/argus/argus-2.0
>
<snip>