radium fails (rc.50 from 2007-08-31)

Carter Bullard carter at qosient.com
Tue Sep 4 09:38:59 EDT 2007


Peter,
We should leave the threaded version for argus-3.1, and assume that
all testing should be on the non-threaded version from now on.  But
curiosity may require a little explanation of the threaded model.

When running with the threads stuff enabled, argus will have 2 threads,
at a minimum:

Attaching to program: `/usr/local/sbin/argus', process 13992.
Reading symbols for shared libraries . done
0x9001f888 in select ()

(gdb) info threads
   2 process 13992 thread 0xb03  0x90054388 in semaphore_timedwait_signal_trap ()
* 1 process 13992 thread 0x203  0x9001f888 in select ()

(the * indicates the current thread)

(gdb) where
#0  0x9001f888 in select ()
#1  0x000161d0 in ArgusGetPackets (src=0x30501c) at ArgusSource.c:1648
#2  0x000046ac in main (argc=2, argv=0xbffffb0c) at argus.c:545
#3  0x0000280c in _start ()
#4  0x00002510 in start ()

(gdb) thread 2
[Switching to thread 2 (process 13992 thread 0xb03)]
0x90054388 in semaphore_timedwait_signal_trap ()
(gdb) where
#0  0x90054388 in semaphore_timedwait_signal_trap ()
#1  0x900541e4 in pthread_cond_timedwait ()
#2  0x0001f0ac in ArgusOutputProcess (arg=0x4004fc) at ArgusOutput.c:432
#3  0x9002bd08 in _pthread_body ()

This is normal.  Thread 1 is the packet processor/modeler and is packet
driven, and thread 2 is the flow output processor.

When you HUP this, the main thread (packet processor) catches the signal,
closes the packet source, times out all the flows, and generates the
closing MAR record.  The output processor picks that record up and uses it
as its indication to close: it writes out closing MARs to its clients and
then waits for them to close.

The main thread will "join" the output thread and wait until it exits.  I
suspect that because your client went away, the output processor is having
trouble finishing, which may take a long time and, of course, makes
everything else wait.  I'll look into it.

Carter





On Sep 3, 2007, at 8:03 PM, Peter Van Epp wrote:

> 	To follow up my own post as usual, the issue looks to be threads
> deadlocking. I started an argus with no -J but a -D2 and it hung the same
> way. I then foolishly listed the debug output file before saving the two
> gdb outputs and lost them, but it was the same as before: the argus was
> sitting at select doing nothing, and when HUPed there was a hung thread.
> A look through the 900 meg debug file does indicate it is a new thread in
> an apparent list allocation loop (which is likely eating memory and is in
> any case not working). I've started it again and I expect it to fail
> again, at which time I'll get more data. So it looks like a no .threads
> version of argus-3.0.0 works but .threads currently doesn't.
>
> Peter Van Epp / Operations and Technical Support
> Simon Fraser University, Burnaby, B.C. Canada
>


