Problem with argus under load not reopening output file.

Peter Van Epp vanepp at sfu.ca
Sun Jan 11 16:49:16 EST 2009


On Sun, Jan 11, 2009 at 01:17:49PM -0500, Carter Bullard wrote:
> Hmmm,
> Well, if the mailing list can decide how to respond to the bug,  then  
> I'll put a fix in this week.
> Should we for packet streams that are "PCAP_OPEN_LIVE" interfaces:
>    1) Adjust the time for packets that are way out of scope (>  5-10  
> seconds away from real time)
>        to current time.
>
>    2) drop packets that are way out of scope
>
> In both cases, we should not adjust our notion of global time?
>
> Carter
>

	I'd be in favor of 1) (so as to not lose any packets) and of not
adjusting global time, at least not immediately. That's an ugly bug :-). It
looks like it will affect everything except a DAG with internal time keeping
turned on, as the kernel's sense of time is being messed with. So the packet
could be ahead by an hour (if it hits during the interrupt when the packet is
getting time stamped), or argus could get hit if it happens when argus asks
for the time of day. I have a horrible suspicion we need to do a
reasonableness check on time of day calls: if they are suddenly ahead by an
hour, do it again and see if it drops back, and if so ignore the first value.
But that's likely to add a lot of overhead (and only be needed on multicore
systems, and then only sometimes).
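
To make that concrete, here is a rough sketch of option 1) together with the
re-read check. It is written in Python rather than argus's C, purely for
illustration. The names (sane_now, packet_time, MAX_SKEW, JUMP) are
hypothetical and not taken from the argus source, and the thresholds are
guesses based on the numbers mentioned in this thread.

```python
import time

MAX_SKEW = 10.0   # seconds; assumed, from the "5-10 seconds" suggested above
JUMP = 1800.0     # seconds; assumed threshold for a "sudden" forward jump


def sane_now(last_good, clock=time.time, jump=JUMP):
    """Read the clock; if it suddenly jumped ahead, read it a second time.

    If the second reading is back within range of the last good value,
    the first reading is ignored. Otherwise the first reading is returned.
    """
    now = clock()
    if last_good is not None and now - last_good > jump:
        again = clock()
        if again - last_good <= jump:  # it dropped back: ignore the spike
            return again
    return now


def packet_time(pkt_ts, now, max_skew=MAX_SKEW):
    """Option 1): keep the packet, but replace an out-of-scope timestamp.

    The caller's notion of global time ("now") is not modified here.
    """
    if abs(pkt_ts - now) > max_skew:
        return now
    return pkt_ts
```

Note that the second clock read only happens when a jump is suspected, so the
extra overhead would be limited to the bad case.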
	I can't say that we have seen this so far, but argus prod is still
2.0.6 running on FreeBSD, and the 3.0 system on Linux (with Phil Wood's pcap
code rather than PF_RING this time) hasn't been up much lately. Hopefully that
will change soon, as our new hire is learning by building a new argus
archiving box that will hopefully run 3.0 in production soon. That said, we
have run both PF_RING on multiprocessor systems and the new pcap code on a
200 meg link for at least weeks at a time without problem.
	Debugging will be interesting (as in "may you live in interesting
times" :-)). Turning argus packet capture on (preferably writing to a ram disk
for speed) would be one obvious choice. If you see the time jump forward and
then back in sequential packets in the pcap, then you likely have a kernel
level time problem.
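
A small sketch of that check, assuming the packet timestamps have already been
pulled out of the capture as seconds since the epoch (for example from
tcpdump -tt output). The function name find_time_jumps and the 60 second
threshold are made up for illustration.

```python
def find_time_jumps(timestamps, jump=60.0):
    """Return (index, delta) for every packet whose timestamp differs from
    the previous packet's timestamp by more than `jump` seconds.

    A large positive delta followed shortly by a large negative delta is
    the forward-then-back pattern described above.
    """
    jumps = []
    for i in range(1, len(timestamps)):
        delta = timestamps[i] - timestamps[i - 1]
        if abs(delta) > jump:
            jumps.append((i, delta))
    return jumps


# Packet 2 is stamped an hour ahead, packet 3 is back to normal.
print(find_time_jumps([100.0, 101.0, 3701.0, 102.0, 103.0]))
# prints [(2, 3600.0), (3, -3599.0)]
```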
	We may need to add a debug level 1 or 2 check of timeofday calls
that logs sudden jumps forward and back again, to verify it is system time
and not something in argus that's broken.
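
Roughly what such a check might look like, again as a Python sketch and not
argus code. CheckedClock and its 60 second threshold are hypothetical, and
Python's logging module stands in for argus's debug levels.

```python
import logging
import time

log = logging.getLogger("timecheck")


class CheckedClock:
    """Wrap a time-of-day source and log sudden jumps in either direction."""

    def __init__(self, clock=time.time, jump=60.0):
        self._clock = clock
        self._jump = jump    # seconds; assumed threshold for a "sudden" jump
        self._last = None
        self.jumps = 0       # number of jumps seen so far

    def now(self):
        t = self._clock()
        if self._last is not None and abs(t - self._last) > self._jump:
            self.jumps += 1
            log.debug("time of day jumped %+.3f s (%.6f -> %.6f)",
                      t - self._last, self._last, t)
        self._last = t
        return t
```

A forward jump followed by a jump back would show up as two log lines with
opposite signs. The value is still returned unchanged, since the point here is
only to find out whether system time is misbehaving.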

Peter Van Epp / Operations and Technical Support 
Simon Fraser University, Burnaby, B.C. Canada


