Problem with argus under load not reopening output file.

Martijn van Oosterhout kleptog at gmail.com
Tue Jan 13 05:31:35 EST 2009


Ok, I've confirmed that the timestamps in tcpdump jump back and forth
for a few packets. Happens about twice a day. There's a patch in the
latest kernels (2.6.26 onwards) that fixes it so it looks like we'll
be upgrading.
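
For anyone wanting to check their own captures for the same symptom, here is a minimal sketch of the test: scan packet timestamps in capture order and report any that are earlier than their predecessor. The function name `find_time_jumps`, the `tolerance` parameter, and the sample values are assumptions for illustration; the timestamps are taken to be already extracted from the pcap as floating-point seconds.

```python
def find_time_jumps(timestamps, tolerance=0.0):
    """Return (index, previous, current) for every packet whose timestamp
    is earlier than the packet before it by more than `tolerance` seconds."""
    jumps = []
    for i in range(1, len(timestamps)):
        prev, cur = timestamps[i - 1], timestamps[i]
        if cur < prev - tolerance:
            jumps.append((i, prev, cur))
    return jumps


# Hypothetical capture: one packet stamped an hour ahead, then back to normal.
capture = [100.000, 100.001, 3700.002, 100.003, 100.004]

for index, prev, cur in find_time_jumps(capture):
    print("packet %d: time went from %.3f back to %.3f" % (index, prev, cur))
```

A forward jump followed by a backward jump in sequential packets, as in the sample, is the pattern that points at the kernel rather than at Argus.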

The file switch is just the most obvious effect, because together with
the archiving script it caused chunks of data to be missing. Once I
fixed the archiving script it still appears that there is data missing, or
at least, the flows don't total the global packet counters anymore.

Not entirely sure what Argus should do about this; it's a bug in the
kernel that's just plain hard to work around in userspace.
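
One userspace option raised in the quoted thread below is a bounds check on the packet timestamp: a packet stamped more than some 5-10 seconds away from real time gets the current time instead. A rough sketch of that idea follows. It is not Argus code; `bounded_packet_time` and `MAX_SKEW_SECONDS` are made-up names, and the 10 second limit is only the upper end of the range mentioned in the thread.

```python
MAX_SKEW_SECONDS = 10.0  # assumed limit, upper end of the 5-10 s suggested


def bounded_packet_time(packet_time, real_time, max_skew=MAX_SKEW_SECONDS):
    """Return packet_time unless it is more than max_skew seconds away
    from real_time (in either direction); in that case return real_time."""
    if abs(packet_time - real_time) > max_skew:
        return real_time
    return packet_time


real_time = 1000.0
print(bounded_packet_time(1000.5, real_time))  # within bounds, kept as-is
print(bounded_packet_time(4600.0, real_time))  # an hour ahead, replaced
```

This keeps the packet (nothing is dropped) but it only makes sense for live interfaces, where packet time and system time are expected to be close.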

Have a nice day,

On Mon, Jan 12, 2009 at 2:35 PM,  <carter at qosient.com> wrote:
> ArgusGlobalTime is used to manage all the queue timeouts as it is time relative to all flow data, so a jump will cause us to flush and delete all caches.
>
> Packets with bizarre times will affect generally single flows.
>
> ArgusRealTime is our system clock and controls many timeout functions.  That is the current issue, but I'm interested that it only affected your file switch behavior.
>
> I don't want to test the time every packet, but I can do a bounds check on the packet time stamp.
>
> Let me think about this.
>
> Headed to Flocon 2009 today.  If anyone is there be sure and say hi!!!!!
>
> Carter
> Sent from my Verizon Wireless BlackBerry
>
> -----Original Message-----
> From: Martijn van Oosterhout <kleptog at gmail.com>
>
> Date: Sun, 11 Jan 2009 23:53:34
> To: Peter Van Epp<vanepp at sfu.ca>
> Cc: <argus-info at lists.andrew.cmu.edu>
> Subject: Re: [ARGUS] Problem with argus under load not reopening output file.
>
>
> On Sun, Jan 11, 2009 at 10:49 PM, Peter Van Epp <vanepp at sfu.ca> wrote:
>> On Sun, Jan 11, 2009 at 01:17:49PM -0500, Carter Bullard wrote:
>>> Hmmm,
>>> Should we for packet streams that are "PCAP_OPEN_LIVE" interfaces:
>>>    1) Adjust the time for packets that are way out of scope (>  5-10
>>> seconds away from real time)
>>>        to current time.
>
>>        I'd be in favor of 1) (so as to not lose any packets) and not adjusting
>> global time at least immediately. That's an ugly bug :-), it looks like it will
>> affect everything except a DAG with internal time keeping turned on as the
>> kernel's sense of time is being messed with, so the packet could be ahead by
>> an hour (if it hits during interrupt when the packet is getting time stamped)
>> or argus could get hit if it happens when argus asks for time of day.
>
> The simple fix I proposed (changing the less-than to not-equals) will
> fix the not-reopening-of-output-file I originally ran into. All that
> happened there is that the lastwritten got stuck in the future and
> that fixes it.
>
> What actual problem could it cause if GlobalTime were an hour ahead
> for a single packet? If that doesn't matter much, then I'd suggest
> only worrying about those variables which could be stuck for longer
> periods, like lastwritten.
>
>>        Debugging will be interesting (as in may you live in interesting times
>> :-)). Setting argus packet capture on (preferably writing to a ram disk for
>> speed) would be one obvious choice. If you see a time jump forward and then
>> back in sequential packets in the pcap then you likely have a kernel level
>> time problem.
>
> That's what I'm trying. I have a packet logger so with any luck I
> should catch it in the act.
>
> Have a nice day,
> --
> Martijn van Oosterhout <kleptog at gmail.com> http://svana.org/kleptog/
>
>
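
To illustrate why the less-than to not-equals change quoted above unsticks the output file: the actual Argus source is not shown in this thread, so the sketch below only assumes a check of the form "compare lastwritten with the current time before touching the file". The function names and values are hypothetical.

```python
def should_check_output_lt(last_written, now):
    """Assumed original form: act only once `now` has passed last_written."""
    return last_written < now


def should_check_output_ne(last_written, now):
    """Proposed form: act whenever the two values differ at all."""
    return last_written != now


now = 1000
stuck = now + 3600  # lastwritten set from a timestamp one hour in the future

print(should_check_output_lt(stuck, now))  # False: nothing happens for an hour
print(should_check_output_ne(stuck, now))  # True: the check runs, value resyncs
```

With the less-than form a single bad timestamp blocks the check until real time catches up; with not-equals the next pass through sees a difference and resets lastwritten.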



-- 
Martijn van Oosterhout <kleptog at gmail.com> http://svana.org/kleptog/


