[ARGUS] ra stops unexpectedly
slif at bellsouth.net
Thu Sep 30 13:59:32 EDT 2004
Thank you for the explanation.
I work better with illumination!
-Mike
>
> From: Carter Bullard <carter at qosient.com>
> Date: 2004/09/30 Thu PM 01:56:22 EDT
> To: <slif at bellsouth.net>,
> Peter Van Epp <vanepp at sfu.ca>,
> Argus <argus-info at lists.andrew.cmu.edu>
> Subject: Re: [ARGUS] ra stops unexpectedly
>
> Hey Mike,
> Well, you are projecting your desire for a feature and building
> a rather obtuse religious argument for its justification. TCP
> tries hard because that is its design, a reliable transport
> protocol. Why does UDP not try so hard? Well that's its design.
>
> If you want to understand why engineering reliability into
> transports where it's not needed is not necessarily a good thing,
> look at the issues with using SCTP for non-reliable transport.
> I think it's unnecessary, expensive and sometimes unpredictable.
>
> But the reality is simple. If you want the clients to have a
> persistent connection feature, then we should talk about it.
>
> There are three specific reasons why it's not there now. The
> first is that we want to have a simple, consistent failure model
> for ra* clients. Once you advertise that you're "reliable", you
> get into some complex code to actually provide the feature.
>
> Second, all ra() clients can connect to multiple sources
> simultaneously, which makes a simple persistent connection feature
> pretty complicated (if one fails, do you shut down all of them and
> start over?).
>
> The third is that it's not clear that all clients should
> persistently connect to a remote data source, so do we need
> to put it into the general strategy?
>
> None of this means we can't provide a "reconnect on failure"
> feature, but what are we going to specify when you're connected
> to 3 remote data sources? How do we notify the specific client
> that a source has been lost, or has not ever been connected?
>
> Carter
>
>
>
>
> > From: <slif at bellsouth.net>
> > Date: Thu, 30 Sep 2004 13:35:24 -0400
> > To: Carter Bullard <carter at qosient.com>, Peter Van Epp <vanepp at sfu.ca>, Argus
> > <argus-info at lists.andrew.cmu.edu>
> > Subject: Re: Re: [ARGUS] ra stops unexpectedly
> >
> >
> >>
> >> From: Carter Bullard <carter at qosient.com>
> >> Date: 2004/09/30 Thu AM 11:38:09 EDT
> >> To: <slif at bellsouth.net>,
> >> Peter Van Epp <vanepp at sfu.ca>,
> >> Argus <argus-info at lists.andrew.cmu.edu>
> >> Subject: Re: [ARGUS] ra stops unexpectedly
> >>
> >> The problem is that if you aren't receiving MAR records,
> >> then the far-end argus is probably dead, and you won't receive
> >> anything ever again.
> >
> >
> > Why does the TCP protocol try so hard?
> > In part because the authors realized there are so many
> > ways to make re-synchronizing painful and problematic.
> >
> > Stopping when one "feels" a far end point is no longer connected
> > just doesn't seem right. Sure I _can_ write yet another script
> > to monitor this program. I would prefer to indicate more
> > than "Hey, I had to restart that process". I would likely not
> > know the reason for the process terminating. Without that
> > information, I will have more difficulty trying to apply
> > a remedy.
> >
> > The solution that is localized to the problem is the easiest
> > to maintain.
> >
> >
> >
> >
> >
> >
> >>
> >> So what's to keep the user from writing a script to respawn
> >> ra(), if that's what the user wants it to do? That's pretty easy,
> >> isn't it?
> >>
> >>
> >>
> >>
> >>> From: <slif at bellsouth.net>
> >>> Date: Wed, 29 Sep 2004 18:00:56 -0400
> >>> To: Peter Van Epp <vanepp at sfu.ca>, <argus-info at lists.andrew.cmu.edu>
> >>> Subject: Re: Re: [ARGUS] ra stops unexpectedly
> >>>
> >>> I don't see the justification for stopping based on
> >>> not seeing MAR records. If the connection was not reset by peer,
> >>> I would prefer the client do everything it possibly can
> >>> to connect to its server.
> >>>
> >>> If the connection breaks, throw a log message and try again.
> >>> If that fails, wait one minute.
> >>> Repeat until an operator or user stops the client.
> >>>
> >>> Then again, I don't know whether the argus clients meet
> >>> the expectations of other users.
> >>>
> >>>
> >>>
> >>>>
> >>>> From: Peter Van Epp <vanepp at sfu.ca>
> >>>> Date: 2004/09/29 Wed PM 05:28:06 EDT
> >>>> To: argus-info at lists.andrew.cmu.edu
> >>>> Subject: Re: [ARGUS] ra stops unexpectedly
> >>>>
> >>>> It looks like this shouldn't happen :-). Even on an idle link you
> >>>> should be getting MAR records every reporting interval, and that (perhaps
> >>>> anyway) should reset the counter, I'd expect. As a quick workaround (until
> >>>> Carter can suggest what may really be wrong :-)) try commenting out the
> >>>> timeout in argus_parse.c:
> >>>>
> >>>> at line 2737
> >>>>
> >>>> ArgusAdjustGlobalTime(&ArgusRealTime);
> >>>>
> >>>> /*
> >>>>    if (input->hostname && input->ArgusMarInterval) {
> >>>>       if (input->ArgusLastTime.tv_sec) {
> >>>>          if ((ArgusRealTime.tv_sec - input->ArgusLastTime.tv_sec) >
> >>>>              (3 * input->ArgusMarInterval)) {
> >>>>             ArgusLog (LOG_WARNING, "ArgusReadStream %s: idle stream: closing",
> >>>>                       input->hostname);
> >>>>             ArgusCloseInput(input);
> >>>>             ArgusRemoteFDs[i] = NULL;
> >>>>          }
> >>>>       }
> >>>>    }
> >>>> */
> >>>>
> >>>> That should stop the timeout, (it may also do something else
> >>>> undesirable though :-)). The trick would be to see where (and by what)
> >>>>
> >>>> input->ArgusLastTime.tv_sec
> >>>>
> >>>> is being updated. I'd expect MAR records to do that and thus avoid this.
> >>>> All that said, my link must not get busy, because it doesn't happen here
> >>>> (of course the link between the two is a 3 ft crossover cable too). Could
> >>>> you be seeing a link interruption between the sensor and the host that ra
> >>>> is running on, so you really don't see any MAR records for an interval?
> >>>> That would be another possibility.
> >>>>
> >>>> Peter Van Epp / Operations and Technical Support
> >>>> Simon Fraser University, Burnaby, B.C. Canada
> >>>>
> >>>> On Wed, Sep 29, 2004 at 04:53:02PM -0400, slif at bellsouth.net wrote:
> >>>>> The remote argus is from argus-2.0.6.fixes.1
> >>>>>
> >>>>> Running "ra -w FILE -S IP" from argus-clients-2.0.6.fixes.1
> >>>>>
> >>>>> "ra" will return unexpectedly.
> >>>>> This message is displayed:
> >>>>>
> >>>>> "ArgusWarning: ra[PID]: ArgusReadStream IP: idle stream: closing"
> >>>>>
> >>>>>
> >>>>> What can be done so that "ra" will not stop when the stream
> >>>>> is apparently idle?
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >>
> >
> >
>
>
>
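For readers following the thread, here is a minimal sketch in C of the two ideas under discussion: the idle test from the quoted argus_parse.c fragment (more than three MAR intervals without a record) and the retry schedule Mike proposed (try again at once, then wait one minute between attempts). It is not code from the argus clients. The struct source_state and the functions source_is_idle and source_retry_delay are hypothetical names invented for illustration, and the choice to skip the check for non-positive intervals is an assumption.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical per-source state, invented for this sketch.
 * It is NOT the ArgusInput structure used by the real clients. */
struct source_state {
    const char *hostname;     /* remote source; NULL when reading a file   */
    long        mar_interval; /* seconds between MAR records; 0 if unknown */
    time_t      last_seen;    /* time of the last record; 0 before any     */
    int         failures;     /* failed reconnect attempts since the break */
};

/* Mirrors the quoted check: a remote source counts as idle once more
 * than three MAR intervals have passed since the last record. */
int source_is_idle(const struct source_state *s, time_t now)
{
    assert(s != NULL);
    if (s->hostname == NULL || s->mar_interval <= 0 || s->last_seen == 0)
        return 0; /* nothing to measure against, so never declare idle */
    return difftime(now, s->last_seen) > 3.0 * (double)s->mar_interval;
}

/* Retry schedule from the thread: the first reconnect attempt happens
 * immediately, every later attempt waits one minute. */
unsigned int source_retry_delay(const struct source_state *s)
{
    assert(s != NULL);
    return (s->failures == 0) ? 0u : 60u;
}
```

Because the state is kept per source, a client connected to several remote sources could apply both functions to each one separately and leave the healthy connections alone. That is only one possible answer to the question Carter raises about multiple sources, not something the thread settles.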