bug + "Queue Exceeded Maximum Limit"

Carter Bullard carter at qosient.com
Wed Dec 3 16:13:24 EST 2003


Hey William,
   Thanks for the bug report, I've got it fixed in my image,
so it will make it in the new release.

   So the problem is that you are generating records faster than
you are sending them out of the engine.  The queue gets big, and
so argus decides that the consumer can't keep up (it doesn't know
why) and it just disconnects.  You can increase the maximum queue
length, which may buy you all the time you need, by changing
ArgusMaxListLength in ./server/ArgusUtil.c to something like
1M.
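
As a rough illustration of the queue limit described above, here is a
minimal sketch in C.  It is not the actual argus source: every sketch_
name is hypothetical, and the only behavior taken from the text is that
the consumer is disconnected once the queue passes its maximum length.
The second helper estimates how long a given limit lasts when records
arrive faster than they are written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an output queue; not the real argus structure. */
struct sketch_queue {
    size_t count;      /* records waiting to be written */
    int connected;     /* 1 while the consumer is attached */
};

/* Queue one record.  Returns 0 on success, or -1 if the consumer is
 * already gone or the limit was reached (the consumer is then dropped). */
int sketch_enqueue(struct sketch_queue *q, size_t max_len)
{
    assert(q != NULL);
    if (!q->connected)
        return -1;
    if (q->count >= max_len) {
        q->connected = 0;   /* assume the consumer cannot keep up */
        return -1;
    }
    q->count++;
    return 0;
}

/* Seconds until the limit is hit, given the net growth of the queue in
 * records per second (records generated minus records written).
 * Returns -1.0 if the queue is not growing. */
double sketch_seconds_to_overflow(size_t count, size_t max_len,
                                  double net_growth)
{
    if (net_growth <= 0.0)
        return -1.0;
    if (count >= max_len)
        return 0.0;
    return (double)(max_len - count) / net_growth;
}
```

A larger limit only postpones the disconnect if the net growth stays
positive, which is why the write rate matters as well.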

   So is argus writing to a disk or a socket?  If a disk, then
more than likely the disk is the limiting factor.  Sending the
records off the box to another machine is the solution there.
If to a socket, then there are some internal variables that
you can change so that you can ride out the wave of high flow
reports.

   Basically, argus will only process so many records per
attempt, as it's doing all this work in between packet reads.
The number of records per turn is defined by the variable
ARGUS_MAXWRITENUM in ./server/ArgusUtil.c.  You should
probably raise this to something like 25000.
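
The per-turn cap can be pictured with the sketch below.  It is an
illustration under assumptions, not the code in ArgusUtil.c:
sketch_drain_turn and SKETCH_MAX_WRITE_NUM are hypothetical names, and
the queue is reduced to a simple counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-turn cap, standing in for ARGUS_MAXWRITENUM. */
#define SKETCH_MAX_WRITE_NUM 25000

/* Write at most max_per_turn queued records, then return so the caller
 * can go back to reading packets.  Returns the number written. */
size_t sketch_drain_turn(size_t *queued, size_t max_per_turn)
{
    size_t n;

    assert(queued != NULL);
    n = (*queued < max_per_turn) ? *queued : max_per_turn;
    /* A real writer would send each of the n records here. */
    *queued -= n;
    return n;
}
```

If more records are generated between turns than the cap allows out,
the queue grows by the difference on every turn.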

   Argus will delete records when it gets into trouble.
You may want to increase the variable ARGUS_DROPRECORDNUM
to something like 5000, in order to keep you running.
While this is drastic, it will keep you from dying.  Because
there are sequence numbers in argus records, detecting a
big chunk of records being deleted is not hard to do.
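
A client-side check for dropped records could look something like the
sketch below.  It assumes, for illustration only, that each record
carries a 32-bit sequence number that increases by one and that records
arrive in order; the real record layout is not shown here, and
sketch_count_missing is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the records missing from an in-order list of sequence numbers.
 * A step of 1 between neighbors means nothing was lost; a larger step
 * means (step - 1) records were dropped in between. */
uint64_t sketch_count_missing(const uint32_t *seq, size_t n)
{
    uint64_t missing = 0;
    size_t i;

    assert(seq != NULL || n == 0);
    for (i = 1; i < n; i++) {
        /* Storing the difference in a uint32_t handles wraparound. */
        uint32_t step = seq[i] - seq[i - 1];
        if (step > 1)
            missing += step - 1;
    }
    return missing;
}
```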

   And of course, what is top saying about the load on the
machine?  If it's pegged at near 100% then you may need to
get a bigger machine.

Hope this helps!!!!!

Carter



-----Original Message-----
From: owner-argus-info at lists.andrew.cmu.edu
[mailto:owner-argus-info at lists.andrew.cmu.edu] On Behalf Of William Setzer
Sent: Wednesday, December 03, 2003 3:42 PM
To: argus at lists.andrew.cmu.edu
Subject: bug + "Queue Exceeded Maximum Limit"


I currently have argus 2.0.5 (running on Solaris 2.6) hooked to a
firehose (a very busy 1G link), and lately this has been happening
almost constantly:

  Dec  3 06:29:05 argus: ArgusWriteOutSocket(0x154960) Queue Count 232797
  Dec  3 06:29:36 argus: ArgusWriteOutSocket(0x154960) Queue Count 236682
  Dec  3 06:30:09 argus: ArgusWriteOutSocket(0x154960) Queue Count 244167
  Dec  3 06:30:47 argus: ArgusWriteOutSocket(0x154960) Queue Count 245161
  Dec  3 06:31:18 argus: ArgusWriteOutSocket(0x154960) Queue Count 254993
  Dec  3 06:31:20 argus: ArgusWriteOutSocket(0x154960) Queue Exceeded Maximum Limit
  Dec  3 06:31:20 argus: ArgusHandleData: ArgusWriteOutSocket failed Bad file number
  Dec  3 06:31:20 argus: ArgusHandleData: Terminating process 3673

The queue seems to overflow about once an hour.  I've tried adjusting
the MAR timeout and increased the ARGUS_MAXWRITENUM constant in
ArgusUtil.c, but those only seem to stretch the problem out a bit.
Does anyone have any other ideas on how to flush the queue out more
often?  I'm afraid I don't understand the internals of argus well
enough to know when and how often ArgusWriteOutSocket gets called.

(The bug is that the ArgusHandleData error about a Bad file number is
incorrect.  The section of code in ArgusWriteOutSocket that printed
out Queue Exceeded Maximum Limit returns -1, which causes
ArgusHandleData to print out an error based on 'errno', when it wasn't
a system call that caused the error.)
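
One way to avoid this class of misleading message is to give the
internal failure its own return code, so that errno is consulted only
when a system call actually failed.  The sketch below shows the idea;
it is not the fix that went into argus, and the sketch_ names and
status values are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical status codes separating system errors from internal ones. */
enum sketch_status {
    SKETCH_OK = 0,
    SKETCH_SYS_ERROR = -1,    /* a system call failed; errno is valid */
    SKETCH_QUEUE_FULL = -2    /* internal limit hit; errno is unrelated */
};

/* Build a log message for a status.  saved_errno is used only for
 * SKETCH_SYS_ERROR, so a stale errno value never reaches the log. */
const char *sketch_describe(enum sketch_status st, int saved_errno,
                            char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    switch (st) {
    case SKETCH_OK:
        snprintf(buf, len, "ok");
        break;
    case SKETCH_QUEUE_FULL:
        snprintf(buf, len, "queue exceeded maximum limit");
        break;
    default:
        snprintf(buf, len, "write failed: %s", strerror(saved_errno));
        break;
    }
    return buf;
}
```

With this split, the caller reports the queue overflow as what it is,
whatever value errno happens to hold from an earlier call.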

Thanks in advance for any advice.


William





