Stability problems.
Carter Bullard
carter at qosient.com
Thu Jun 7 11:28:29 EDT 2001
Hey Chris,
Sounds like you're truncating records on the argus end.
One possibility. You are probably forcibly deleting
records at the argus end to control the queue sizes, and
you may be running into a bug with that. A record is partially
written, but because of queue load, we elect to delete it.
This would not be good, as only a partial record is written.
The receiving ra can detect this and recover, but only after a
period of time.
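
A minimal sketch of the failure mode described above and one way to avoid
it. This is not the actual argus delete logic; the struct layout and the
names trim_queue, in_flight and QUEUE_CAPACITY are hypothetical. The idea
is only that a record whose bytes are partly on the wire is skipped when
the queue is trimmed, so the client never receives a truncated record.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 16          /* hypothetical fixed storage for the sketch */

struct record {
    int    id;
    size_t len;       /* total bytes in the record */
    size_t written;   /* bytes already sent to the client */
};

struct queue {
    struct record items[QUEUE_CAPACITY];
    int count;        /* items[0] is the oldest record */
};

/* A record is "in flight" once some, but not all, of its bytes are sent. */
static int in_flight(const struct record *r)
{
    return r->written > 0 && r->written < r->len;
}

/*
 * Drop the oldest records until at most max_len remain, skipping any
 * record that is in flight so the client never sees a truncated one.
 * Returns the number of records dropped.
 */
static int trim_queue(struct queue *q, int max_len)
{
    int dropped = 0;
    int i = 0;

    assert(q != NULL && max_len >= 0);
    while (q->count > max_len && i < q->count) {
        if (in_flight(&q->items[i])) {
            i++;                       /* must finish writing this one */
            continue;
        }
        for (int j = i; j < q->count - 1; j++)
            q->items[j] = q->items[j + 1];   /* close the gap */
        q->count--;
        dropped++;
    }
    return dropped;
}
```

With six queued records, the oldest one partly written, and a limit of
four, trim_queue drops the two oldest untouched records and leaves the
partly written one in place to be completed.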
Look in your /var/log/messages for argus messages,
especially "Queue Exceeded Max" messages. This would indicate
that you are throwing records away.
Changing the value of ArgusMaxListLength should help.
Use this patch:
Index: ArgusUtil.c
===================================================================
RCS file: /usr/local/cvsroot/argus/server/ArgusUtil.c,v
retrieving revision 1.77.2.4
diff -r1.77.2.4 ArgusUtil.c
800c800
< int ArgusMaxListLength = 16384;
---
> int ArgusMaxListLength = 262144;
The value doesn't have to be a power of two; I just happen
to like them. I'll take a look at the delete logic.
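
A rough way to see what the larger limit buys, as a sketch only. The
function seconds_until_full is hypothetical, and the arrival and drain
rates used with it are made-up illustration values, not measurements
from argus. It just divides the queue limit by the net rate at which
records pile up.

```c
#include <assert.h>

/*
 * Seconds until a queue limited to max_len records fills, given record
 * arrival and drain rates in records per second (both assumed constant).
 * Returns -1.0 if the queue drains at least as fast as it fills.
 */
static double seconds_until_full(long max_len, double arrive_per_s,
                                 double drain_per_s)
{
    double net;

    assert(max_len > 0);
    net = arrive_per_s - drain_per_s;
    if (net <= 0.0)
        return -1.0;               /* queue never reaches the limit */
    return (double)max_len / net;
}
```

For any fixed pair of rates, going from 16384 to 262144 multiplies the
time before records start being thrown away by 16. It does not help if
the overload lasts longer than that, which is why the delete logic
still needs a look.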
Carter
Carter Bullard
QoSient, LLC
300 E. 56th Street, Suite 18K
New York, New York 10022
carter at qosient.com
Phone +1 212 588-9133
Fax +1 212 588-9134
http://qosient.com
-----Original Message-----
From: Chris Newton [mailto:newton at unb.ca]
Sent: Thursday, June 07, 2001 8:57 AM
To: Carter Bullard
Subject: Stability problems.
Hi Carter.
Since I moved into client/server mode, I have had a few bumps of
instability.
I'm running the most current code.
The sensor is a Linux 2.4.x Red Hat 7.1 box with 512 MB RAM, 600 MB swap,
and dual 800 MHz CPUs.
The receiving end has dual 1 GHz CPUs and 1.2 GB RAM. Ra is running on
this, dumping to local files.
We are monitoring a link that has a possible traffic rate of a
full-duplex, 100 Mbit connection.
Sometimes we are receiving DoS attacks that cause the server to grow and
grow and grow... it doesn't appear to dump its records to the attached
client at a fast enough rate to make sure the box doesn't run out of
memory.
When the server gets in this state, it starts sending invalid records to
the client. Some of these records have incredible duration times (one had
a 135 year duration).
Today, I'm not sure what occurred, but:
01-06-07 08:43:34 0.000000 Fs 131 1.4.0.104 <-> 0.144.8.0 991914214 180000 180000 3459164706 CON
0 duration. The other IP involved was 0.144.8.0 (not possible). The 991M
src packets is in 30 seconds..., but only 180K.
For a number of minutes after an event like this, the ra client has
trouble getting anything meaningful out of the server... often only
outputting flow record files (for 30 seconds) with very few flows in
them, then the next file with lots of flows, and so on.
Attached is the flow record file.
Let me know how I can help you track down this problem.
Thanks Carter
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Chris Newton, Systems Analyst
Computing Services, University of New Brunswick
newton at unb.ca 506-447-3212(voice) 506-453-3590(fax)
"The best way to have a good idea is to have a lot of ideas."
Linus Pauling (1901 - 1994) US chemist