ratop hangs server
Carter Bullard
carter at qosient.com
Mon Mar 19 16:48:33 EDT 2012
Hey Zach,
ratop.1 is a pretty complex program, but at its core it's an argus data aggregator.
Without some tweaking of the variables and options, ratop.1 can use up all the memory
on a machine tracking all the flows that it sees in the argus data stream. I'm pretty sure
that you're tracking more flows than ratop.1 can handle. If you type control-g when in
ratop.1, on the bottom line it will print some stats, such as how many flows are in the
process queue, the display queue, how many it's receiving per second, etc.
If you could get these stats when ratop.1 is in trouble that would be helpful.
When ratop.1 can't keep up with the source of argus data, queues fill up, and the argus
data source will eventually drop the connection to ratop.1. There may be a problem
when argus does this, based on your description.
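As a rough illustration of why the queues fill (this is not argus or ratop code, and the rates and queue size below are made-up numbers, not measurements from your setup), the backlog grows at the difference between the arrival rate and the rate the client can process:

```python
def seconds_until_full(arrival_rate, drain_rate, capacity):
    """Seconds until a queue holding `capacity` records fills up.

    arrival_rate and drain_rate are in records per second.
    Returns None when the consumer keeps up and the queue never fills.
    """
    backlog_growth = arrival_rate - drain_rate
    if backlog_growth <= 0:
        return None
    return capacity / backlog_growth


# Hypothetical numbers: 50k records/s arriving, 40k records/s processed,
# room for 100k queued records.
print(seconds_until_full(50_000, 40_000, 100_000))  # 10.0
```

The control-g stats are what would let you replace these guesses with real numbers.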
I normally recommend that you have a radium.1 connected to argus, and have your
ratops and other clients connect to radium, rather than argus, as this tends to protect
argus from flow control and socket problems, although connecting directly to argus should work.
There are two basic ways of controlling the amount of memory ratop.1 uses: time out
the flows very quickly, or aggregate the incoming data so that you track fewer flows.
Run ratop with a good aggregation model like:
ratop -S source -m matrix/16
And see if it lasts longer. While this view may not have all the info you're looking for,
the test will just show whether it's a memory problem or not.
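To see why aggregation cuts memory use, here is a minimal sketch in Python (not how ratop is implemented). It assumes the matrix/16 model keys each record on the source and destination addresses masked to /16 and ignores ports. The flow records and the matrix_key helper are hypothetical:

```python
import ipaddress


def matrix_key(src, dst, prefix_len=16):
    """Reduce a (src, dst) address pair to the networks that contain them."""
    src_net = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
    dst_net = ipaddress.ip_network(f"{dst}/{prefix_len}", strict=False)
    return (str(src_net), str(dst_net))


# Hypothetical flow records: (src addr, src port, dst addr, dst port)
flows = [
    ("10.1.2.3", 40001, "192.168.5.9", 80),
    ("10.1.2.4", 40002, "192.168.5.9", 80),
    ("10.1.77.8", 40003, "192.168.200.1", 443),
    ("10.2.0.1", 40004, "192.168.5.9", 80),
]

per_flow = set(flows)  # one entry tracked per distinct flow
aggregated = {matrix_key(src, dst) for src, _, dst, _ in flows}

print(len(per_flow), "flow keys ->", len(aggregated), "matrix/16 keys")
```

Four distinct flows collapse into two aggregated entries here. On a busy link the reduction is usually much larger, which is why the test above is a quick way to check for a memory problem.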
Carter
On Mar 19, 2012, at 3:15 PM, Zach Brown wrote:
> Using:
> argus-3.0.5.11
> argus-clients-3.0.5.35
>
> I have argus deployed as a virtual machine on an Endace 10G probe capturing data from a vdag interface.
> I have the latest dag modules installed as well as the latest lib_pcap. It's capturing about 2Gbps of bidirectional traffic.
>
> Everything seems to work great when I'm using the normal ra* tools except for the ratop command for an extended period. It works but appears to slow down quickly and seems to stop at some point and then from that point on, argus won't accept any new connections. There's no error in the logs or really any indicator of what the problem is.
>
> if I start ratop (from another machine) and quit within 5-6 seconds I have no issues... If I leave it run longer than that, it eventually hangs (argus not ratop) and the only thing I can do is kill -9 argus to get it to stop.
>
> Anything I can do to debug and provide more information?