My hourly argus data files from time to time freeze up any ra tools that touch them
carter at qosient.com
Fri Sep 24 09:08:15 EDT 2010
Hey Kevin,
This definitely looks like file corruption, and many times the cause turns out to be multiple writers into the same file. I'm sure you are checking, but make sure you don't have more than one rasplit() running.
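One quick way to check for multiple writers is to count the processes named rasplit. This is only a sketch, not part of the argus distribution: the helper name count_exact is made up for illustration, and it assumes a ps that accepts -e and -o comm=.

```shell
#!/bin/sh
# count_exact NAME: read process names on stdin, print how many equal NAME.
# (Hypothetical helper, not part of argus-clients.)
count_exact() {
    grep -cx -- "$1"
}

# One rasplit per output tree is expected; more than one suggests that
# several writers may be appending to the same hourly file.
n=$(ps -e -o comm= | count_exact rasplit)
if [ "$n" -gt 1 ]; then
    echo "WARNING: $n rasplit processes running"
else
    echo "OK: $n rasplit process(es) running"
fi
```

If the count is higher than expected, comparing the -w arguments of each process shows whether two of them really target the same files.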
How often does this happen?
If you send a sample file, that would be helpful. We have a lot of file error recovery in the ra* programs already, so hopefully we can get the ra* programs to not freeze.
Carter
-----Original Message-----
From: The Branches <branchbunch at gmail.com>
Sender: argus-info-bounces+carter=qosient.com at lists.andrew.cmu.edu
Date: Thu, 23 Sep 2010 15:43:08
To: <argus-info at lists.andrew.cmu.edu>
Subject: [ARGUS] My hourly argus data files from time to time freeze up any ra tools that touch them
Carter,
This issue has been happening to me for some time on several different
hosts running argus, and I keep on upgrading to the latest dev version
of argus and argus-clients in hopes of fixing it that way. I'm using
argus-3.0.3.16 and argus-clients-3.0.3.17 presently and I just had
another freeze. I have racluster and racount operations running every
few minutes which start piling up and bogging down the server until I
manually kill them off. I presume some kind of traffic is resulting in
a corrupt argus data record that ra tools choke on, though that's only a
guess. Any thoughts you might have on this issue would be most
welcome. I could probably provide a sample argus data file if you like.
The systems are CentOS 5.5, 32 and 64 bit.
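Until the root cause is found, the pile-up of periodic jobs described above could be contained with a small wrapper. This is a minimal sketch under assumptions: run_guarded is a hypothetical name, and it relies on flock (util-linux) and timeout (GNU coreutils) being installed, which may not be the case on older releases.

```shell
#!/bin/sh
# run_guarded LOCKFILE SECONDS COMMAND...
# Skip the run if a previous one still holds LOCKFILE (flock -n exits 1),
# and stop COMMAND if it runs longer than SECONDS (timeout exits 124).
# (Hypothetical wrapper, not part of argus-clients.)
run_guarded() {
    lock=$1
    limit=$2
    shift 2
    flock -n "$lock" timeout "$limit" "$@"
}

# Harmless demonstration; a real job might be something like
#   run_guarded /var/run/racount.lock 120 racount -r /argus/09/23/eth0-13.arg
run_guarded "${TMPDIR:-/tmp}/ra-guard-demo.lock" 5 echo "job ran"
```

This does not fix a file the ra tools cannot read; it only keeps stuck jobs from accumulating and bogging down the server.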
Argus runs like this
argus -i eth0 -F /opt/nids/sensor/etc/argus.conf -P 561
and the data is split into hourly files like this
rasplit -X -S 127.0.0.1:561 -M time 1h -w /argus/%m/%d/eth0-%H.arg -d
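To see which file a given record time maps to, the -w template can be expanded by hand. The sketch below assumes the template fields behave like strftime conversions and uses GNU date (the -d option); expand_template is a hypothetical helper name.

```shell
#!/bin/sh
# expand_template TEMPLATE TIMESTAMP: print TEMPLATE with its strftime-style
# fields filled in for TIMESTAMP (needs GNU date for the -d option).
expand_template() {
    date -d "$2" "+$1"
}

expand_template '/argus/%m/%d/eth0-%H.arg' '2010-09-23 13:00:09'
# prints /argus/09/23/eth0-13.arg
```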
Today the 1pm file (eth0-13.arg) was somehow left in a state my ra tools
can't handle. For example, if I run this
ra -X -r /argus/09/23/eth0-13.arg -nn
I get about one screen-full of output and then it freezes (CTRL-C works
to get out). The last output record to be printed to the screen is only
a few seconds after the start of the hour (13:00:09.755082). I also
tried shutting down all argus daemons and running the ra command again
to see if some weird file locking issue was behind it, but it locked up
the same. I confirmed with lsof that the data file in question was not
being interacted with by any other programs.
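A test like the one above can be repeated across many hourly files by putting a time limit on the reader. This is a rough sketch, assuming GNU timeout is available; probe is a hypothetical helper, and the "error" branch simply reports any other non-zero exit status without claiming what ra returns on a damaged file.

```shell
#!/bin/sh
# probe FILE SECONDS READER...
# Run READER with FILE appended as its last argument, discard the output,
# and classify the result: "ok", "hang" (time limit hit), or "error".
# (Hypothetical helper, not part of argus-clients.)
probe() {
    file=$1
    limit=$2
    shift 2
    timeout "$limit" "$@" "$file" >/dev/null 2>&1
    case $? in
        0)   echo "ok $file" ;;
        124) echo "hang $file" ;;
        *)   echo "error $file" ;;
    esac
}

# Harmless demonstration with cat standing in for the reader.
probe /dev/null 5 cat
```

For the file in question the call would look like `probe /argus/09/23/eth0-13.arg 60 ra -X -nn -r`, and a "hang" line would flag it as a candidate sample to send along.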
I'd sure like to get this one licked. Maybe my whole approach needs
some refinement. I'm all ears.
Thanks!
Kevin