My hourly argus data files from time to time freeze up any ra tools that touch them
Peter Van Epp
vanepp at sfu.ca
Thu Sep 23 17:17:30 EDT 2010
On Thu, Sep 23, 2010 at 03:43:08PM -0400, The Branches wrote:
> Carter,
>
> This issue has been happening to me for some time on several
> different hosts running argus, and I keep on upgrading to the latest
> dev version of argus and argus-clients in hopes of fixing it that
> way. I'm using argus-3.0.3.16 and argus-clients-3.0.3.17 presently
> and I just had another freeze. I have racluster and racount
> operations running every few minutes which start piling up and
> bogging down the server until I manually kill them off. I presume
> some kind of traffic is resulting in a corrupt argus data record
> that ra tools choke on, though that's only a guess. Any thoughts
> you might have on this issue would be most welcome. I could
> probably provide a sample argus data file if you like.
>
> The systems are CentOS 5.5, 32 and 64 bit.
>
> Argus runs like this
> argus -i eth0 -F /opt/nids/sensor/etc/argus.conf -P 561
> and the data is split into hourly files like this
> rasplit -X -S 127.0.0.1:561 -M time 1h -w /argus/%m/%d/eth0-%H.arg -d
It may be worth running an ra in parallel to see whether the fault is in the
input data or in rasplit. Alongside the rasplit above, run
ra -S 127.0.0.1:561 -w /argus/argus.out
and either move the argus.out file via a cron job (it will recreate itself) or
just leave it to grow (assuming disk space is available :-)) until the failure
occurs. If ra can't read the rasplit version but can read the ra version, then
it's probably an rasplit problem. If both are corrupt, then it's likely argus.
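
A rough sketch of the cron-driven move, for illustration only: the function
name, the script path in the cron line and the hourly schedule are all made
up here and are not anything argus ships.

```sh
#!/bin/sh
# Hypothetical helper: move the parallel ra capture aside with a timestamp.
# ra is expected to recreate its output file after the move (see above).
rotate_capture() {
    src=$1        # e.g. /argus/argus.out
    dest_dir=$2   # directory to keep the rotated copies in
    [ -f "$src" ] || return 0                 # nothing to rotate yet
    stamp=$(date +%Y%m%d-%H%M%S)
    mv "$src" "$dest_dir/argus.out.$stamp"
}

# Run as: rotate-argus-capture.sh /argus/argus.out /argus/parallel
# Example crontab line (script path is an assumption):
#   0 * * * * /usr/local/sbin/rotate-argus-capture.sh /argus/argus.out /argus/parallel
if [ "$#" -eq 2 ]; then
    rotate_capture "$1" "$2"
fi
```

Keeping the rotated copies on the same hourly boundary as the rasplit files
makes it easier to compare the two versions of the hour that fails.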
>
> Today the 1pm file (eth0-13.arg) was somehow left in a state my ra
> tools can't handle. For example, if I run this
> ra -X -r /argus/09/23/eth0-13.arg -nn
Two suggestions: use a time filter in the ra command to isolate a
small section of the data file that causes the hang and send it to Carter
to have a look at (assuming you can release the data). That's the easiest
method :-).
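
One way to narrow things down with a time filter is to probe the hour in
ten-minute slices and see which slice fails to finish. This is only a sketch:
the function names, the RA and LIMIT variables and the 60 second limit are
assumptions for illustration, it relies on a timeout command being installed
(GNU coreutils 7.0 or later, so possibly not a stock CentOS 5 box), and it
assumes ra's -t option accepts ranges of the form
yyyy/mm/dd.hh:mm:ss-yyyy/mm/dd.hh:mm:ss.

```sh
#!/bin/sh
# Sketch only: probe one hourly file in ten-minute slices and flag slices
# where ra does not finish within LIMIT seconds.
RA=${RA:-ra}          # ra binary to run (overridable; illustration only)
LIMIT=${LIMIT:-60}    # seconds before a slice counts as hung (arbitrary)

probe_slice() {
    # $1 = data file, $2 = time range passed to ra's -t option
    timeout "$LIMIT" "$RA" -X -r "$1" -t "$2" -nn >/dev/null 2>&1
    rc=$?
    case $rc in
        0)   echo "ok   $2" ;;
        124) echo "HANG $2" ;;          # 124 is timeout's "timed out" status
        *)   echo "rc=$rc $2" ;;        # ra failed for some other reason
    esac
}

probe_hour() {
    # $1 = data file, $2 = date as yyyy/mm/dd, $3 = two-digit hour
    for span in 00-09 10-19 20-29 30-39 40-49 50-59; do
        probe_slice "$1" "$2.$3:${span%-*}:00-$2.$3:${span#*-}:59"
    done
}

# Example: probe_hour /argus/09/23/eth0-13.arg 2010/09/23 13
# A suspect slice can then be written out for sending on, e.g.
#   ra -X -r FILE -t RANGE -w slice.arg
```

If every slice hangs, the problem is probably hit while reading the file
rather than while processing the matching records, and the time filter
won't isolate it.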
For do-it-yourself debugging, in the argus-clients source directory do
touch .debug .devel
make clobber
make
This will enable debugging. Now running
ra -X -r /argus/09/23/eth0-13.arg -nn -D 2
will generate debug messages to stdout and should indicate where in ra the
hang is occurring. Increasing the 2 gives more information. You are likely
looking for the point when the messages stop changing (indicating it is in
a loop doing something). The volume can get large, so using script(1) to
capture a copy to a file is likely a good bet.
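
To spot the point where the messages stop changing, one option is to send
the debug output to a file and count repeated lines near the end of it. The
helper name and log file name below are invented for illustration, and this
assumes the repeating debug lines are byte-identical (no changing timestamps
or counters in them), which may not hold for real ra -D output.

```sh
#!/bin/sh
# Hypothetical helper: show the most frequently repeated lines in the tail
# of a debug log; a line with a very high count is a likely loop.
loop_candidates() {
    # $1 = debug log, $2 = how many trailing lines to inspect (default 200)
    tail -n "${2:-200}" "$1" | sort | uniq -c | sort -rn | head -n 5
}

# Example (log name is arbitrary; stop ra with ^C once it appears stuck):
#   ra -X -r /argus/09/23/eth0-13.arg -nn -D 2 > ra-debug.log 2>&1
#   loop_candidates ra-debug.log
```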
How often do you get freezes? Assuming it is an argus fault (or a
corrupt data fault) it would be profitable (but expensive in both performance
and disk space :-)) to enable capturing the input records from pcap in argus
(there is a config file entry to do this). Since it's writing on the sensor
machine, it will have a large performance impact though.
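
The entry is most likely ARGUS_PACKET_CAPTURE_FILE in argus.conf; check the
sample argus.conf shipped with your version to confirm the name. The output
path below is only an example.

```sh
# In /opt/nids/sensor/etc/argus.conf (the file named on the argus command line)
# Assumed variable name; the capture path is illustrative.
ARGUS_PACKET_CAPTURE_FILE="/var/log/argus/packet.out"
```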
Peter Van Epp