Rasplit and crashes

carter at qosient.com
Wed Jun 24 18:22:34 EDT 2009


Hey Eric,
Sorry for the late response.
It looks like rasplit is getting data faster than it can write it out to disk, and rather than throwing records away, it's just closing the connection, retrying the connection, failing, and then running into a bug.

Is the load on the machine reflecting this kind of behavior?  The bug is probably easy to fix, but dealing with the load is something we probably should work on.

If the disk can't keep up, what would be the right thing to do?  Drop records?
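
To make the "drop records" option concrete, here is a minimal sketch of one possible policy: a bounded queue that discards the oldest record when the writer falls behind. This is not how argus or rasplit is implemented; the class name DropOldestQueue, the max_records parameter, and the dropped counter are all hypothetical and exist only for illustration.

```python
from collections import deque


class DropOldestQueue:
    """Bounded FIFO that discards the oldest record when full.

    Hypothetical illustration only; not taken from argus or rasplit.
    """

    def __init__(self, max_records):
        self._records = deque()
        self._max = max_records
        self.dropped = 0  # number of records discarded so far

    def put(self, record):
        # When the queue is full, drop the oldest record to make room
        # instead of closing the connection to the data source.
        if len(self._records) >= self._max:
            self._records.popleft()
            self.dropped += 1
        self._records.append(record)

    def get(self):
        # Return the oldest queued record, or None if the queue is empty.
        return self._records.popleft() if self._records else None

    def __len__(self):
        return len(self._records)


# A reader that pushes 5 records into a queue of size 3 while the
# writer is stalled loses the 2 oldest records.
q = DropOldestQueue(max_records=3)
for i in range(5):
    q.put(i)
```

The trade-off in this sketch is that the connection stays up and the newest data survives, at the cost of silently losing older records unless the dropped counter is logged somewhere.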

Carter

-----Original Message-----
From: Eric Gustafson <subwire at gmail.com>

Date: Wed, 24 Jun 2009 09:31:36 
To: Argus<argus-info at lists.andrew.cmu.edu>
Subject: [ARGUS] Rasplit and crashes


Hey guys,
I've got another problem unrelated to Bivio, unfortunately.  We have been
trying to fully automate and organize our gathering of argus records using
rasplit, but have been seeing it "crash" after a seemingly random period of
time, where the process would just hang, giving only a minimal indication of
its dilemma.  We've tried this both running rasplit locally on each sensor,
and running rasplit on the system with our storage array, and both seem to
have the same effect.

We don't have any clue why it's doing this, given that it isn't logging in
much detail.
Here's what I see on one of the sensors when using rasplit to attach to a
local argus:
We start rasplit like this:
/usr/local/bin/rasplit -d -S localhost -M time 1h -w /argus/%Y/%m/%d/argus.%Y.%m.%d.%H
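
As an illustration of what the -w template above produces, here is a small sketch that expands the same pattern with Python's strftime. It assumes the % tokens in the template follow the usual strftime conventions; the timestamp is the one from the first failure log line, and the variable names are made up for the example.

```python
from datetime import datetime

# Output template copied from the rasplit command line above.
template = "/argus/%Y/%m/%d/argus.%Y.%m.%d.%H"

# Timestamp of the first failure line in the log (Jun 23 2009, 23:48:36).
stamp = datetime(2009, 6, 23, 23, 48, 36)

# Expand the template the way strftime would for that moment.
path = stamp.strftime(template)
print(path)  # /argus/2009/06/23/argus.2009.06.23.23
```

With "-M time 1h", every record falling in the same hour would map to the same file under this assumption, so each output file holds one hour of data.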

Here's it starting up:
Jun 23 10:13:29 snort1 rasplit[20197]: 10:13:29.043344 started
Jun 23 10:13:29 snort1 argus[3557]: connect from 127.0.0.1
Here's it "crashing":
Jun 23 23:48:36 snort1 argus[3557]: ArgusWriteOutSocket(0x8865b68) Queue Count 50001
Jun 23 23:48:50 snort1 argus[20201]: ArgusHandleClientData: ArgusWriteOutSocket failed
When running remotely and streaming records across the network, rasplit
would log a "Connection refused" message before "crashing".

The fun part is, we're running the latest argus-clients, but the argus
sensors, which have been around for a long time, are running argus 2.0.6.  My
co-workers are taking the totally understandable "if it ain't broke, don't
fix it" mentality with these.  If this is likely the cause of the bug,
however, upgrading shouldn't be a problem.
The only other quirk to our setup is that we receive a huge amount of data
(2.5 GB+ of uncompressed argus records per hour).
What do you all think?
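
For scale, the hourly volume quoted above can be converted to a sustained write rate. This is only a back-of-the-envelope sketch: it assumes decimal gigabytes (10**9 bytes) and a perfectly even flow over the hour, which real traffic will not have.

```python
# 2.5 GB of uncompressed records per hour, assuming 1 GB = 10**9 bytes.
bytes_per_hour = 2.5 * 10**9

# Average rate if the data arrived evenly across the hour.
bytes_per_second = bytes_per_hour / 3600

print(round(bytes_per_second))  # 694444
```

That works out to roughly 0.7 MB/s on average; bursts above the average are what would fill a queue.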

Thanks,
Eric
