little endian bug in 2.x record parsing in new clients
Jason Carr
jcarr at andrew.cmu.edu
Tue Mar 31 14:03:47 EDT 2009
I need to check the new beta clients and see if that works better. If
that doesn't work, I'll contact you off list about getting access to the
system(s) unless you have any better ideas.
On 3/30/09 9:11 AM, Carter Bullard wrote:
> Hey Jason,
> So, where are we? Are you working? Is there a chance for me to log
> onto this thing to figure it out?
>
> Carter
>
> On Mar 23, 2009, at 10:54 PM, Jason Carr wrote:
>
>> There is a call inside of the new argus-3.0 code that their older pcap
>> doesn't implement until their next OS release so I'm stuck on 2.6.
>>
>> On 3/23/09 8:06 PM, Carter Bullard wrote:
>>> Hey Jason,
>>> If argus-2.x could read the interface, argus-3.0 should be able to do it
>>> as well,
>>> since they both should be using the same libpcap library. If it can't
>>> open the
>>> interface, then maybe it linked to the wrong libpcap?
>>>
>>> argus-3.0 can open many interfaces at a time, the ARGUS_INTERFACE
>>> entries
>>> in the new ./support/Config/argus.conf file will guide you on what to
>>> do.
>>>
>>> Carter
>>>
>>> On Mar 23, 2009, at 1:55 PM, Jason Carr wrote:
>>>
>>>> Also, default uses all interfaces available to each machine. For
>>>> example, we have two 10G interfaces that have two halves of our core.
>>>> We were using the default interface for this reason. Since we can't do
>>>> that I think we can get argus to read two interfaces. If not, we'll
>>>> run two argoose (plural argus...) and hopefully rasplit can read
>>>> multiple streams. Any recommendations for that?
>>>>
>>>> On 3/23/09 12:54 PM, Carter Bullard wrote:
>>>>> Hey Jason,
>>>>> A few things you should change.
>>>>>
>>>>> All the things you have on the command line can be set in the
>>>>> argus.conf
>>>>> file, that would be a great place to set them, unless they change with
>>>>> every
>>>>> run.
>>>>>
>>>>> Don't use "-F /etc/argus.conf". Argus always reads the
>>>>> /etc/argus.conf, and
>>>>> by putting a "-F /etc/argus.conf" it is reading it twice, and thus
>>>>> setting the
>>>>> bind addresses twice, which means its opening twice, etc.... This
>>>>> is not
>>>>> good.
>>>>>
>>>>> You have the ARGUS_MONITOR_ID set in the argus.conf file, but you are
>>>>> changing it on the command line, and then you re-read the argus.conf
>>>>> file.
>>>>> What MONITOR_ID is actually being used would be interesting to know.
>>>>>
>>>>> I have no idea what "-i default" will do. Do you have a /dev/default
>>>>> device?
>>>>>
>>>>>
>>>>> On Mar 23, 2009, at 11:47 AM, Jason Carr wrote:
>>>>>
>>>>>> For argus:
>>>>>>
>>>>>> /usr/sbin/argus -P 561 -B 10.10.10.103 -e CPU-2c1 -i default -F
>>>>>> /etc/argus.conf
>>>>>>
>>>>>> argus.conf:
>>>>>> ARGUS_DAEMON=no
>>>>>> ARGUS_MAX_INSTANCES=1
>>>>>> ARGUS_SET_PID=no
>>>>>> ARGUS_PID_FILENAME=/var/run/argus.pid
>>>>>> ARGUS_MONITOR_ID=`hostname`
>>>>>> ARGUS_ACCESS_PORT=561
>>>>>> ARGUS_BIND_IP="127.0.0.1"
>>>>>> ARGUS_GO_PROMISCUOUS=yes
>>>>>> ARGUS_FLOW_STATUS_INTERVAL=60
>>>>>> ARGUS_MAR_STATUS_INTERVAL=300
>>>>>> ARGUS_GENERATE_RESPONSE_TIME_DATA=no
>>>>>> ARGUS_GENERATE_JITTER_DATA=no
>>>>>> ARGUS_GENERATE_MAC_DATA=no
>>>>>> ARGUS_CAPTURE_DATA_LEN=128
>>>>>> ARGUS_FILTER_OPTIMIZER=yes
>>>>>> ARGUS_FILTER=""
>>>>>>
>>>>>> argus runs on an internal IP address 10.10.10.103. The connection
>>>>>> from
>>>>>> the client machine running rasplit comes from an external IP (lets
>>>>>> call that argus-sensor) port forwarded to 10.10.10.103. Lets just
>>>>>> call
>>>>>> it argus-collector. argus-collector has the following configuration:
>>>>>>
>>>>>> ./rasplit -S argus-sensor:561 -M time 5m -w
>>>>>> /data/argus/core/%Y/%m/%d/%H/core.%Y.%m.%d.%H.%M.%S.argus
>>>>>
>>>>>
>>>>> The best way to talk about this is that argus "binds" to 10.10.10.103.
>>>>> That is much different than
>>>>> "argus runs on 10.10.10.103", especially in multi-processor/multi-core
>>>>> environments.
>>>>>
>>>>> I'm concerned about the port forwarding, why not just bind argus to
>>>>> the
>>>>> 172.19.41.41 address?
>>>>> I think when you are having problems, that all this complexity is just
>>>>> going
>>>>> to be in the way?
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> So essentially, we're doing this:
>>>>>>
>>>>>>
>>>>>> +---------------+ +---------------+
>>>>>> | 10.10.10.103 |<->| 10.10.10.1 |
>>>>>> | argus TCP/561 | | port forward |
>>>>>> +---------------+ +---------------+
>>>>>> ^
>>>>>> | tcp
>>>>>> \/ 561
>>>>>> +---------------+ +---------------+
>>>>>> | 172.19.41.42 |<->| 172.19.41.41 |
>>>>>> | rasplit | | port forward |
>>>>>> +---------------+ +---------------+
>>>>>>
>>>>>> 172.19.41.41, 10.10.10.1, 10.10.10.103 are all the same physical unit
>>>>>> and are all PowerPC arcitecture. 10.10.10.1 and 172.19.41.41 is the
>>>>>> same machine.
>>>>>>
>>>>>> 172.19.41.42 is an x86_64 machine.
>>>>>>
>>>>>>
>>>>>> Hopefully this makes sense. If it doesn't let me know and I'll figure
>>>>>> something else out :)
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/23/09 9:56 AM, Carter Bullard wrote:
>>>>>>> Hey Jason,
>>>>>>> You are printing the suser and duser fields (not sdata and ddata).
>>>>>>> The spacing is there because you are specifying the length to be
>>>>>>> 128, so its 128. If you want to parse the strings, change the field
>>>>>>> separator, using the "-c <char>" option (see the ra.1 man page).
>>>>>>>
>>>>>>> Regardless, depending on the encoding of the data buffer, we
>>>>>>> should be 'quoting' the strings if you're printing in ascii, so I'll
>>>>>>> take
>>>>>>> a look at that.
>>>>>>>
>>>>>>> The debug messages you sent indicate that the Bivio is closing
>>>>>>> the connection:
>>>>>>>
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242288 ArgusHandleDatum
>>>>>>>>> (0xadf1b0, 0x7c26a38) received closing Mar
>>>>>>>
>>>>>>> But, if radium() is crashing after that, then that is not good. It
>>>>>>> should recoverso I'll look into the memory problem.
>>>>>>>
>>>>>>> The kind of problems that you are describing, (other than the
>>>>>>> crashes)
>>>>>>> suggest that you may not have your system architected the way
>>>>>>> you would like. The only way to understand why you are having
>>>>>>> problems, is for you to completely describe what you have done.
>>>>>>> argus() running on X listening on port abcde with these options
>>>>>>> (argus.conf)
>>>>>>> radium() running on Y attaching to argus() on system X:port abcde,
>>>>>>> listening
>>>>>>> on port vwxyz, with these options (radium.conf). Clients running
>>>>>>> on Z
>>>>>>> attaching to radium on port vwyxz (etc).
>>>>>>>
>>>>>>> I need to see all command line options, to make sure that those are
>>>>>>> correct.
>>>>>>>
>>>>>>> When there is a problem, such as in this case, what syslog
>>>>>>> error/messages
>>>>>>> are generated by each component, etc....., need to be checked.
>>>>>>>
>>>>>>> And, if all is lost, if I can access the system, I can usually
>>>>>>> get it
>>>>>>> going
>>>>>>> in short time.
>>>>>>>
>>>>>>>
>>>>>>> Sorry for the delayed response,
>>>>>>>
>>>>>>> Carter
>>>>>>>
>>>>>>> On Mar 19, 2009, at 4:15 PM, Jason Carr wrote:
>>>>>>>
>>>>>>>> Now that I'm directly connecting to the argus server from the
>>>>>>>> client,
>>>>>>>> rasplit from 3.0.2 works just fine. 3.0.2.beta.2 crashes.
>>>>>>>>
>>>>>>>> Does it take awhile for rasplit to close the file after it's been
>>>>>>>> completely written to? I've had times were ra only will read ~800
>>>>>>>> flows and exit because there was an error:
>>>>>>>>
>>>>>>>> ra[1576]: 09:29:10.582079 ArgusReadStreamSocket (0xf2b0) record
>>>>>>>> length
>>>>>>>> is zero
>>>>>>>>
>>>>>>>> With the newer clients, we can't use the -s user -d s128:d128 like
>>>>>>>> before. Now I'm using -s ddata:128 -s sdata:128. This is
>>>>>>>> interesting
>>>>>>>> because it looks like it still leaves 128 characters of space
>>>>>>>> for the
>>>>>>>> source even if the source didn't need it. I also noticed that the
>>>>>>>> data
>>>>>>>> fields don't have quotes around them any more. Can these be
>>>>>>>> enabled/disabled (maybe I had them enabled on my older box)?
>>>>>>>>
>>>>>>>> - Jason
>>>>>>>>
>>>>>>>> On 3/18/09 12:54 PM, Jason Carr wrote:
>>>>>>>>> Connecting directly yields this error after a minute or so of
>>>>>>>>> running:
>>>>>>>>>
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.241519
>>>>>>>>> ArgusReadStreamSocket
>>>>>>>>> (0x7ba4010) read 1312 bytes
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.241643
>>>>>>>>> ArgusReadStreamSocket
>>>>>>>>> (0x7ba4010) read 1448 bytes
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.241778
>>>>>>>>> ArgusReadStreamSocket
>>>>>>>>> (0x7ba4010) read 2088 bytes
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.241933
>>>>>>>>> ArgusReadStreamSocket
>>>>>>>>> (0x7ba4010) read 1448 bytes
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242006
>>>>>>>>> ArgusReadStreamSocket
>>>>>>>>> (0x7ba4010) read 880 bytes
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242168
>>>>>>>>> ArgusReadStreamSocket
>>>>>>>>> (0x7ba4010) read 1448 bytes
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242243
>>>>>>>>> ArgusReadStreamSocket
>>>>>>>>> (0x7ba4010) read 1128 bytes
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242263
>>>>>>>>> ArgusCopyRecordStruct
>>>>>>>>> (0x7ba4600) retn 0xf8007710
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242277 RaProcessRecord
>>>>>>>>> (0x7c06010, 0x7ba4600) returning
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242288 ArgusHandleDatum
>>>>>>>>> (0xadf1b0, 0x7c26a38) received closing Mar
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242300
>>>>>>>>> ArgusCloseInput(0x7ba4010) closing
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242310
>>>>>>>>> ArgusWriteConnection:
>>>>>>>>> write(6, 0x49d7a9, 6)
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242330
>>>>>>>>> ArgusWriteConnection(0x7ba4010, 0x49d7a9, 6) returning 6
>>>>>>>>> radium[29333.5019cf4100000000]: 12:45:44.242368
>>>>>>>>> ArgusOutputProcess()
>>>>>>>>> received mar 0xf8007710 totals 129200 count 0 remaining 0
>>>>>>>>> radium[29333.e096c4076b7f0000]: 12:45:44.242381
>>>>>>>>> ArgusCloseInput(0x7ba4010) done
>>>>>>>>> radium[29333.5019cf4100000000]: 12:45:44.242385
>>>>>>>>> ArgusCopyRecordStruct
>>>>>>>>> (0xf8007710) retn 0xb2aa60
>>>>>>>>> radium[29333.5019cf4100000000]: 12:45:44.242448
>>>>>>>>> ArgusWriteOutSocket
>>>>>>>>> (0xacd0a0, 0xacec40) 0 records waiting. returning 128
>>>>>>>>> radium[29333.50d9d15700000000]: 12:45:44.932325
>>>>>>>>> ArgusConnectRemote(0x7ba4010) starting
>>>>>>>>> radium[29333.50d9d15700000000]: 12:45:44.932362 Trying
>>>>>>>>> 172.19.41.41
>>>>>>>>> port
>>>>>>>>> 561 Expecting Argus records
>>>>>>>>> radium[29333.50d9d15700000000]: 12:45:44.932659 connected
>>>>>>>>> radium[29333.50d9d15700000000]: 12:45:44.932675
>>>>>>>>> ArgusGetServerSocket
>>>>>>>>> (0x7ba4010) returning 6
>>>>>>>>> radium[29333.50d9d15700000000]: 12:45:44.944349
>>>>>>>>> ArgusReadConnection()
>>>>>>>>> read 16 bytes
>>>>>>>>> *** glibc detected *** radium: double free or corruption (out):
>>>>>>>>> 0x0000000000b3e900 ***
>>>>>>>>> ======= Backtrace: =========
>>>>>>>>> /lib/libc.so.6[0x7f6b0709508a]
>>>>>>>>> /lib/libc.so.6(cfree+0x8c)[0x7f6b07098c1c]
>>>>>>>>> radium[0x44c8d9]
>>>>>>>>> radium[0x4579c8]
>>>>>>>>> radium[0x453511]
>>>>>>>>> radium[0x4602e0]
>>>>>>>>> /lib/libpthread.so.0[0x7f6b073893f7]
>>>>>>>>> /lib/libc.so.6(clone+0x6d)[0x7f6b070f8b3d]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3/18/09 1:50 AM, Jason Carr wrote:
>>>>>>>>>> The Bivio unit has 12 cores, each has a, for a lack of a better
>>>>>>>>>> phrase,
>>>>>>>>>> virtual machine, that is connected to the head node via an
>>>>>>>>>> internal
>>>>>>>>>> network. I just added an iptables rule to forward the connection
>>>>>>>>>> from
>>>>>>>>>> radium on the x86_64 box to the Bivio argus server.
>>>>>>>>>>
>>>>>>>>>> If that works long term I'll try out the various rasplit and
>>>>>>>>>> rastream
>>>>>>>>>> stuff and let you know how it turns out.
>>>>>>>>>>
>>>>>>>>>> Thanks again,
>>>>>>>>>>
>>>>>>>>>> - Jason
>>>>>>>>>>
>>>>>>>>>> On 3/16/09 2:31 PM, Carter Bullard wrote:
>>>>>>>>>>> Jason,
>>>>>>>>>>> Very confusing, why do you run client software on the bivio?
>>>>>>>>>>> Cater
>>>>>>>>>>>
>>>>>>>>>>> On Mar 16, 2009, at 2:02 PM, Jason Carr wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yep, sorry I meant -S host:port.
>>>>>>>>>>>>
>>>>>>>>>>>> What I'm trying to do is to store the incoming data stream on
>>>>>>>>>>>> disk.
>>>>>>>>>>>> Currently we have the files up into the directory structure
>>>>>>>>>>>> like
>>>>>>>>>>>> this:
>>>>>>>>>>>>
>>>>>>>>>>>> /data/argus/core/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
>>>>>>>>>>>>
>>>>>>>>>>>> And split it up every 5 minutes. So we get 12 5 minute files in
>>>>>>>>>>>> each
>>>>>>>>>>>> hour. This works out well for us since they are much smaller
>>>>>>>>>>>> files
>>>>>>>>>>>> than running through an hour or days worth of data.
>>>>>>>>>>>>
>>>>>>>>>>>> -D 4 output is this on radium:
>>>>>>>>>>>>
>>>>>>>>>>>> *a ton of other lines*
>>>>>>>>>>>> radium[19615.50999a4000000000]: 13:34:16.865875
>>>>>>>>>>>> ArgusWriteOutSocket
>>>>>>>>>>>> (0xacd0a0, 0xacec40) 0 records waiting. returning 328
>>>>>>>>>>>> radium[19615.e096fd4f7e7f0000]: 13:34:16.866297
>>>>>>>>>>>> ArgusReadStreamSocket
>>>>>>>>>>>> (0x4ff34010) read 400 bytes
>>>>>>>>>>>> radium[19615.e096fd4f7e7f0000]: 13:34:16.866321
>>>>>>>>>>>> ArgusCopyRecordStruct
>>>>>>>>>>>> (0x4ff34600) retn 0x4801f310
>>>>>>>>>>>> radium[19615.e096fd4f7e7f0000]: 13:34:16.866336 RaProcessRecord
>>>>>>>>>>>> (0x4ff96010, 0x4ff34600) returning
>>>>>>>>>>>> radium[19615.50999a4000000000]: 13:34:16.866373
>>>>>>>>>>>> ArgusCopyRecordStruct
>>>>>>>>>>>> (0x4801f310) retn 0xb06310
>>>>>>>>>>>> radium[19615.50999a4000000000]: 13:34:16.866392
>>>>>>>>>>>> ArgusGenerateRecord
>>>>>>>>>>>> (0xb06310, 0) old len 400 new len 380
>>>>>>>>>>>> radium[19615.50999a4000000000]: 13:34:16.866411
>>>>>>>>>>>> ArgusWriteOutSocket
>>>>>>>>>>>> (0xacd0a0, 0xacec40) 0 records waiting. returning 400
>>>>>>>>>>>> radium[19615.e096fd4f7e7f0000]: 13:34:16.868431
>>>>>>>>>>>> ArgusReadStreamSocket
>>>>>>>>>>>> (0x4ff34010) read 100 bytes
>>>>>>>>>>>>
>>>>>>>>>>>> I noticed that the file wasn't moved when that happened. I'm
>>>>>>>>>>>> starting
>>>>>>>>>>>> to think it has nothing to do with the file moving.
>>>>>>>>>>>>
>>>>>>>>>>>> I upgraded the bivio software from "argus-clients-3.0.2" to
>>>>>>>>>>>> "argus-clients-3.0.2.beta.2" and it's consistently crashing
>>>>>>>>>>>> now.
>>>>>>>>>>>> 3.0.2
>>>>>>>>>>>> never crashed previously.
>>>>>>>>>>>>
>>>>>>>>>>>> *** glibc detected *** radium: double free or corruption (out):
>>>>>>>>>>>> 0x3350a008 ***
>>>>>>>>>>>> ======= Backtrace: =========
>>>>>>>>>>>> /lib/libc.so.6[0xfd4a964]
>>>>>>>>>>>> /lib/libc.so.6(__libc_free+0xc8)[0xfd4ded8]
>>>>>>>>>>>> radium[0x1004d500]
>>>>>>>>>>>> radium[0x10055ac4]
>>>>>>>>>>>> radium[0x10062ebc]
>>>>>>>>>>>> /lib/libpthread.so.0[0xfe55994]
>>>>>>>>>>>> /lib/libc.so.6(__clone+0x84)[0xfdb7794]
>>>>>>>>>>>>
>>>>>>>>>>>> - Jason
>>>>>>>>>>>>
>>>>>>>>>>>> On 3/16/09 12:50 PM, Carter Bullard wrote:
>>>>>>>>>>>>> Hey Jason,
>>>>>>>>>>>>> Its not "-R host:port" its "-S host:port". I'm sure that was a
>>>>>>>>>>>>> slip?
>>>>>>>>>>>>> So things look like they are working for rasplit() then, and
>>>>>>>>>>>>> with
>>>>>>>>>>>>> your Bivio box.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I cannot replicate your problem on any machine, and I've
>>>>>>>>>>>>> tested it on 8 different machines now. I do not recommend
>>>>>>>>>>>>> radium writing out to a file when there are so many other ways
>>>>>>>>>>>>> to manage the data streams. Why are you doing this? or rather
>>>>>>>>>>>>> What are you trying to do?
>>>>>>>>>>>>>
>>>>>>>>>>>>> There are a lot of ways to misconfigure a radium. Having it
>>>>>>>>>>>>> connect to itself is the common bad thing, and we try to catch
>>>>>>>>>>>>> this, but there are clever ways of getting around that ;o)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Usually radium doesn't update a moved file, because there
>>>>>>>>>>>>> are no new records coming in. It doesn't know to do anything
>>>>>>>>>>>>> until a data record shows up, so more than likely your server
>>>>>>>>>>>>> connection is poorly configured.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The best way to see what's going on, is to run with something
>>>>>>>>>>>>> like "-D 4" to see the debug information, to assure that you
>>>>>>>>>>>>> are attaching properly, that data is coming in, and that it
>>>>>>>>>>>>> see's that the file is gone.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So run with -D 4 and then delete the file, and see what it
>>>>>>>>>>>>> sez.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mar 16, 2009, at 11:36 AM, Jason Carr wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Carter,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For me, rasplit only wrote one file for 5 minutes after
>>>>>>>>>>>>>> giving
>>>>>>>>>>>>>> rasplit
>>>>>>>>>>>>>> the -R x.x.x.x:561 option. It doesn't exit after the 5
>>>>>>>>>>>>>> minutes
>>>>>>>>>>>>>> anymore, not sure why though. Is there a way to do that?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I still have the issue of radium not letting go of the file
>>>>>>>>>>>>>> when
>>>>>>>>>>>>>> it's
>>>>>>>>>>>>>> been moved.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Jason
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 3/12/09 12:51 PM, Carter Bullard wrote:
>>>>>>>>>>>>>>> Hey Jason,
>>>>>>>>>>>>>>> Hmmmm, so I can't replicate your problem here on any
>>>>>>>>>>>>>>> machine.
>>>>>>>>>>>>>>> If you can run rasplit() under gdb, (and you can, of course
>>>>>>>>>>>>>>> generate
>>>>>>>>>>>>>>> files with a shorter granularity), maybe you will catch the
>>>>>>>>>>>>>>> rasplit()
>>>>>>>>>>>>>>> exiting. Be sure and ./configure and compile with these tag
>>>>>>>>>>>>>>> files
>>>>>>>>>>>>>>> in the root directory:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> % touch .devel .debug
>>>>>>>>>>>>>>> % ./configure;make clean;make
>>>>>>>>>>>>>>> % gdb bin/rasplit
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For rasplit() do you have write permission to the entire
>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>> structure that you are trying to write to?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For rastream(), you do need the $srcid in the path, at least
>>>>>>>>>>>>>>> right
>>>>>>>>>>>>>>> now
>>>>>>>>>>>>>>> you do, so you should add it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> rastream ....... -w
>>>>>>>>>>>>>>> /data/argus/core/\$srcid/%Y/%m/%d/argus.........
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mar 12, 2009, at 1:38 AM, Jason Carr wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm running all of this on a x86_64 Linux box.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When I run rasplit like this:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> # rasplit -S argus-source:561 -M time 5m -w
>>>>>>>>>>>>>>>> /data/argus/core/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It only creates one 5 minute file and then exits. Can I
>>>>>>>>>>>>>>>> make it
>>>>>>>>>>>>>>>> just
>>>>>>>>>>>>>>>> keep listening and keep on writing the 5 minute files?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> rastream segfaults when I run it:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> # rastream -S argus-source:561 -M time 5m -w
>>>>>>>>>>>>>>>> '/data/argus/core/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S' -B
>>>>>>>>>>>>>>>> 15s -f
>>>>>>>>>>>>>>>> ~/argus/test.sh
>>>>>>>>>>>>>>>> rastream[5979]: 01:34:40.205530 output string requires
>>>>>>>>>>>>>>>> $srcid
>>>>>>>>>>>>>>>> Segmentation fault
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We've historically always used radium to write to a file
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> move
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> file every 5 minutes. radium has always just recreates the
>>>>>>>>>>>>>>>> output
>>>>>>>>>>>>>>>> file
>>>>>>>>>>>>>>>> once it's been moved. It is a little bit of a hack and if
>>>>>>>>>>>>>>>> rastream/rasplit already does it, that would be awesome.
>>>>>>>>>>>>>>>> I'll use
>>>>>>>>>>>>>>>> either at this point :)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Jason
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Carter Bullard wrote:
>>>>>>>>>>>>>>>>> Hey Jason,
>>>>>>>>>>>>>>>>> I'm answering on the mailing list, so we capture the
>>>>>>>>>>>>>>>>> thread.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Glad to hear that your argus-2.x data on your Bivio box is
>>>>>>>>>>>>>>>>> working well for you. Sorry about your radium() file
>>>>>>>>>>>>>>>>> behavior.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I still can't duplicate your error, but I'll try again
>>>>>>>>>>>>>>>>> later
>>>>>>>>>>>>>>>>> today.
>>>>>>>>>>>>>>>>> What kind of machine is your radium() running on?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To get rasplit to generate the files run:
>>>>>>>>>>>>>>>>> rasplit -S radium -M time 5m \
>>>>>>>>>>>>>>>>> -w /data/argus/core/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> you can run it as a daemon with the "-d" option. Some
>>>>>>>>>>>>>>>>> like to
>>>>>>>>>>>>>>>>> run
>>>>>>>>>>>>>>>>> it from a script that respawns it if it dies, but its
>>>>>>>>>>>>>>>>> been a
>>>>>>>>>>>>>>>>> really
>>>>>>>>>>>>>>>>> stable
>>>>>>>>>>>>>>>>> program.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The reason you want to use rasplit(), is that it will
>>>>>>>>>>>>>>>>> actually
>>>>>>>>>>>>>>>>> clip
>>>>>>>>>>>>>>>>> argus records to ensure that all the argus data for a
>>>>>>>>>>>>>>>>> particular
>>>>>>>>>>>>>>>>> time
>>>>>>>>>>>>>>>>> frame is only in the specific timed file. So if a record
>>>>>>>>>>>>>>>>> starts at
>>>>>>>>>>>>>>>>> 12:04:12.345127 and ends at 12:06:12.363131, rasplit will
>>>>>>>>>>>>>>>>> clip
>>>>>>>>>>>>>>>>> that record into 2 records with timestamps:
>>>>>>>>>>>>>>>>> 12:04:12.345127 - 12:05:00.000000
>>>>>>>>>>>>>>>>> 12:05:00.000000 - 12:06:12.363131
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This makes it so you can process, graph the files, and you
>>>>>>>>>>>>>>>>> don't
>>>>>>>>>>>>>>>>> have to worry about whether you've got all the data or
>>>>>>>>>>>>>>>>> not.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Argus does a good job keeping its output records in time
>>>>>>>>>>>>>>>>> order, but if you're processing multiple streams of data,
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> time order can get a bit skewed. So that when you pull the
>>>>>>>>>>>>>>>>> file
>>>>>>>>>>>>>>>>> out from underneath radium() you can still get records
>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>> have data that should be in another file.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To compress the data, have a cron job compress the
>>>>>>>>>>>>>>>>> files say
>>>>>>>>>>>>>>>>> an hour later.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If you want to experiment, use rastream(), which is
>>>>>>>>>>>>>>>>> rasplit()
>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>> the addition feature to run a shell script against files
>>>>>>>>>>>>>>>>> after
>>>>>>>>>>>>>>>>> they
>>>>>>>>>>>>>>>>> close.
>>>>>>>>>>>>>>>>> You need to tell rastream() how long to wait before
>>>>>>>>>>>>>>>>> running
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> script,
>>>>>>>>>>>>>>>>> so you know that all the records for a particular time
>>>>>>>>>>>>>>>>> period
>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>> "in".
>>>>>>>>>>>>>>>>> If you are collecting netflow records, that time period
>>>>>>>>>>>>>>>>> could be
>>>>>>>>>>>>>>>>> hours
>>>>>>>>>>>>>>>>> and in some cases days (so don't use rastream for netflow
>>>>>>>>>>>>>>>>> data).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> rastream -S radium -M time 5m \
>>>>>>>>>>>>>>>>> -w /data/argus/core/%Y/%m/%d/argus.%Y.%m.%d.%H.%M.%S \
>>>>>>>>>>>>>>>>> -B 15s -f /usr/local/bin/rastream.sh
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So the "-B 15s" option indicates that rastream() should
>>>>>>>>>>>>>>>>> close
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> process its files 15 seconds after a 5 minute boundary is
>>>>>>>>>>>>>>>>> passed.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The sample rastream.sh in ./support/Config compresses
>>>>>>>>>>>>>>>>> files. You
>>>>>>>>>>>>>>>>> can do anything you like in there, but do test ;o)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The shell script will be passed a single parameter, the
>>>>>>>>>>>>>>>>> full
>>>>>>>>>>>>>>>>> pathname
>>>>>>>>>>>>>>>>> of the file that is being closed.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have not done extensive testing on this specific
>>>>>>>>>>>>>>>>> version of
>>>>>>>>>>>>>>>>> rastream(),
>>>>>>>>>>>>>>>>> but something like it has been running for years in a
>>>>>>>>>>>>>>>>> number of
>>>>>>>>>>>>>>>>> sites,
>>>>>>>>>>>>>>>>> so please give it a try, and if you have any problems,
>>>>>>>>>>>>>>>>> holler!!!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mar 11, 2009, at 12:18 AM, Jason Carr wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Pretty simple file:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> RADIUM_DAEMON=no
>>>>>>>>>>>>>>>>>> RADIUM_MONITOR_ID=4
>>>>>>>>>>>>>>>>>> RADIUM_MAR_STATUS_INTERVAL=60
>>>>>>>>>>>>>>>>>> RADIUM_ARGUS_SERVER=bivio-01.iso.cmu.local:561
>>>>>>>>>>>>>>>>>> RADIUM_ACCESS_PORT=561
>>>>>>>>>>>>>>>>>> RADIUM_BIND_IP=127.0.0.1
>>>>>>>>>>>>>>>>>> RADIUM_OUTPUT_FILE=/data/argus/var/core.out
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> radium is ran like this: ./radium
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Does rasplit split out files into a directory structure
>>>>>>>>>>>>>>>>>> like
>>>>>>>>>>>>>>>>>> this:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> /data/argus/core/2009/03/03/22/
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> With 5 minute files that are gzipped inside of that
>>>>>>>>>>>>>>>>>> directory
>>>>>>>>>>>>>>>>>> for the
>>>>>>>>>>>>>>>>>> entire hour.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Carter Bullard wrote:
>>>>>>>>>>>>>>>>>>> Hey Jason,
>>>>>>>>>>>>>>>>>>> So I can't reproduce your situation on any of my
>>>>>>>>>>>>>>>>>>> machines
>>>>>>>>>>>>>>>>>>> here.
>>>>>>>>>>>>>>>>>>> How are you running radium()? Do you have a
>>>>>>>>>>>>>>>>>>> /etc/radium.conf
>>>>>>>>>>>>>>>>>>> file?
>>>>>>>>>>>>>>>>>>> If so can you send it?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Mar 10, 2009, at 1:21 PM, Jason Carr wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Good news, it seemed to have fixed my problem, thank
>>>>>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>>>> very
>>>>>>>>>>>>>>>>>>>> much.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Bad news, there seems to be a problem with writing
>>>>>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>>>>>> output
>>>>>>>>>>>>>>>>>>>> file.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Did the way files are written to change in this
>>>>>>>>>>>>>>>>>>>> version?
>>>>>>>>>>>>>>>>>>>> Right
>>>>>>>>>>>>>>>>>>>> now
>>>>>>>>>>>>>>>>>>>> we're telling radium to write to one file and then
>>>>>>>>>>>>>>>>>>>> every 5
>>>>>>>>>>>>>>>>>>>> minutes
>>>>>>>>>>>>>>>>>>>> move
>>>>>>>>>>>>>>>>>>>> the file. Previously, radium would recreate the file
>>>>>>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>> now it
>>>>>>>>>>>>>>>>>>>> does
>>>>>>>>>>>>>>>>>>>> not and lsof showed that the file was still open by
>>>>>>>>>>>>>>>>>>>> radium.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - Jason
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Carter Bullard wrote:
>>>>>>>>>>>>>>>>>>>>> There is a good chance that it will. But the Bivio
>>>>>>>>>>>>>>>>>>>>> part is
>>>>>>>>>>>>>>>>>>>>> still an
>>>>>>>>>>>>>>>>>>>>> unknown, so if it doesn't we'll still have work to do.
>>>>>>>>>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Mar 6, 2009, at 1:48 PM, Jason Carr wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Carter,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Do you think that this would be a fix for my problem?
>>>>>>>>>>>>>>>>>>>>>> I'll
>>>>>>>>>>>>>>>>>>>>>> try it
>>>>>>>>>>>>>>>>>>>>>> out
>>>>>>>>>>>>>>>>>>>>>> regardless... :)
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Jason
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Carter Bullard wrote:
>>>>>>>>>>>>>>>>>>>>>>> Gentle people,
>>>>>>>>>>>>>>>>>>>>>>> I have found a bug in the little-endian
>>>>>>>>>>>>>>>>>>>>>>> processing of
>>>>>>>>>>>>>>>>>>>>>>> 2.x
>>>>>>>>>>>>>>>>>>>>>>> data in
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> argus-clients-3.0.2 code, so I have uploaded a new
>>>>>>>>>>>>>>>>>>>>>>> argus-clients-3.0.2.beta.2.tar.gz file to the dev
>>>>>>>>>>>>>>>>>>>>>>> directory.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> ftp://qosient.com/dev/argus-3.0/argus-clients-3.0.2.beta.2.tar.gz
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I have tested reading data from as many argus files
>>>>>>>>>>>>>>>>>>>>>>> as I
>>>>>>>>>>>>>>>>>>>>>>> have,
>>>>>>>>>>>>>>>>>>>>>>> regardless
>>>>>>>>>>>>>>>>>>>>>>> of version, and it seems to work, ......
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I also fixed problems with racluster() results
>>>>>>>>>>>>>>>>>>>>>>> having
>>>>>>>>>>>>>>>>>>>>>>> short
>>>>>>>>>>>>>>>>>>>>>>> counts.
>>>>>>>>>>>>>>>>>>>>>>> I have gone through all the clients and revised
>>>>>>>>>>>>>>>>>>>>>>> their
>>>>>>>>>>>>>>>>>>>>>>> file
>>>>>>>>>>>>>>>>>>>>>>> closing
>>>>>>>>>>>>>>>>>>>>>>> strategies,
>>>>>>>>>>>>>>>>>>>>>>> and so ....
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> This does not fix the "-lz" issue for some
>>>>>>>>>>>>>>>>>>>>>>> ./configure
>>>>>>>>>>>>>>>>>>>>>>> runs.......
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> For those having problems, please grab this set of
>>>>>>>>>>>>>>>>>>>>>>> code
>>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>> testing.
>>>>>>>>>>>>>>>>>>>>>>> And don't hesitate to report any problems you
>>>>>>>>>>>>>>>>>>>>>>> find!!!!!
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks!!!!!
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Carter
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Carter Bullard
>>>>>>>>>>> CEO/President
>>>>>>>>>>> QoSient, LLC
>>>>>>>>>>> 150 E 57th Street Suite 12D
>>>>>>>>>>> New York, New York 10022
>>>>>>>>>>>
>>>>>>>>>>> +1 212 588-9133 Phone
>>>>>>>>>>> +1 212 588-9134 Fax
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> Carter Bullard
>>>>>>> CEO/President
>>>>>>> QoSient, LLC
>>>>>>> 150 E 57th Street Suite 12D
>>>>>>> New York, New York 10022
>>>>>>>
>>>>>>> +1 212 588-9133 Phone
>>>>>>> +1 212 588-9134 Fax
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>> Carter Bullard
>>>>> CEO/President
>>>>> QoSient, LLC
>>>>> 150 E 57th Street Suite 12D
>>>>> New York, New York 10022
>>>>>
>>>>> +1 212 588-9133 Phone
>>>>> +1 212 588-9134 Fax
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> Carter Bullard
>>> CEO/President
>>> QoSient, LLC
>>> 150 E 57th Street Suite 12D
>>> New York, New York 10022
>>>
>>> +1 212 588-9133 Phone
>>> +1 212 588-9134 Fax
>>>
>>>
>>>
>>>
>>
>
> Carter Bullard
> CEO/President
> QoSient, LLC
> 150 E 57th Street Suite 12D
> New York, New York 10022
>
> +1 212 588-9133 Phone
> +1 212 588-9134 Fax
>
>
>
>
More information about the argus
mailing list