IPFIX support timeline.

Carter Bullard via Argus-info argus-info at lists.andrew.cmu.edu
Wed Feb 10 22:28:47 EST 2016


Hey Richard,
Thanks for providing the data files that have caused you problems.
It is clear why you are having problems, but I do need to fix the bug that is causing the fault.

You are reading in lots and lots of Cisco Netflow data, and trying to create time bins on the data streams.  This data, like most netflow data sources, is wildly out of order.  In the first 10000 records, the time fluctuates between 2013/08/11.10.00.00 and 2013/09/27.15.00.00.  In the next 10000 records, the time fluctuates between 2013/08/11.04.00.00 and 2013/09/27.15.00.00, because there are flows that have huge durations, some > 4M secs.

rabins.1, to perform best, wants the data to arrive in rough time order, and the durations need to be less than the hold buffer, so that when you break down the flow records into the bin size, there are slots for the data to go.
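To make the hold buffer idea concrete, here is a minimal Python sketch of time binning with a hold buffer. It is not how rabins is implemented; the names `split_into_bins` and `HoldBufferBinner` are hypothetical, and times are plain integer seconds. It only illustrates why a flow whose duration exceeds the hold buffer, or a record that arrives late, finds no open slot for part of its data.

```python
def split_into_bins(start, duration, bin_size):
    """Split a flow [start, start + duration) into (bin_start, seconds) pieces."""
    first = (start // bin_size) * bin_size
    if duration == 0:
        return [(first, 0)]
    end = start + duration
    pieces = []
    t = first
    while t < end:
        lo = max(start, t)
        hi = min(end, t + bin_size)
        pieces.append((t, hi - lo))
        t += bin_size
    return pieces


class HoldBufferBinner:
    """Hypothetical model: bins older than (latest time seen - hold) are closed."""

    def __init__(self, bin_size, hold):
        self.bin_size = bin_size
        self.hold = hold
        self.open_bins = {}   # bin_start -> accumulated seconds
        self.flushed = []     # (bin_start, seconds), in flush order
        self.dropped = 0      # pieces whose bin was already flushed
        self.latest = None    # latest flow end time seen so far
        self.horizon = None   # bins starting before this are closed

    def add(self, start, duration):
        # Place each piece in its bin, unless that bin is already closed.
        for bin_start, secs in split_into_bins(start, duration, self.bin_size):
            if self.horizon is not None and bin_start < self.horizon:
                self.dropped += 1
                continue
            self.open_bins[bin_start] = self.open_bins.get(bin_start, 0) + secs

        # Advance the horizon from the latest end time, then flush closed bins.
        end = start + duration
        if self.latest is None or end > self.latest:
            self.latest = end
        horizon = ((self.latest - self.hold) // self.bin_size) * self.bin_size
        if self.horizon is None or horizon > self.horizon:
            self.horizon = horizon
        for b in sorted(b for b in self.open_bins if b < self.horizon):
            self.flushed.append((b, self.open_bins.pop(b)))


binner = HoldBufferBinner(bin_size=10, hold=10)
binner.add(100, 25)   # fills bins 100, 110, 120
binner.add(130, 5)    # in order, fits
binner.add(105, 3)    # late: bin 100 is already flushed, so it is dropped
binner.add(100, 40)   # duration > hold: its first two pieces are dropped
```

In this toy model, a larger `hold` keeps more bins open and fewer pieces get dropped, at the cost of holding more data in memory before anything is written.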

Argus guarantees that its output data stream is in start time order, and that the duration of the flow records is not greater than the ARGUS_FLOW_STATUS_INTERVAL.  With these constraints, a program like rabins.1 can work.
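As a rough way to see how far a converted stream is from those two constraints, the following Python sketch counts records that step backwards in start time and records whose duration exceeds a limit. The function name `check_stream`, the (start, duration) tuple layout, and the 60 second limit are assumptions for illustration only; you would first export start times and durations from your records in whatever way is convenient.

```python
def check_stream(records, max_duration):
    """Summarize ordering and duration violations in (start, duration) records."""
    out_of_order = 0     # records starting before the latest start seen
    too_long = 0         # records with duration > max_duration
    max_backstep = 0     # largest backwards jump in start time, in seconds
    latest_start = None

    for start, duration in records:
        if latest_start is not None and start < latest_start:
            out_of_order += 1
            max_backstep = max(max_backstep, latest_start - start)
        else:
            latest_start = start
        if duration > max_duration:
            too_long += 1

    return {
        "out_of_order": out_of_order,
        "too_long": too_long,
        "max_backstep": max_backstep,
    }


# Made-up sample: two records step backwards, one has a huge duration.
sample = [(100, 5), (103, 2), (101, 1), (110, 4000000), (90, 3)]
report = check_stream(sample, max_duration=60)
```

If `max_backstep` and the longest duration are both small, a modest hold buffer may be enough; if they are in the millions of seconds, as in this data, no reasonable buffer will cover them.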

With netflow data it is better to use rasplit to break the stream out into files. Once you are confident you have passed the time range of any future flow records, you can process the output files.  If you think there are errors in your netflow data (flows with durations > 4M seconds) and we can get the original netflow data, we can compare, but argus has been pretty good at converting netflow v5 and v9.

Carter

> On Feb 4, 2016, at 9:45 PM, Richard Rothwell <Richard.Rothwell at aarnet.edu.au> wrote:
> 
> Hi Carter,
> 
> I have followed up on your suggestions. No luck. And the problem is broader than IPFIX handling.
> 
> It seems radium can handle the NetFlow 9 records I am throwing at it.
> No problems there. The Argus records output file produced by the -w option has sensible contents.
> 
> FYI I am currently using nfreplay to convert a collection of IPFIX records in files, to a NetFlow 9 network stream and sending that to radium.
> This produces a 1.6Gig Argus records file.
> 
> However rabins falls over whether it is taking records directly from radium or indirectly via the Argus records file produced by radium.
> Adjusting the -B option to 300s causes rabins to fall over, but without producing any output at all.
> 
> The commands I am using are:
> 
> sudo /usr/local/sbin/radium -S cisco://any:9995 -d -P 562 
> With
> sudo /usr/local/bin/rabins -S localhost:562 -M time 10s -B 10s -w '/mnt/hgfs/centos_shared/rabins_radium.out'
> 
> OR 
> 
> sudo /usr/local/sbin/radium -S cisco://any:9995 -d -P 562 -w '/mnt/hgfs/centos_shad/radium_100_10s.out'
> With
> sudo /usr/local/bin/rabins -r '/mnt/hgfs/centos_shared/radium_100_10s.out' -M time 100s -B 100s -w '/mnt/hgfs/centos_shared/rabins_infile_100_100s_100s.out'
> 
> Etc
> 
> Regards
> 
> 
> 
> From: Carter Bullard <carter at qosient.com>
> Date: Friday, 5 February 2016 at 7:39 AM
> To: Site License <Richard.Rothwell at aarnet.edu.au>
> Cc: Argus <argus-info at lists.andrew.cmu.edu>
> Subject: Re: [ARGUS] IPFIX support timeline.
> 
> Hey Richard,
> We have preliminary support in argus-clients now for IPFIX UDP and TCP.  That needs debugging and additional support as new IEs are used.  It would be reasonable to read the IPFIX data with radium, and have rabins connect to radium to get the converted data.  That way we can figure out if any bugs are in IPFIX conversion or in record processing later on.
> 
> rabins.1 has some very specific issues with flow data coming way out of time order.  We’re going to report on time period t1-t2, and if IPFIX sends data late, rabins throws it away … could be the memory leak relates to data out of bounds ???  If so, you need to add a bit of buffering using the -B option, so that rabins doesn’t flush out a time bin, when more IPFIX data is coming.    With some implementations, you may need a “-B 300s” to make sure the data is ok.  But if you can get some guarantees from IPFIX, then the -B can be shorter.
> 
> If you have a bug report for rabins, please send it to the list.  Try using radium to convert IPFIX to argus format, check to see how out of order the flow records are, then adjust using a ‘-B delay’ option to give the IPFIX data time to show up, and then let's see if you still have blow ups or memory leaks.
> 
> Gloriad.org, an NSF IRE service provider, has a great argus -> ELK system they have said they will share.  Not sure the status of that.
> 
> Carter
> 
>> On Feb 4, 2016, at 12:20 AM, Richard Rothwell via Argus-info <argus-info at lists.andrew.cmu.edu> wrote:
>> 
>> Hi list,
>> 
>> I am investigating all of the bits needed to get network monitoring up and running for AARNET.
>> Front-end most likely would involve the ELK stack in some way with Argus providing the probes.
>> 
>> However we are interested in getting our data from the routers rather than network interfaces.
>> But we have settled on IPFIX. Feeding IPFIX flows into the Argus rabins client seems to work, sort of.
>> 
>> There are 3 issues I need to address.
>> 	• When will proper IPFIX support be available?
>> 	• What are the limitations of feeding IPFIX flows into the front end of rabins when it expects NetFlow 9? (I’m just the programmer, not the network expert.)
>> 	• Feeding IPFIX data into rabins causes it to blow up pretty quick with a major memory leak. I have studied this with heaptracker, but no definite conclusion yet.
>> Regards from Richard
> 
