real time flow classification

Carter Bullard carter at qosient.com
Mon Oct 8 11:09:44 EDT 2012


Hey Oguz,
You may find that 1G networks don't need much effort to monitor with argus,
but anything faster will need hardware and kernel support.  1G is manageable
because improvements in memory bandwidth and bus architecture have made packet
processing much easier.  PF_RING is great when more than one application
needs to read a single packet stream, or when memory copies of packets
are expensive, say when you are doing complete packet capture.  Any experience
you have using DNA will help you along the path.

I don't test argus outside of the collection of machines and networks that I work
with every day, so I don't test with PF_RING, or with many of the capture cards
that are discussed on the list, as I don't personally run into them.  But the whole
idea of the mailing list is to bring together the community that does use these
things, to figure out if it's reasonable.

I think people are pleased with PF_RING DNA, although it does have some
new features that need to be wrung out, so I can endorse PF_RING.

Intel cards are at the bottom of the capture card list, Myricom are very good for
the price, Napatech and Endace are expensive, but excellent capture cards.
All are much, much better than integrated ethernet interfaces, but some of those
will keep up with traditional (lightly utilized) 1G interfaces.

Because you can print comma separated fields, database importation is pretty
easy, and because we support XML outputs, it's even easier.  Native database
applications do better, but not 10x better.  If you keep the number of fields down,
and the number of key attributes down, then performance can be good.
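
As a rough sketch of the comma-separated route, the Python below loads CSV flow
records into a database table.  It assumes the records were printed by an argus
client with a comma delimiter and a title line.  The column names are a small
illustrative subset, the sample rows are invented, and sqlite3 is only a stand-in
for whatever database you actually use.

```python
import csv
import io
import sqlite3

# Invented sample of comma-separated flow records with a title line.
# Real input would come from an argus client printing with a comma delimiter.
SAMPLE_CSV = """stime,proto,saddr,sport,daddr,dport,pkts,bytes
1349708984.10,tcp,10.0.0.5,51234,10.0.0.9,80,12,4096
1349708985.25,udp,10.0.0.7,5353,224.0.0.251,5353,2,180
1349708986.40,tcp,10.0.0.5,51240,10.0.0.9,443,30,22000
"""

def load_flows(conn, text):
    """Create a small flow table and insert every CSV row; return row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flows ("
        "stime REAL, proto TEXT, saddr TEXT, sport INTEGER, "
        "daddr TEXT, dport INTEGER, pkts INTEGER, bytes INTEGER)"
    )
    # Few fields, and a single index on the attribute we expect to query by.
    conn.execute("CREATE INDEX IF NOT EXISTS flows_saddr ON flows (saddr)")
    rows = [
        (float(r["stime"]), r["proto"], r["saddr"], int(r["sport"]),
         r["daddr"], int(r["dport"]), int(r["pkts"]), int(r["bytes"]))
        for r in csv.DictReader(io.StringIO(text))
    ]
    conn.executemany("INSERT INTO flows VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
inserted = load_flows(conn, SAMPLE_CSV)
# Example query: total bytes per source address.
total_bytes_by_host = dict(
    conn.execute("SELECT saddr, SUM(bytes) FROM flows GROUP BY saddr")
)
print(inserted, total_bytes_by_host)
```

Keeping the table narrow and indexing only the column you query by is the point
here; every extra field or key costs you on insert.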

If you would like to write your own, rasql() reads data from a MySQL
database, and rasqlinsert() both inserts and reads data.  rasqlinsert() is a bit
complicated, but then it's not a trivial task once you really get into it.  The email
list is always a good place to ask.

I would suggest that you think about your data model some more.  The data you
are interested in doesn't point to a useful query taxonomy.  What problem is
your data going to solve, and how are you going to query the database to
solve it?  The database and its schema are an implementation of a data mining
strategy.  The strategy is the important part of what you seem to be doing.

I still have little understanding of what Num-SYN, Num-FIN or Num-RST reveal.
Average window size also doesn't tell me much, but instantaneous window
size with good sample rates tells me a lot.
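
To illustrate the window size point with made-up numbers: in the Python sketch
below, a flow advertises a zero window for three consecutive samples.  The
average still looks healthy, while the individual samples show the stall.  The
sample values and the fixed sampling interval are assumptions invented for this
example.

```python
from statistics import mean

# Invented advertised-window samples (bytes) for one TCP flow, taken at an
# assumed fixed sampling interval.  The receiver stalls for three samples.
window_samples = [65535, 65535, 65535, 0, 0, 0, 65535, 65535, 65535, 65535]

def zero_window_runs(samples):
    """Return (start_index, length) for each run of consecutive zero windows."""
    runs, start = [], None
    for i, w in enumerate(samples):
        if w == 0 and start is None:
            start = i                      # a stall begins here
        elif w != 0 and start is not None:
            runs.append((start, i - start))  # the stall just ended
            start = None
    if start is not None:                  # stall still open at end of data
        runs.append((start, len(samples) - start))
    return runs

average_window = mean(window_samples)
stalls = zero_window_runs(window_samples)
print(average_window, stalls)
```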

My philosophy with regard to where analytics should be implemented in the
packet-to-report pipeline (in argus or in an argus client) is to put generic
algorithmic support in argus, and do the clever work in the clients.
High performance is critical for the packet processing engine.  Once the data
is out of argus, you have literally all day to figure out what the problem is.
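
As a sketch of doing the clever work on the client side, the Python below derives
a few of the attributes from your list (packet ratio, byte ratio, average packet
length) from per-direction counters in a flow record.  The record here is a
hypothetical dictionary with per-direction packet and byte counts; how you get
those values out of your client output is left open, and the sample values are
invented.

```python
def derive_features(flow):
    """Compute ratio and average-size features from per-direction counters."""
    spkts, dpkts = flow["spkts"], flow["dpkts"]
    sbytes, dbytes = flow["sbytes"], flow["dbytes"]
    total_pkts = spkts + dpkts
    return {
        "total_num_pkt": total_pkts,
        # Average packet length over both directions; 0.0 for an empty flow.
        "ave_pkt_len": (sbytes + dbytes) / total_pkts if total_pkts else 0.0,
        # Ratios are None when the receive side saw nothing (avoids dividing by zero).
        "pkts_ratio": spkts / dpkts if dpkts else None,
        "byte_ratio": sbytes / dbytes if dbytes else None,
    }

# Invented flow record: 10 packets / 1000 bytes sent, 5 packets / 4000 bytes back.
example_flow = {"spkts": 10, "dpkts": 5, "sbytes": 1000, "dbytes": 4000}
features = derive_features(example_flow)
print(features)
```

None of this needs to run at packet rate, which is why it belongs in a client
rather than in the sensor.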

Hope this is useful, and thanks for emailing the list.  Hopefully everyone will
benefit from the exchange!


Carter 


On Oct 6, 2012, at 2:22 PM, Oguz Yarimtepe <oguzyarimtepe at gmail.com> wrote:

> Hi,
> 
> I am a Ph.D. student from Turkey. My subject is real time security visualization. Nowadays I am researching real time traffic classification. My first aim is to collect real time flow attributes in a database. It seems Argus can save to MySQL, but I may be using a non-SQL database. I am also planning to use TNAPI and PF_RING with an Intel card on a PREEMPTIVE_KERNEL. So I have some questions. It would be great if you could redirect me to some sources.
> 
> * Did you test Argus with TNAPI + PF_RING? Will it need any modifications to work with PREEMPTIVE_KERNEL and these drivers, and how can I use it to sniff a 1Gbit campus network? My card is not an Endace card; it is an Intel Corporation 82572EI Gigabit Ethernet Controller. So with my commodity hardware (a dual core Pentium 4 with 4 GB RAM, an old IBM ThinkCentre workstation) I will be trying to collect the Gbit traffic.
> 
> * I will be saving some flow attributes to the database. These attributes will be like:
>   - Total-num-pkt (The total number of packets in the flow)
>   - Ave-pkt-len (The average packet size of a flow)
>   - Pkts-sent (The number of packets sent for the flow)
>   - Send-avelen (The average send packet size of a flow)
>   - Send-var (The variance of send packets’ size)
>   - Recv-avelen (The average receive packet size of a flow)
>   - Recv-var (The variance of receive packets’ size)
>   - Var-recv-size (The variance of received packets’ size)
>   - Duration (The duration of the flow)
>   - Protocol (The protocol (TCP or UDP))
>   - Send-port (The source port of a flow)
>   - Recv-port (The destination port of a flow)
>   - Pkts-ratio (The number ratio of send and receive packets)
>   - Byte-ratio (The byte ratio of send and receive packets)
>   - Num-SYN (The number of SYN packets)
>   - Num-RST (The number of RST packets (rst))
>   - Num-FIN (The number of FIN packets (fin))
>   - Window-size (The average window size (window_size))
>   - Window-var (The variance of window size)
> 
> I know I can get most of these values. How about Num-SYN, Num-FIN and Num-RST?
> 
> 
> * What should I do to extend Argus to save to a non-SQL database?
> 
> * My plan is to apply a classifier algorithm to the attributes I get and do outlier detection. If it would be less painful, I may try to implement it directly in the Argus code, but I should mention that I prefer the time-saving option for now.
> 
> Thank you for now.
> Cheers.
> 
> -- 
> Oguz Yarimtepe <oguzyarimtepe at gmail.com>
> http://about.me/oguzy
> 
