argus user data buffer analysis

Oguz Yarimtepe comp.ogz at gmail.com
Sun Feb 8 04:47:37 EST 2009


On Tue, 2009-02-03 at 10:46 -0500, Carter Bullard wrote:
> Hey Oguz,

Hi, Carter

> We have undocumented programs in the argus-3.0.0 distribution
> that do something similar to what you are interested in, I think.
> These are undocumented because it's a bit complex, and I haven't
> had a chance to describe them yet.
> 
Sorry for my late reply. I have only just found time to look into this.
> 
> If you are interested in helping to make these types of programs
> useful, I'll be more than happy to describe how they work, and
> share my experiences and help to make them better.

I don't have much time to contribute to code development, but I can use
the tools and share my experiences, any bugs I find, etc. After a while
I could also write some documentation for the other people who will use
them.
> 
> The undocumented program rauserdata() will analyze the user
> data portion of argus records, and generate a signature pattern
> configuration for the protocols it encounters in the data set.
> The algorithm is very simple, and pretty powerful, in that it makes
> no assumptions about the user data.   But, you have to be careful
> with the data that you give the engine.  Below is a little background.

> I have found that for the set of protocols you listed, the
> first 32 bytes of data are all that is needed to reliably identify
> the protocol type.  This is because each of your protocols has a
> unique greeting identifier, and for the ones you listed, the
> identifiers are all in ASCII.
> 
> 
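As a rough illustration of the "first 32 bytes" idea above, here is a minimal Python sketch. The greeting table is a hand-written assumption for illustration only; it is not the signature set that rauserdata() generates, and `guess_protocol` is a hypothetical helper name.

```python
# Hypothetical greeting prefixes, chosen for illustration only.
# They are NOT the signatures produced by rauserdata().
GREETINGS = {
    b"SSH-": "ssh",
    b"GET ": "http",
    b"POST ": "http",
    b"HTTP/1.": "http",
    b"+OK": "pop3",
    b"* OK": "imap",
}


def guess_protocol(user_data: bytes, n: int = 32) -> str:
    """Guess a protocol label from the first n bytes of a flow's user data."""
    head = user_data[:n]
    for prefix, label in GREETINGS.items():
        if head.startswith(prefix):
            return label
    return "unknown"


print(guess_protocol(b"SSH-2.0-OpenSSH_5.1\r\n"))
print(guess_protocol(b"\x16\x03\x01\x00\x5a"))
```

Only the start of the buffer is examined, so a record that does not carry the beginning of the flow falls through to "unknown".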
> Because argus provides multiple status reports for long lived
> flows, not all argus records for a given flow will contain the
> "first N byte" signatures that you are seeking.  
> 

My plan for identifying the application-level protocols was to define a
characteristic for each of them. There is earlier work on this [1], but
it says that its results come from some observations, so I am not sure
about the criteria it suggests.

[1]
http://www.crc.ca/files/crc/home/research/network/system_apps/network_systems/network_security/publications/ADeMontigny_CRCTN2005003.pdf (see from page 28 onward for a quick overview)
> 
> Using racluster() on your 'primitive' argus data will usually provide
> you with the "first N bytes" of user data, so that your search for
> tokens and patterns can be reliable.
> 
> 
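To make the point about status records concrete, here is a small Python sketch of merging several records of one flow so that the earliest user data survives. It is not how racluster() is implemented; the record layout (`flow`, `start`, `bytes`, `suser` keys) and the function name are assumptions made for the example.

```python
def merge_flow_records(records):
    """Collapse per-flow status records into one entry per flow.

    records: iterable of dicts with hypothetical keys
      'flow'  - hashable flow identifier (e.g. a 5-tuple)
      'start' - record start time in seconds
      'bytes' - byte count reported by this record
      'suser' - source user data captured in this record (may be empty)
    """
    merged = {}
    # Process records in time order so the first record of a flow wins.
    for rec in sorted(records, key=lambda r: r["start"]):
        cur = merged.get(rec["flow"])
        if cur is None:
            merged[rec["flow"]] = dict(rec)
        else:
            cur["bytes"] += rec["bytes"]
            # Keep the earliest non-empty user data buffer.
            if not cur["suser"]:
                cur["suser"] = rec["suser"]
    return merged


flow = ("tcp", "10.0.0.1", 40000, "10.0.0.2", 22)
records = [
    {"flow": flow, "start": 60.0, "bytes": 900, "suser": b""},
    {"flow": flow, "start": 0.0, "bytes": 100, "suser": b"SSH-2.0-client"},
]
print(merge_flow_records(records)[flow]["suser"])
```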
> Try this out for a while to see if you get anything useful:
> 
> 
>    racluster -r /a/days/worth/of/data/of/interest/* -w /tmp/day.cache
> 
> 
>    rauserdata -r /tmp/day.cache | less

I have already installed argus 3 from source, but I didn't find the
rauserdata command. Do I need to compile it in a different way?
> 
> 
> You should get an output that is arranged by 'protocol/port' and
> you should see a set of source and destination user data buffers
> that have the "greatest likelihood" patterns for that "proto/port"
> pair.
> 
> 
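One way to picture a "greatest likelihood" pattern per proto/port pair is a per-offset majority vote over the collected buffers. The Python sketch below shows that idea only; the actual rauserdata() algorithm is not described in this thread, and the function names, the `(proto, port, user_data)` record layout and the 0.8 threshold are all assumptions.

```python
from collections import Counter, defaultdict


def consensus_pattern(buffers, n=32, min_share=0.8):
    """Per-offset majority vote over the first n bytes of each buffer.

    Offsets where no byte value reaches min_share become None (wildcard).
    """
    pattern = []
    for i in range(n):
        column = [buf[i] for buf in buffers if len(buf) > i]
        if not column:
            break  # no buffer is long enough to reach this offset
        value, count = Counter(column).most_common(1)[0]
        pattern.append(value if count / len(column) >= min_share else None)
    return pattern


def patterns_by_service(records, n=32):
    """Group (proto, port, user_data) tuples and build one pattern per group."""
    groups = defaultdict(list)
    for proto, port, data in records:
        groups[(proto, port)].append(data)
    return {key: consensus_pattern(bufs, n) for key, bufs in groups.items()}


def show(pattern):
    """Render a pattern: '?' for wildcard, '.' for non-printable bytes."""
    out = []
    for b in pattern:
        if b is None:
            out.append("?")
        elif 32 <= b < 127:
            out.append(chr(b))
        else:
            out.append(".")
    return "".join(out)


records = [
    ("tcp", 22, b"SSH-2.0-A"),
    ("tcp", 22, b"SSH-2.0-B"),
]
for key, pattern in patterns_by_service(records).items():
    print(key, show(pattern))
```

Stable bytes such as a greeting banner survive the vote, while positions that vary from flow to flow turn into wildcards.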
> ra() prints the user data buffer with an ASCII encoding by default,
> and so you should see some patterns in the buffers it outputs.
> If you see a '.', that is generally a non-printable character.
> 
> 
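The ASCII rendering described above can be imitated in a few lines of Python. This is only a sketch of the general convention (printable bytes shown as-is, everything else as '.'), not ra()'s own code, and `render_ascii` is a hypothetical name.

```python
def render_ascii(buf: bytes) -> str:
    """Show printable ASCII as-is and replace every other byte with '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in buf)


print(render_ascii(b"220 mail.example.com ESMTP\r\n"))
# prints: 220 mail.example.com ESMTP..
```

A literal '.' in the data looks the same as a replaced byte, which is why a '.' is only "generally" a non-printable character.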
> The idea is to build up configuration files of signatures using
> rauserdata(); the program raservices() will then take the
> rauserdata() output as a configuration file, and label flows with
> the tags that identify the protocol.

I will try it as soon as I manage to find the rauserdata command on my
system :)
I am testing these commands on a dataset I found on the net [2].

Do you have any other suggestions? I am also not sure whether the
results are correct. Are there any tools that show the traffic details
in a detailed view, something like MRTG? (I am not sure whether I can
use an offline tcpdump record with MRTG.)

[2]
http://www.ll.mit.edu/mission/communications/ist/corpora/ideval/data/1999data.html
> 
> 
> Give it a try, and I'd love to see/hear your comments.

Thanks.
> 
> 
> Carter
> 
> On Feb 3, 2009, at 12:59 AM, Oguz Yarimtepe wrote:
> 
> > 
> >         
> >         
> >         Depends on what you need. If you enable user data capture
> >         (the -U option on the argus) it will capture up to the first
> >         512 bytes of the user data of the flow. That may or may not
> >         give you enough information about the flow to do what you
> >         want. Note that on a fast link best results are going to
> >         occur using a DAG card as the data capture adds a fairly
> >         heavy load to the server. To display the data with ra (for
> >         instance) you need to use the -s command to add suser and
> >         duser to the output (as in
> >         
> >         ra -r argus_file -n -s +suser:512 -s +duser:512
> >         
> >         which will tack the user data on the end of the line. This
> >         of course raises a number of sticky privacy issues that you
> >         need to have considered and gotten approved by appropriate
> >         management of the link you are tapping (which may or may not
> >         be you :-)).
> >         
> >         Peter Van Epp
> > 
> > What I want to do is characterize the network traffic using
> > characteristics derived from flow information. My final decision
> > about a flow record will be whether the flow belongs to a chat
> > session, a mail transfer, an FTP connection, web browsing, ...
> > 
> > 
> > I had discovered Bro, which has identifiers for high-level
> > protocols. It supports fewer protocol families than Argus does,
> > so I was planning to go on with Argus.
> > 
> > 
> > -- 
> > Oğuz Yarımtepe
> > www.loopbacking.info
> 
> Carter Bullard
> CEO/President
> QoSient, LLC
> 150 E 57th Street Suite 12D
> New York, New York  10022
> 
> 
> +1 212 588-9133 Phone
> +1 212 588-9134 Fax