argus user data buffer analysis

Carter Bullard carter at qosient.com
Mon Feb 9 14:29:56 EST 2009


Hey Oguz,
I thought rauserdata.c and raservices.c were in the
argus-clients-3.0.1 distribution, but they are not.  I will add them
today, and release a new distro on Wed.  Please remind me if I let this
lapse, as this week will be busy for me.

I looked at your references, and yes to all: these are the types of
techniques I would like argus data to support.  We already support
most of what Vern's article mentioned with regard to interpacket
arrival times, and the data needed to do what Annie De Montigny-Leboeuf
is/was doing is either in argus or in Gargoyle (which means it should
be in argus sometime this year), with a few exceptions.

How would you guys like to proceed in this area?  Do you want to build
some specific examples of classification?

Hope all is well, and I look forward to working with you and others on
this very interesting topic!!!

Carter

On Feb 8, 2009, at 4:47 AM, Oguz Yarimtepe wrote:

> On Tue, 2009-02-03 at 10:46 -0500, Carter Bullard wrote:
>> Hey Oguz,
>
> Hi, Carter
>
>> We have undocumented programs in the argus-3.0.0 distribution
>> that do something similar to what you are interested in, I think.
>> These are undocumented because it's a bit complex, and I haven't
>> had a chance to describe them yet.
>>
> Sorry for my late reply. I have only now found time to check this
> issue.
>>
>> If you are interested in helping to make these types of programs
>> useful, I'll be more than happy to describe how they work, and
>> share my experiences and help to make them better.
>
> I don't have much time to contribute to code development, but I can
> use the tool and share my experiences, the bugs I find, etc. After a
> while I can also write some documentation for other people who will
> use it.
>>
>> The undocumented program rauserdata() will analyze the user
>> data portion of argus records, and generate a signature pattern
>> configuration for the protocols it encounters in the data set.
>> The algorithm is very simple, and pretty powerful, in that it makes
>> no assumptions about the user data.  But you have to be careful
>> with the data that you give the engine.  Below is a little
>> background.
>
>> I have found that for the set of protocols you listed, the
>> first 32 bytes of data are all that is needed to reliably identify
>> the protocol type.  This is because each of your protocols has a
>> unique greeting identifier, and for the ones you listed, the
>> identifiers are all in ASCII.
>>
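To illustrate the "first 32 bytes" idea above, here is a minimal Python
sketch. It is not taken from the argus sources. The SIGNATURES table and
the classify() helper are hypothetical names, and the prefixes are only
a few well-known ASCII greetings chosen for illustration.

```python
# Hypothetical signature table: protocol label -> ASCII prefixes that
# commonly open a conversation. Illustrative only, not argus's own table.
SIGNATURES = {
    "http": (b"GET ", b"POST ", b"HEAD ", b"HTTP/1."),
    "ssh": (b"SSH-",),
    "pop3": (b"+OK",),
    "imap": (b"* OK",),
}


def classify(suser: bytes, duser: bytes, n: int = 32) -> str:
    """Label a flow from the first n bytes of each direction's user data."""
    heads = (suser[:n], duser[:n])
    for label, prefixes in SIGNATURES.items():
        # bytes.startswith accepts a tuple of candidate prefixes
        if any(head.startswith(prefixes) for head in heads):
            return label
    return "unknown"
```

Protocols that share a greeting (SMTP and FTP servers both open with
"220") would need more than a prefix test, which is why they are left
out of this table.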
>>
>> Because argus provides multiple status reports for long lived
>> flows, not all argus records for a given flow will contain the
>> "first N byte" signatures that you are seeking.
>>
>
> My plan for identifying the application-level protocols was to define
> a characteristic for each of them. There is earlier work related to
> this [1]. It says the results come from observation, so I am not so
> sure about the criteria they suggest.
>
> [1]
> http://www.crc.ca/files/crc/home/research/network/system_apps/network_systems/network_security/publications/ADeMontigny_CRCTN2005003.pdf
>  (you may check from p. 28 onward for a quick look)
>>
>> Using racluster() on your 'primitive' argus data will usually provide
>> you with the "first N bytes" of user data, so that your search for
>> tokens and patterns can be reliable.
>>
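One way to picture why merging the status records helps: if only the
earliest record of a flow carries the opening bytes, then keeping that
record's buffer per flow recovers the "first N bytes". The Python sketch
below assumes exactly that. It is not how racluster is implemented, and
the tuple layout and the first_user_data() name are hypothetical.

```python
def first_user_data(records):
    """Return {flow_key: user data from the earliest record of that flow}.

    records: iterable of (flow_key, start_time, user_data) tuples
    (a hypothetical layout, not the argus record format).
    """
    earliest = {}
    for key, start, data in records:
        # Keep only the record with the smallest start time per flow.
        if key not in earliest or start < earliest[key][0]:
            earliest[key] = (start, data)
    return {key: data for key, (start, data) in earliest.items()}
```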
>>
>> Try this out for a while to see if you get anything useful:
>>
>>
>>   racluster -r /a/days/worth/of/data/of/interest/* -w /tmp/day.cache
>>
>>
>>   rauserdata -r /tmp/day.cache | less
>
> I have already installed argus 3 from source. I didn't find the
> rauserdata command. Should I compile it in a different way?
>>
>>
>> You should get an output that is arranged by 'protocol/port' and
>> you should see a set of source and destination user data buffers
>> that have the "greatest likelihood" patterns for that "proto/port"
>> pair.
>>
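As a rough picture of what a "greatest likelihood" pattern per
proto/port pair could look like, the Python sketch below groups buffers
by (proto, port) and keeps the most common byte at each offset. This is
an assumed technique for illustration only; the thread does not describe
the algorithm rauserdata actually uses, and likely_patterns() and its
input layout are hypothetical.

```python
from collections import Counter, defaultdict


def likely_patterns(records, n=32):
    """Return {(proto, port): most common byte at each offset}.

    records: iterable of (proto, port, user_data_bytes) tuples
    (a hypothetical layout, not the argus record format).
    """
    groups = defaultdict(list)
    for proto, port, data in records:
        groups[(proto, port)].append(data[:n])

    patterns = {}
    for key, bufs in groups.items():
        width = max(len(b) for b in bufs)
        pattern = bytearray()
        for i in range(width):
            # Count the byte seen at offset i across buffers long enough.
            column = Counter(b[i] for b in bufs if len(b) > i)
            pattern.append(column.most_common(1)[0][0])
        patterns[key] = bytes(pattern)
    return patterns
```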
>>
>> ra() prints the user data buffer with an ASCII encoding by default,
>> and so you should see some patterns in the buffers it outputs.
>> If you see a '.', that is generally a non-printable character.
>>
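The '.' convention described above can be mimicked with a few lines of
Python. This is only a sketch of the general idea. The ascii_view() name
is hypothetical, and the choice of bytes 32 to 126 as "printable" is an
assumption, not a statement about what ra does internally.

```python
def ascii_view(buf: bytes) -> str:
    """Show printable ASCII as-is and every other byte as '.'."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in buf)


print(ascii_view(b"SSH-2.0-OpenSSH\r\n\x00\x01"))  # SSH-2.0-OpenSSH....
```

A literal '.' in the data is rendered the same way, which is why a '.'
in the output is only "generally" a non-printable character.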
>>
>> The idea is to build up configuration files of signatures using
>> rauserdata(); the program raservices() will then take the
>> rauserdata() output as a configuration file, and label flows with
>> the tags that identify the protocol.
>
> I will try it as soon as I manage to find the rauserdata command on
> my system :)
> I am testing these commands on a dataset I found on the net [2].
>
> Do you have any other suggestions? I am also not sure whether the
> results are correct. Are there any tools that show the traffic
> details in a detailed view, something like MRTG? (I am not sure
> whether I can use an offline tcpdump record with MRTG.)
>
> [2]
> http://www.ll.mit.edu/mission/communications/ist/corpora/ideval/data/1999data.html
>>
>>
>> Give it a try, and I'd love to see/hear your comments.
>
> Thanx.
>>
>>
>> Carter
>>
>> On Feb 3, 2009, at 12:59 AM, Oguz Yarimtepe wrote:
>>
>>>
>>>
>>>
>>>     Depends on what you need. If you enable user data capture (the
>>>     -U option on the argus) it will capture up to the first 512
>>>     bytes of the user data of the flow. That may or may not give
>>>     you enough information about the flow to do what you want. Note
>>>     that on a fast link best results are going to occur using a DAG
>>>     card, as the data capture adds a fairly heavy load to the
>>>     server. To display the data with ra (for instance) you need to
>>>     use the -s option to add suser and duser to the output (as in
>>>
>>>     ra -r argus_file -n -s +suser:512 -s +duser:512
>>>
>>>     which will tack the user data on the end of the line). This of
>>>     course raises a number of sticky privacy issues that you need
>>>     to have considered and gotten approved by appropriate
>>>     management of the link you are tapping (which may or may not be
>>>     you :-)).
>>>
>>>     Peter Van Epp
>>>
>>>        Peter Van Epp
>>>
>>> What I want to do is characterize the network traffic by using
>>> some characteristics derived from flow information. My final
>>> decision about a flow record will be whether the flow belongs to a
>>> chat session, a mail transfer, an FTP connection, web browsing, ...
>>>
>>> I had discovered Bro, which has identifiers for high-level
>>> protocols. It does not support as many protocol families as Argus
>>> does, so I was planning to go on with Argus.
>>>
>>>
>>> -- 
>>> Oğuz Yarımtepe
>>> www.loopbacking.info
>>
>> Carter Bullard
>> CEO/President
>> QoSient, LLC
>> 150 E 57th Street Suite 12D
>> New York, New York  10022
>>
>>
>> +1 212 588-9133 Phone
>> +1 212 588-9134 Fax

Carter Bullard
CEO/President
QoSient, LLC
150 E 57th Street Suite 12D
New York, New York  10022

+1 212 588-9133 Phone
+1 212 588-9134 Fax





