passive host characterization

Carter Bullard carter at qosient.com
Wed May 20 12:11:33 EDT 2009


Hey Nick,
We can already report most of the data needed to drive p0f's database,
but you have to use the ARGUS_GENERATE_START_RECORDS
configuration option to get a single argus record that has all the
data in it.  We have everything that p0f needs for the SYN except:
    TCP Header reporting:
       URG pointer value

    TCP Option reporting:
       MSS value
       EOL presence
       NOP presence
       TIMESTAMP value
       Option order

When looking at the response, p0f needs the ACK sequence number, which
we have, but we don't preserve it if the connection continues, and
we don't pay any attention to the TCP Timestamp, so that data is lost
today.
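
As a rough illustration of the option data listed above (MSS value, EOL and
NOP presence, timestamp value, option order), here is a minimal Python sketch
that walks the raw TCP option bytes of a SYN and emits the option tokens and
option-related quirks from the p0f notation quoted below. The function name
p0f_options is hypothetical; this is neither argus code nor p0f's own parser,
and treating any bytes after EOL as the "P" quirk is an assumption.

```python
import struct

def p0f_options(opts: bytes):
    """Walk raw TCP option bytes; return (option string, set of quirks).

    Hypothetical helper: token letters follow the p0f list quoted in
    this thread (N, E, Wnnn, Mnnn, S, T, T0, ?n).
    """
    tokens, quirks = [], set()
    i = 0
    while i < len(opts):
        kind = opts[i]
        if kind == 0:                      # EOL: single byte, ends the list
            tokens.append("E")
            if i + 1 < len(opts):          # assumption: any trailing bytes count
                quirks.add("P")            # options past EOL
            break
        if kind == 1:                      # NOP: single byte
            tokens.append("N")
            i += 1
            continue
        if i + 1 >= len(opts):             # no room for a length byte
            quirks.add("!")                # broken options segment
            break
        length = opts[i + 1]
        if length < 2 or i + length > len(opts):
            quirks.add("!")
            break
        body = opts[i + 2:i + length]
        if kind == 2 and length == 4:      # MSS, 16-bit value
            tokens.append("M%d" % struct.unpack("!H", body)[0])
        elif kind == 3 and length == 3:    # window scale, 8-bit shift
            tokens.append("W%d" % body[0])
        elif kind == 4 and length == 2:    # selective ACK permitted
            tokens.append("S")
        elif kind == 8 and length == 10:   # timestamp: TSval, TSecr
            tsval, tsecr = struct.unpack("!II", body)
            tokens.append("T" if tsval else "T0")
            if tsecr:
                quirks.add("T")            # non-zero second timestamp
        else:
            tokens.append("?%d" % kind)    # unrecognized option number
        i += length
    return ",".join(tokens), quirks

# Example: MSS 1460, SACK OK, timestamp, NOP, window scale 7
syn_opts = bytes.fromhex("020405b40402080a0012d6870000000001030307")
print(p0f_options(syn_opts))
```

The order of the tokens in the returned string is the option order as seen on
the wire, which is the part that is lost if only the individual values are
retained.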

I can add a DSR that is specific to OS fingerprints, and start collecting
the missing/non-retained information.  That will take just a little bit
of time.

I'd like to support a strategy that is currently being maintained.  My goal
is to go after OS virtualization characterization, so at least a quasi-modern
database would be ideal.

Carter



On May 20, 2009, at 11:26 AM, Nick Diel wrote:

> Carter,
>
> I think taking advantage of p0f's (http://lcamtuf.coredump.cx/p0f.shtml)
> passive OS fingerprinting engine would be a good baby step towards
> your goals.  While it is not really behavioral analysis, it is
> something very easy to use and would give you a good indication of
> the host OS.
>
> Even with the p0f database being old, I have found it to be quite
> accurate on my data sets, especially Windows vs. Unix.  Things get
> interesting when looking at unknown signatures, hosts that have
> different signatures on different ports, or hosts whose signature
> changes.
>
> I believe argus already captures some of the components needed by  
> p0f.  For OS detection based on syn packets p0f uses:
> # wwww     - window size (can be * or %nnn or Sxx or Txx)
> #            "Snn" (multiple of MSS) and "Tnn" (multiple of MTU) are allowed.
> # ttt      - initial TTL
> # D        - don't fragment bit (0 - not set, 1 - set)
> # ss       - overall SYN packet size (* has a special meaning)
> # OOO      - option value and order specification (see below)
> # QQ       - quirks list (see below)
>
> Options:
> # N        - NOP option
> # E        - EOL option
> # Wnnn     - window scaling option, value nnn (or * or %nnn)
> # Mnnn     - maximum segment size option, value nnn (or * or %nnn)
> # S        - selective ACK OK
> # T        - timestamp
> # T0       - timestamp with zero value
> # ?n       - unrecognized option number n.
>
> Quirks:
> # P        - options past EOL,
> # Z        - zero IP ID,
> # I        - IP options specified,
> # U        - urg pointer non-zero,
> # X        - unused (x2) field non-zero,
> # A        - ACK number non-zero,
> # T        - non-zero second timestamp,
> # F        - unusual flags (PUSH, URG, etc),
> # D        - data payload,
> # !        - broken options segment.
>
> Just my 2 cents,
> Nick
>
> On Wed, May 20, 2009 at 7:52 AM, Carter Bullard <carter at qosient.com> wrote:
> Gentle people,
> There are about 100 topics for discussion regarding flow data, and I
> would like to get some ideas/comments/reactions/opinions on better
> host characterization.  Now that we have database support and
> flow data labeling, it would be nice if we could add things like,
> "I think this is a Mac", and then have something check that it was
> a Mac the last time we looked.  Anomaly detection at its finest ;o)
>
> Most OS fingerprinting today is done from packet header peculiarities
> and responses to specific challenges.  This type of characterization
> strategy has a few drawbacks: 1) it's a pattern matching strategy, which
> has its limitations, and 2) many times it involves active methods, where
> you have to challenge the machine to get it to tell you what it is, which
> has another set of limitations, especially when you deal with historical
> data and you can't go back in time to probe the machine to tell you
> what it was.  There is nothing wrong with these strategies, but there
> should be other things we can do.
>
> If you look at a lot of argus data, you probably know that most machines
> give away what they are, or rather what they do and how they do it, by
> accessing specific machines (license servers, update servers), requesting
> specific DNS lookups, broadcasting availability of resources, or using
> specific protocol types, like routing protocols, etc.
>
> Game machines are easy to see; routers, Macs, Windows machines,
> etc. all seem to do basically different things when they come up.
>
> I have a TiVo in my office, and I know it's a TiVo because it's always
> wanting to connect to the mothership to participate in the various TiVo
> services.  Argus data, with ralabel() adding the DNS domain name
> for the destination address to the flow record, tells me that the src IP
> address (which is DHCP'd) is the TiVo.  (I don't really need the DNS
> name, but it helps to explain the example.)
>
> Now that Nero LiquidTV allows you to turn a PC into a TiVo,
> it would be interesting to know if I could discriminate that behavior
> from a real TiVo.
>
> When you consider OS virtualization and the need to understand what
> is going on in your network, this type of problem can be generalized
> into an interesting problem.
>
> I'm thinking that developing a compendium of host behaviors, especially
> boot behaviors, would have some benefit, and I'm wondering if there is
> interest in talking about it, and possibly doing it.
>
> Argus currently captures a few of the packet peculiarities that are used in
> contemporary OS identification, but it is not trying to specifically do this.
> I'm interested in understanding what we can do to add this feature to argus,
> and I'm very interested in going after the behavioral aspects of network
> traffic, to do a better job.
>
> What do you think?
>
> Carter
>
>

Carter Bullard
CEO/President
QoSient, LLC
150 E 57th Street Suite 12D
New York, New York  10022

+1 212 588-9133 Phone
+1 212 588-9134 Fax


