Covert Channel Detection

Fri Jul 21 08:38:21 EDT 2000

Gentle people,
   Russell has a great point: for the most part,
it's simple flow statistics that can tell you that
something is amiss on a given port.  And to address
one of Peter's concerns, even with encryption, in
many cases, the flow statistics will give the game
away.

   Here is an example from the distant past, and
I'm pulling the numbers from memory so at least
take them for order of magnitude correctness ;o)
At CMU we had statistics of over 10,000,000 SMTP
conversations that showed that the average SMTP
transaction involved 13 packets from the source and
11 packets from the destination, with a pretty tight
Std Dev.  The balance of user data was about 236
bytes from the source and about 69 bytes from the
destination, on average (mail moves from source to
destination).  This was the expected profile for a single
piece of SMTP mail, and the whole thing lasted about
0.356 seconds, on average.

   When someone telnets to the SMTP port, say to
query the SMTP server for lists of mail addresses,
which is not a great thing to do, these simple stats
are completely out of whack.  And the key to detection
involves a large number of parameters, so there is good
sensitivity for a detector, without many false
positives: the duration of the connection, the
total number of packets, the ratio of source/destination
packet counts, and the balance of data shifting from the
source being dominant to the destination being dominant.
What also works well, but is not currently supported in
Argus, is the burst behavior of the packets in the flow,
which is also completely off.
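
   A minimal sketch of that kind of check, in Python, under
loud assumptions: the means are the from-memory SMTP figures
quoted above, the standard deviations are invented placeholders
(the message only says they were tight), and Flow, features and
deviations are hypothetical names, not anything in Argus.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """One bidirectional flow summary (hypothetical record layout)."""
    duration: float   # seconds
    src_pkts: int     # packets sent by the initiator
    dst_pkts: int     # packets sent by the responder
    src_bytes: int    # user data bytes from the initiator
    dst_bytes: int    # user data bytes from the responder


def features(flow):
    """Derive the four parameters named in the text from a flow."""
    total_bytes = flow.src_bytes + flow.dst_bytes
    return {
        "duration": flow.duration,
        "total_pkts": flow.src_pkts + flow.dst_pkts,
        "pkt_ratio": flow.src_pkts / max(flow.dst_pkts, 1),
        "src_byte_share": flow.src_bytes / total_bytes if total_bytes else 0.0,
    }


# (mean, std) per feature. Means are built from the averages in the message
# (ratios of averages, so only approximate); every std is an ASSUMED placeholder.
SMTP_BASELINE = {
    "duration": (0.356, 0.2),
    "total_pkts": (13 + 11, 4.0),
    "pkt_ratio": (13 / 11, 0.2),
    "src_byte_share": (236 / (236 + 69), 0.1),
}


def deviations(flow, baseline, threshold=3.0):
    """Return {feature: z-score} for features more than `threshold` stds off."""
    flagged = {}
    for name, value in features(flow).items():
        mean, std = baseline[name]
        z = (value - mean) / std
        if abs(z) > threshold:
            flagged[name] = z
    return flagged


# A flow matching the averages, and a made-up interactive telnet session.
typical = Flow(duration=0.356, src_pkts=13, dst_pkts=11, src_bytes=236, dst_bytes=69)
telnet = Flow(duration=45.0, src_pkts=20, dst_pkts=30, src_bytes=60, dst_bytes=2400)
```

   Requiring several features to be off at once, rather than
any single one, is one way to keep the false positives down.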

   Now in this example, with all the encryption in the
world, we still know that it's not SMTP traffic!  We
won't know what it is, but absolutely, we'll know it's
not SMTP!

   Russell hit the nail on the head.  The trick is having
the protocol baseline data, somewhere, so we can have
something to compare to.  This I believe is something
that we should strive for in Argus-2.x.

   I want to add user data tags so that we can try to
identify protocols when the ports are arbitrary, and also
so we can deal with the statistical variability in
normal application behavior.  But, if we do a great job
on transaction behavior fingerprinting/baselining,
we may not need the tags.  I'll have to think about
how much data we'll need to figure out the answer to
that problem.

   The interesting part of the problem is, what do we
want to store for the application/service baseline
database?  Any ideas?

Carter

-----Original Message-----
From: owner-argus at lists.andrew.cmu.edu
[mailto:owner-argus at lists.andrew.cmu.edu]On Behalf Of Russell Fulton
Sent: Friday, July 21, 2000 12:37 AM
To: Argus (E-mail)
Subject: Re: Covert Channel Detection

On Thu, 20 Jul 2000 17:32:44 -0400 Carter Bullard <carter at qosient.com> 
wrote:

>    I think that Argus can do the best job at
> this by doing some pattern recognition in the
> user traffic.  I think for most purposes,
> being able to validate the protocol above the
> transport layer would be a good start.
> 
>    Is anyone interested in this type of work?

I am.  We have already seen some interesting covert channels in the 
DDoS tools (e.g. using ECR packets to carry commands) and we saw a 
fascinating demonstration at the FIRST meeting in Chicago of using SSL 
to tunnel all sorts of traffic through a compromised web server and 
thus circumventing a firewall.  

[Aside : this was the most impressive live demo I have ever seen -- it 
took several hours and they had 5 machines and a router hooked up at 
the front of the meeting room. There was only one minor glitch and we 
paused 5 minutes early for morning tea while they figured 
out what was wrong. It turned out to be finger trouble...]

There are two approaches I see to this problem: 

Provide an argus client that calculates various standard stats about 
different protocols that we might use to characterise them. This is 
what raservices does.  The logical extension is to have a client that 
loads a set of parameters and watches for flows that lie outside 
the distributions.
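
As a rough illustration of the first approach, here is a Python sketch 
that accumulates a running mean and standard deviation per service and 
per feature, then tests a value against them.  It assumes flow records 
arrive as (service, {feature: value}) pairs; RunningStats, 
build_baseline and is_outlier are hypothetical names and this is not 
how raservices is implemented.

```python
import math
from collections import defaultdict


class RunningStats:
    """Running mean / sample std via Welford's online algorithm."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Sample standard deviation; 0.0 until there are two samples.
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def build_baseline(records):
    """Fold (service, {feature: value}) records into per-service stats."""
    table = defaultdict(lambda: defaultdict(RunningStats))
    for service, feats in records:
        for name, value in feats.items():
            table[service][name].add(value)
    return table


def is_outlier(stats, value, k=3.0):
    """True if `value` lies more than k standard deviations from the mean."""
    return stats.std > 0 and abs(value - stats.mean) > k * stats.std


# Made-up sample records, purely for illustration.
baseline = build_baseline([
    ("smtp", {"total_pkts": 22}),
    ("smtp", {"total_pkts": 24}),
    ("smtp", {"total_pkts": 26}),
])
smtp_pkts = baseline["smtp"]["total_pkts"]
```

The online form means the whole flow archive never has to be held in 
memory, which matters if the baseline is built from millions of flows.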

The other approach, which I prefer because I like tinkering with 
things, is to provide good access to the raw argus data so that we can 
easily extract data that we can feed into a stats package to do more 
sophisticated analysis.  The experimentalist's approach ;-)

I suspect that we will need different statistics to detect different 
types of covert channels, e.g. simple mean and std on flow sizes might 
be enough in some cases; in others we may have to look at size or timing
distributions of individual packets.  (Netramet will do that for you ;-)

Unfortunately I don't have enough time to do everything that I am 
supposed to do now without starting a research project on things like 
this.

Cheers, Russell