normalized appbyte ratio for producer/consumer relationship
Carter Bullard
carter at qosient.com
Wed May 29 10:12:15 EDT 2013
Hey John,
Just some dialog to talk about how to use ABR (or should it be PCR, for producer /
consumer ratio?). I'm going to go through a random recent day to see if there are
trends, or gotchas, in using the number. So far, we're saying that if a network
behavior is generally near the 1.0 or -1.0 value, then a transformation to the
opposite side of the scale should be easily detectable and immediately relevant
to the cyber security problem set. So if a host is a consumer, for it to all of a sudden
become a producer should be a problem. So let's see what's up.
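A minimal sketch of that check in Python, for illustration only (it is not part of argus-clients). The ratio is the normalized equation quoted at the bottom of this thread, with -0.0 returned when no appbytes were seen. The helper names (side, flipped, no_appbytes) and the 0.75 cutoff are assumptions, the cutoff being borrowed from the May 7 note below:

```python
import math

def abr(sappbytes, dappbytes):
    """Normalized appbyte ratio: (s - d) / (s + d), in [-1.0, 1.0]."""
    total = sappbytes + dappbytes
    if total == 0:
        return -0.0  # thread convention: -0.0 marks "no appbytes seen"
    return (sappbytes - dappbytes) / total

def no_appbytes(value):
    """True only for -0.0; since 0.0 == -0.0, the sign bit has to be checked."""
    return value == 0.0 and math.copysign(1.0, value) < 0.0

def side(value, cutoff=0.75):
    """Hypothetical helper: +1 near the producer end, -1 near the consumer end, else 0."""
    if value >= cutoff:
        return 1
    if value <= -cutoff:
        return -1
    return 0

def flipped(before, after, cutoff=0.75):
    """True when a host moved from one extreme of the scale to the other."""
    return side(before, cutoff) * side(after, cutoff) == -1
```

With that, flipped(-0.99, 0.95) is True, while a host drifting from -0.99 to -0.80 is not reported.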
Here is a daily report for one of my big networks looking at only the ABR metric per address.
As you know, I have a rasqlinsert() process that is maintaining a ( mac / ip address ) inventory
table that is updated in near realtime (< 5 sec off realtime), that captures the abr variable.
Here is a printout for Monday, an off day (US holiday). Here are all the local IP addresses
from the perspective of the interior interfaces of my border routers:
thoth:~ carter$ rasql -M time 1d -t -2d+1d -r mysql://root@localhost/ratop/etherHost_%Y_%m_%d \
-w - - net 192.168.0.0/16 and srcid 192.168.0.1 and not ether src ff:ff:ff:ff:ff:ff | \
rasort -m abr -s ltime dur smac saddr spkts dpkts abr -p3
LastTime                       Dur  SrcMac             SrcAddr        SrcPkts  DstPkts  ABRatio
2013/05/27.23:56:55.839  85848.281  00:21:5a:39:d7:a2  192.168.0.127     1458        0    1.000  P
2013/05/27.17:57:56.563  42787.016  74:44:01:8f:82:fc  192.168.0.47         2        0    1.000  P
2013/05/27.04:20:50.044      0.000  00:12:3f:bc:58:a4  192.168.0.70         1        0    1.000  P
2013/05/27.18:23:04.308  10689.529  00:15:17:78:08:8f  192.168.0.43       768        2    0.994  P
2013/05/27.23:50:49.394  85627.617  90:84:0d:ef:70:0b  192.168.0.34       179       24    0.954  P
2013/05/27.23:50:49.418  85627.672  90:84:0d:d2:2a:8e  192.168.0.2        160       25    0.939  P
2013/05/27.23:57:22.091  86073.227  00:0b:db:5c:e5:7c  192.168.0.164      624      254    0.426  B
2013/05/27.23:59:57.846  86389.781  00:11:d9:31:c1:11  192.168.0.33     15457    11419    0.357  B
2013/05/27.23:59:21.240  86337.547  80:71:1f:3c:c3:88  192.168.0.1       3563     3602   -0.013  B
2013/05/27.23:59:58.436  86391.156  c8:2a:14:58:7a:55  192.168.0.66     19216    15089   -0.295  B
2013/05/27.23:59:48.667  86356.648  00:23:32:2f:ac:9c  192.168.0.68    467274   640295   -0.986  C
2013/05/27.23:58:39.239  86287.312  1c:ab:a7:8b:ae:ce  192.168.0.41     49007    87315   -0.999  C
I added the P, B and C (producer, balanced, consumer) labels by hand for this, but ralabel() can easily provide this label.
So, these are basically the internal IPs that my border router sees, mapped to real Ethernet addresses.
Every flow is aggregated into a single entry per address, so you're getting everything
that this observation domain saw for these IP addresses.
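For illustration, here is a rough Python stand-in for the hand labeling of P, B and C above. It is not how ralabel() works. The label() function and the 0.75 cutoff are assumptions, the cutoff again taken from the May 7 note below, and the rows are a few address / ABRatio pairs copied from the printout:

```python
def label(abr_value, cutoff=0.75):
    """Hypothetical labeler: P = producer, C = consumer, B = balanced."""
    if abr_value > cutoff:
        return "P"
    if abr_value < -cutoff:
        return "C"
    return "B"

# (address, ABRatio) pairs copied from the Monday printout
rows = [
    ("192.168.0.127", 1.000),
    ("192.168.0.2", 0.939),
    ("192.168.0.164", 0.426),
    ("192.168.0.1", -0.013),
    ("192.168.0.68", -0.986),
    ("192.168.0.41", -0.999),
]

labels = {addr: label(value) for addr, value in rows}
for addr, value in rows:
    print(f"{addr:15s} {value:7.3f} {labels[addr]}")
```

With this cutoff the computed labels agree with the hand labels for these rows.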
So what have we got? We've got a few pure producers, a few mixed, and a few pure
consumers. Let's talk about them.
192.168.0.127 is one of my printers that is advertising away, saying I'm here…. We don't see
any actual traffic going to and from the printer to support the actual printing function, so it's
very nice that we see the printer as a pure 1.000 producer. If anyone from the outside
used this printer, this number would go down, or if this printer started sending data to
its manufacturer, or started to transmit the pages it printed to a 3rd party, regardless
of protocol, this number would deviate from 1.0 and we'd have a real anomaly. That's easy.
The traffic seen for 192.168.0.47 and 192.168.0.70 is just ARP broadcast requests, without
any responses, as we're on the wrong host to see any answers. These packets are important,
as they tell us that these hosts are there, and give some sense of their activity / health.
But for the purposes of this study, they influence the numbers a bit.
So, let's run this again with a filter to toss the ARP packets, to see what we get. Let's add the filter
" and ip ", dropping the broadcast ethernet part of the filter.
thoth:~ carter$ rasql -M time 1d -t -2d+1d -r mysql://root@localhost/ratop/etherHost_%Y_%m_%d \
-w - - net 192.168.0.0/16 and srcid 192.168.0.1 and ip | \
rasort -m abr -s ltime dur smac saddr spkts dpkts abr -p3
LastTime                       Dur  SrcMac             SrcAddr        SrcPkts  DstPkts  ABRatio
2013/05/27.23:56:55.839  85848.281  00:21:5a:39:d7:a2  192.168.0.127     1458        0    1.000  P
2013/05/27.18:23:04.308  10689.529  00:15:17:78:08:8f  192.168.0.43       768        2    0.994  P
2013/05/27.23:50:49.394  85627.617  90:84:0d:ef:70:0b  192.168.0.34       179       24    0.954  P
2013/05/27.23:50:49.418  85627.672  90:84:0d:d2:2a:8e  192.168.0.2        160       25    0.939  P
2013/05/27.23:57:22.091  86073.227  00:0b:db:5c:e5:7c  192.168.0.164      624      254    0.426  B
2013/05/27.23:59:57.846  86389.781  00:11:d9:31:c1:11  192.168.0.33     15457    11419    0.357  B
2013/05/27.23:59:58.436  86391.156  c8:2a:14:58:7a:55  192.168.0.66     19216    15089   -0.295  B
2013/05/27.23:59:48.667  86356.648  00:23:32:2f:ac:9c  192.168.0.68    467274   640295   -0.986  C
2013/05/27.23:59:57.846  85974.359  ff:ff:ff:ff:ff:ff  192.168.0.255        0     5214   -1.000  C
Looking good !!! So, now we've got more bidirectional flows, so that's a good thing.
ARPs are important, but the producer / consumer relationship for control plane traffic
will tend to be balanced. ARP, DNS, DHCP, OSPF, BGP, SIP come up pretty balanced
most of the time…..
OK, going through the list again, looking at the other extreme. 192.168.0.255 is the
local subnet broadcast address, and so it really needs to be a pure consumer of data,
as it's illegal for that address to transmit data on an IETF standard internet. Now
it's easy for a system to use that address on your network, so a deviation
from -1.0 for this address is a good thing to look out for. It would have a different mac
address, but the abr is still a good indicator.
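A sketch of how the two pinned cases could be watched, assuming Python and a per-day mapping of address to abr. The printer is expected at 1.000 and the broadcast address at -1.000, as in the printout above. The EXPECTED table, the deviations() name and the tolerance are all assumptions for illustration:

```python
# Hosts whose abr is expected to stay pinned at one end of the scale.
EXPECTED = {
    "192.168.0.127": 1.0,   # printer: pure producer
    "192.168.0.255": -1.0,  # subnet broadcast address: pure consumer
}

def deviations(observed, expected=EXPECTED, tolerance=0.001):
    """Return {addr: (expected, observed)} for pinned hosts that moved.

    `observed` maps address -> abr for one day; the tolerance is an
    assumed allowance for the 3-digit precision of the printout.
    """
    moved = {}
    for addr, want in expected.items():
        got = observed.get(addr)
        if got is not None and abs(got - want) > tolerance:
            moved[addr] = (want, got)
    return moved
```

A day that matches the printout returns an empty dict. A made-up day where the printer reads 0.62 returns {"192.168.0.127": (1.0, 0.62)}.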
Now, 192.168.0.34 is a bit of an oddball, as it's a control system, so it's really only
trying to sync its clock, using NTP, but it's also advertising availability through mDNS,
which really makes it look like a big producer. So it's a producer into the multicast
address space. We should get rid of multicast for this, I suspect, which would
convert this address into a true balanced producer / consumer, so that we would
know that it's support infrastructure.
Addresses 192.168.0.164 and 192.168.0.66 are also infrastructure systems, so
being closer to 0 is the right place for these guys. While the notion that balanced
assets are infrastructure or control is a bit of a stretch, if there is a trend, that would
be one I would analyze.
OK, that leaves us with 192.168.0.68, which is the most active desktop.
While the packet ratio looks to be balanced, -0.156, the application bytes are
way in the direction of consumer, and that is where we want it to be, which to me
validates the metric. We want this host to consume the Internet, not be
consumed by it.
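The -0.156 packet ratio can be reproduced from the SrcPkts / DstPkts columns in the printout by applying the same normalized form to packet counts. This is a small Python check, with normalized_ratio() as a made-up helper name. The appbyte counts for this host are not in the printout, so only the packet side is recomputed:

```python
def normalized_ratio(src, dst):
    """(src - dst) / (src + dst); 0.0 when both counts are zero."""
    total = src + dst
    return (src - dst) / total if total else 0.0

# SrcPkts and DstPkts for 192.168.0.68, copied from the printout
spkts, dpkts = 467274, 640295

pkt_ratio = normalized_ratio(spkts, dpkts)
print(f"packet ratio  = {pkt_ratio:.3f}")  # -0.156
print("reported abr  = -0.986")            # ABRatio column, same row
```

So the packets look close to balanced, while the appbyte ratio for the same host sits near the consumer end.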
So while QoSient world headquarters may seem small, at least to the outside world,
on Monday, it was doing what it was supposed to do !!! Later I'll send either today's
or tomorrow's list to see how they compare.
Carter
On May 7, 2013, at 11:31 PM, Carter Bullard <carter at qosient.com> wrote:
> Hey John,
> Here is a run to try out the abr metric for basic producer/consumer anomaly detection.
> Picked a bunch of hosts that are involved in a service that runs on a specific port.
> There is a data transfer function, and there is a command and control function.
> The abr metric picks out the producers and consumers for the data transfer, and it points to those that are involved in command and control. Here you go.
>
> thoth:backups root# racluster -M rmon -m saddr -r monthly.data -w - | \
> rasort -m abr -s stime dur:16 proto saddr spkts:12 dpkts:12 abr
> StartTime Dur Proto SrcAddr SrcPkts DstPkts ABRatio
> 2013/02/05.14:03:48.304265 2022912.500000 tcp 192.168.1.31 118813516 85735688 0.999937
> 2013/02/06.16:19:37.099642 1895160.500000 tcp 192.168.2.75 3621 1899 0.997145
> 2013/02/07.12:06:09.973606 27554.181641 tcp 192.168.2.34 732 650 0.915472
> 2013/02/27.11:40:47.093087 4.957941 tcp 192.168.4.35 13 12 0.551477
> 2013/02/18.14:17:01.287529 758586.750000 tcp 192.168.7.166 23 18 0.422487
> 2013/02/06.10:10:26.473381 30.864573 tcp 192.168.12.149 7 5 0.295455
> 2013/02/13.16:45:55.793151 11.732343 tcp 192.168.7.155 15 13 0.250169
> 2013/02/06.09:31:55.244962 1228359.500000 tcp 192.168.2.54 32 35 0.241102
> 2013/02/08.10:09:14.794703 1727145.250000 tcp 192.168.0.125 21 18 0.213155
> 2013/02/06.13:17:45.550931 1819270.875000 tcp 192.168.1.138 1227 1231 0.135222
> 2013/02/15.11:28:36.104691 50.191151 tcp 192.168.2.71 75 59 0.100396
> 2013/02/07.14:00:35.770555 83.157898 tcp 192.168.0.70 894 890 0.057422
> 2013/02/12.14:19:09.720183 1.038289 tcp 192.168.1.125 7 8 0.029588
> 2013/02/21.03:07:04.628043 32.868526 tcp 192.168.0.72 7 5 0.023810
> 2013/02/11.08:39:44.024865 1478973.500000 tcp 192.168.2.45 268 157 -0.015058
> 2013/02/19.14:25:03.376258 28.092548 tcp 192.168.7.153 7 5 -0.043860
> 2013/02/06.14:05:03.101059 65.313805 tcp 192.168.2.58 6 7 -0.055469
> 2013/02/06.13:17:45.550931 1772498.375000 tcp 192.168.2.57 13 16 -0.164201
> 2013/02/06.12:20:05.543201 1913705.250000 tcp 192.168.2.138 256 317 -0.173220
> 2013/02/20.12:43:35.083127 424459.562500 tcp 192.168.2.102 1147 1160 -0.343431
> 2013/02/20.13:46:25.772704 940.071899 tcp 192.168.12.2 16 17 -0.392946
> 2013/02/07.12:55:49.369160 1822594.375000 tcp 192.168.1.110 296 400 -0.456629
> 2013/02/06.12:51:40.056876 1901223.250000 tcp 192.168.1.123 165 222 -0.601214
> 2013/02/06.09:37:40.607721 1900794.750000 tcp 192.168.1.111 681 995 -0.612489
> 2013/02/05.14:44:23.517669 1818851.000000 tcp 192.168.1.117 79 102 -0.627868
> 2013/02/05.14:05:30.070586 1889467.750000 tcp 192.168.1.127 143 175 -0.669750
> 2013/02/06.09:31:55.244962 697157.125000 tcp 192.168.1.115 51 56 -0.683094
> 2013/02/13.16:07:23.915251 1291093.750000 tcp 192.168.1.130 61 75 -0.694824
> 2013/02/07.13:33:41.019318 1557935.000000 tcp 192.168.1.113 100 133 -0.706807
> 2013/02/06.12:06:43.001669 1721456.875000 tcp 192.168.1.112 618 1010 -0.740505
> 2013/02/23.17:46:07.768913 0.588042 tcp 192.168.9.84 12 15 -0.758408
> 2013/02/06.11:31:22.894660 1906436.625000 tcp 192.168.1.122 76 101 -0.762083
> 2013/02/05.15:07:18.072928 1977283.125000 tcp 192.168.1.114 90 120 -0.781786
> 2013/02/08.09:07:55.007936 1140006.125000 tcp 192.168.1.118 112 136 -0.787711
> 2013/02/05.14:04:02.548134 2022898.250000 tcp 192.168.2.47 327011 243367 -0.795821
> 2013/02/06.14:30:10.891759 1905899.875000 tcp 192.168.1.116 343 588 -0.934336
> 2013/02/07.12:06:09.973606 1807785.625000 tcp 192.168.1.121 2579 4398 -0.990442
> 2013/02/05.14:03:48.304265 1465134.125000 tcp 192.168.2.29 85407678 118569168 -0.999999
>
>
> So we take some records, in this case a complete month's worth of traffic involved in a specific application,
> involving a specific subnet. We want to know what hosts are producers and consumers for this app.
> We need to get the bi-directional flow data into a single object statistic, so we'll aggregate the data for RMON
> data processing (one object, in and out stats), and merge for the " saddr ", then just rasort() on the abr field.
>
> We get a list from Producers to Consumers, with the guys in the middle, where the abr approaches 0, having
> balanced communications. We see the complete spectrum: data push agents (producers), where ( ABRatio > 0.75 ),
> on top; the pure data sinks, where ( ABRatio < -0.75 ), on the bottom; and maybe command
> and control in the ( -0.5 < ABRatio < 0.5 ) range ? Probably need to add a threshold for the amount of
> data sent and received, to weed out the announcers in the command and control network...
>
> I'd go for that set of rules for this specific application, in this observation domain…..
>
> Carter
>
>
>
> On May 6, 2013, at 6:34 PM, John Gerth <gerth at graphics.stanford.edu> wrote:
>
>> Yes, well an abr of 0.0, even without using -0.0 can have multiple meanings
>> since you will get 0 as long as s=d even if they are large. The use of -0.0
>> is perhaps a bit cute, and, as you point out, since IEEE 754 requires 0.0 == -0.0
>> in all relational tests, one has to use signbit() to disambiguate in C.
>>
>> Even so, I think your example shows that abr has good potential as a metric.
>>
>> John Gerth gerth at graphics.stanford.edu Gates 378
>>
>> On 5/6/2013 2:09 PM, Carter Bullard wrote:
>>> Hey John,
>>> OK, so there is a problem, in that IEEE floating point has ( -0.0 == 0.0 )
>>> by definition. So, I fixed it in my compiler, but others are going to have
>>> issues when needing to discriminate between 0.0 and -0.0.
>>>
>>> So, with the new argus-clients, you can filter for any value for abr, using
>>> any of the ra* programs. Still have to work on graphing abr, but I'll get
>>> to that later tonight.
>>>
>>> Here is the abr behavior for every DNS request here at QoSient World HQ
>>> for 2013, that weren't ServFail errors:
>>>
>>> thoth:tmp carter$ rahisto -H abr 10:-1.0-1.0 -r argus*domain* -s mean stddev - src pkts 1 and dst pkts 1 and not abr 0.0
>>> N = 1009841 mean = -0.739084 stddev = 0.102829 max = -0.162791 min = -0.909605
>>> median = -0.749129 95% = -0.653846
>>> mode = -0.782609
>>> Class Interval Freq Rel.Freq Cum.Freq Mean StdDev
>>> 1 -1.000000e+00 225379 22.3183% 22.3183% -0.815238 0.000650
>>> 2 -8.000000e-01 738148 73.0955% 95.4137% -0.740887 0.043837
>>> 3 -6.000000e-01 10553 1.0450% 96.4588% -0.534672 0.048717
>>> 4 -4.000000e-01 35511 3.5165% 99.9752% -0.283067 0.021374
>>> 5 -2.000000e-01 250 0.0248% 100.0000% -0.162791 0.000051
>>> 6 0.000000e+00 0 0.0000% 100.0000%
>>> 7 2.000000e-01 0 0.0000% 100.0000%
>>> 8 4.000000e-01 0 0.0000% 100.0000%
>>> 9 6.000000e-01 0 0.0000% 100.0000%
>>> 10 8.000000e-01 0 0.0000% 100.0000%
>>>
>>>
>>> So, anything positive would be a behavioral anomaly, from the perspective of this host.
>>>
>>> ra -r file.from.the.host - udp and port domain and src pkts 1 and dst pkts 1 and abr gt 0.0
>>>
>>> and this would pick out candidates for DNS server availability errors:
>>>
>>> ra -r file.from.the.host - udp and port domain and src pkts 1 and dst pkts 1 and abr eq 0.0
>>>
>>> Of course, you can do an analysis of every service, and get a rather interesting set of
>>> " what is normal " behaviors, using this simple type of metric. For those services where
>>> the abr is always positive or always negative, seeing a shift to the other side, can
>>> indicate events that should be of interest.
>>>
>>> Hope this is helpful,
>>>
>>> Carter
>>>
>>>
>>> On May 6, 2013, at 2:41 PM, Carter Bullard <carter at qosient.com> wrote:
>>>
>>>> Hey John,
>>>> If no appbytes, currently we return -0.0, but the library knows if there are
>>>> appbytes or not, so we can return nada, when printing out the values.
>>>> Right now, when using xml format, you won't get a value.
>>>>
>>>> Having problems getting my compiler to tell the difference between 0.0 and -0.0,
>>>> but should hopefully have this working by this afternoon.
>>>>
>>>> Carter
>>>>
>>>>
>>>> On May 6, 2013, at 2:37 PM, John Gerth <gerth at graphics.stanford.edu> wrote:
>>>>
>>>>> Nice example. I'm looking forward to using this.
>>>>>
>>>>> As your example shows, this metric is available for any existing argus
>>>>> files that were created containing appbyte values. I'm assuming that if
>>>>> the sensor wasn't configured to capture those, 'abr' is not available.
>>>>>
>>>>>
>>>>> --
>>>>> John Gerth gerth at graphics.stanford.edu Gates 378 (650) 725-3273 fax 725-6949
>>>>>
>>>>> On 5/6/2013 11:19 AM, Carter Bullard wrote:
>>>>>> Hey John,
>>>>>> OK, so I've implemented " abr " as a new metric, using our normalized equation:
>>>>>>
>>>>>> abr = (sappbytes - dappbytes)/(sappbytes + dappbytes)
>>>>>>
>>>>>> This generates values between -1.0 and +1.0. +1.0 means that all the app bytes
>>>>>> were from the source, indicating that the source is a pure PRODUCER, and the
>>>>>> destination is a pure CONSUMER. You see this in FTP PUT file transfers,
>>>>>> as an example. The sign bit reverses this relationship.
>>>>>>
>>>>>> -0.0 denotes the special case, when there are no appbytes seen.
>>>>>>
>>>>>> In the new argus-clients that I'll put up later today, you can print this out using:
>>>>>>
>>>>>> ra -r argus.data -s +abr
>>>>>>
>>>>>> You can also do operations using this metric, such as filter and generate histograms.
>>>>>> Here is a run that I did to show how this may be used in an anomaly detection
>>>>>> application. Here is the simple frequency distribution for all the internal DNS
>>>>>> requests made to my local DNS server from a specific client, for all of 2013:
>>>>>>
>>>>>> thoth:06 carter$ pwd
>>>>>> /Volumes/Data/Archive/QoSient/192.168.0.68/2013
>>>>>> thoth:tmp carter$ rahisto -H abr 10:-1.0-1.0 -R . -s mean stddev - udp port domain and src pkts 1 and dst pkts 1
>>>>>> N = 1027764 mean = -0.726195 stddev = 0.140532 max = 0.000000 min = -0.909605
>>>>>> median = -0.749129 95% = -0.292517
>>>>>> mode = -0.782609
>>>>>> Class Interval Freq Rel.Freq Cum.Freq Mean StdDev
>>>>>> 1 -1.000000e+00 225379 21.9291% 21.9291% -0.815238 0.000650
>>>>>> 2 -8.000000e-01 738148 71.8208% 93.7498% -0.740887 0.043837
>>>>>> 3 -6.000000e-01 10553 1.0268% 94.7766% -0.534672 0.048717
>>>>>> 4 -4.000000e-01 35511 3.4552% 98.2318% -0.283067 0.021374
>>>>>> 5 -2.000000e-01 250 0.0243% 98.2561% -0.162791 0.000051
>>>>>> 6 0.000000e+00 17923 1.7439% 100.0000% 0.000000 0.000000
>>>>>> 7 2.000000e-01 0 0.0000% 100.0000%
>>>>>> 8 4.000000e-01 0 0.0000% 100.0000%
>>>>>> 9 6.000000e-01 0 0.0000% 100.0000%
>>>>>> 10 8.000000e-01 0 0.0000% 100.0000%
>>>>>>
>>>>>>
>>>>>> OK, should be very clear, that my host is a net CONSUMER of DNS data, not a net PRODUCER
>>>>>> because the " abr <= 0 ". The corollary holds true, the local DNS service is a net PRODUCER of
>>>>>> data, and not a net CONSUMER of data, from the perspective of this particular end system.
>>>>>> So testing filters like this:
>>>>>> ra -r daily.file - abr gt 0 and port domain and src pkts 1 and dst pkts 1
>>>>>>
>>>>>> Should reveal flows that deserve a closer look.
>>>>>>
>>>>>> OK, there were a lot of flows where the ( abr == 0 ), which was surprising.
>>>>>> When DNS experiences a ServFail, the response is the same as the request, just with an error bit
>>>>>> set in the DNS header. QoSient had a big issue in Jan, 2013, when 17923 DNS ServFail failures
>>>>>> occurred, so that is where the ( abr == 0 ) flows occurred. Important to know this when evaluating
>>>>>> DNS as a channel for CONSUMER to PRODUCER conversion.
>>>>>>
>>>>>> But for DNS health and operability, looking for flows where the ( sappbytes == dappbytes ) is
>>>>>> also a pretty interesting thing to look for.
>>>>>>
>>>>>> Hope this is helpful,
>>>>>>
>>>>>> Carter
>>>>>>
>>>>>> Carter Bullard
>>>>>> CEO/President
>>>>>> QoSient, LLC
>>>>>> 150 E. 57th Street Suite 12D
>>>>>> New York, New York 10022
>>>>>>
>>>>>> +1 212 588-9133 Phone
>>>>>> +1 212 588-9134 Fax
>>>>>>
>>>>>
>>>>
>>>
>>
>