normalized appbyte ratio for producer/consumer relationship
John Gerth
gerth at graphics.stanford.edu
Mon May 6 18:34:12 EDT 2013
Yes. An abr of 0.0 can have multiple meanings even without using -0.0,
since you get 0 whenever s = d (sappbytes equal to dappbytes), even if both
are large. The use of -0.0 is perhaps a bit cute and, as you point out, since
IEEE 754 requires 0.0 == -0.0 in all relational tests, one has to use
signbit() to disambiguate in C.
Even so, I think your example shows that abr has good potential as a metric.
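To make the signed-zero point concrete, here is a minimal sketch in Python
rather than C. It is not argus code: the function names abr() and
has_no_appbytes() are hypothetical, the -0.0 sentinel follows Carter's
description of the "no appbytes" case, and math.copysign() stands in for
C's signbit().

```python
import math


def abr(sappbytes: int, dappbytes: int) -> float:
    """Normalized appbyte ratio: (s - d) / (s + d).

    Hypothetical helper. Returns -0.0 as a sentinel when no appbytes
    were seen at all, mirroring the convention described in this thread.
    """
    total = sappbytes + dappbytes
    if total == 0:
        return -0.0  # sentinel: no application bytes in either direction
    return (sappbytes - dappbytes) / total


def has_no_appbytes(value: float) -> bool:
    """True only for the -0.0 sentinel.

    A plain comparison cannot do this, because -0.0 == 0.0 is True under
    IEEE 754. copysign() exposes the sign bit of the zero instead.
    """
    return value == 0.0 and math.copysign(1.0, value) < 0.0
```

With this, abr(500, 500) and abr(0, 0) both compare equal to 0.0, but only
the second one makes has_no_appbytes() return True. The balanced case
(s = d with real traffic) still needs to be told apart from "no data" by
the sign bit, which is the ambiguity discussed above.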
John Gerth gerth at graphics.stanford.edu Gates 378
On 5/6/2013 2:09 PM, Carter Bullard wrote:
> Hey John,
> OK, so there is a problem, in that IEEE floating point has ( -0.0 == 0.0 )
> by definition. So, I fixed it in my compiler, but others are going to have
> issues when needing to discriminate between 0.0 and -0.0.
>
> So, with the new argus-clients, you can filter for any value for abr, using
> any of the ra* programs. Still have to work on graphing abr, but I'll get
> to that later tonight.
>
> Here is the abr behavior for every DNS request here at QoSient World HQ
> for 2013 that wasn't a ServFail error:
>
> thoth:tmp carter$ rahisto -H abr 10:-1.0-1.0 -r argus*domain* -s mean stddev - src pkts 1 and dst pkts 1 and not abr 0.0
> N = 1009841   mean = -0.739084   stddev = 0.102829   max = -0.162791   min = -0.909605
>               median = -0.749129   95% = -0.653846
>               mode = -0.782609
> Class       Interval     Freq   Rel.Freq   Cum.Freq       Mean    StdDev
>     1  -1.000000e+00   225379   22.3183%   22.3183%  -0.815238  0.000650
>     2  -8.000000e-01   738148   73.0955%   95.4137%  -0.740887  0.043837
>     3  -6.000000e-01    10553    1.0450%   96.4588%  -0.534672  0.048717
>     4  -4.000000e-01    35511    3.5165%   99.9752%  -0.283067  0.021374
>     5  -2.000000e-01      250    0.0248%  100.0000%  -0.162791  0.000051
>     6   0.000000e+00        0    0.0000%  100.0000%
>     7   2.000000e-01        0    0.0000%  100.0000%
>     8   4.000000e-01        0    0.0000%  100.0000%
>     9   6.000000e-01        0    0.0000%  100.0000%
>    10   8.000000e-01        0    0.0000%  100.0000%
>
>
> So, anything positive would be a behavioral anomaly, from the perspective of this host.
>
> ra -r file.from.the.host - udp and port domain and src pkts 1 and dst pkts 1 and abr gt 0.0
>
> and this would pick out candidates for DNS server availability errors:
>
> ra -r file.from.the.host - udp and port domain and src pkts 1 and dst pkts 1 and abr eq 0.0
>
> Of course, you can do an analysis of every service and get a rather interesting set of
> "what is normal" behaviors using this simple type of metric. For those services where
> the abr is always positive or always negative, seeing a shift to the other side can
> indicate events that should be of interest.
>
> Hope this is helpful,
>
> Carter
>
>
> On May 6, 2013, at 2:41 PM, Carter Bullard <carter at qosient.com> wrote:
>
>> Hey John,
>> If no appbytes, currently we return -0.0, but the library knows if there are
>> appbytes or not, so we can return nada, when printing out the values.
>> Right now, when using xml format, you won't get a value.
>>
>> Having problems getting my compiler to tell the difference between 0.0 and -0.0,
>> but should hopefully have this working by this afternoon.
>>
>> Carter
>>
>>
>> On May 6, 2013, at 2:37 PM, John Gerth <gerth at graphics.stanford.edu> wrote:
>>
>>> Nice example. I'm looking forward to using this.
>>>
>>> As your example shows, this metric is available for any existing argus
>>> files that were created containing appbyte values. I'm assuming that if
>>> the sensor wasn't configured to capture those, 'abr' is not available.
>>>
>>>
>>> --
>>> John Gerth gerth at graphics.stanford.edu Gates 378 (650) 725-3273 fax 725-6949
>>>
>>> On 5/6/2013 11:19 AM, Carter Bullard wrote:
>>>> Hey John,
>>>> OK, so I've implemented "abr" as a new metric, using our normalized equation:
>>>>
>>>> abr = (sappbytes - dappbytes)/(sappbytes + dappbytes)
>>>>
>>>> This generates values between -1.0 and +1.0. A value of +1.0 means that all the
>>>> app bytes were from the source, indicating that the source is a pure PRODUCER, and
>>>> the destination is a pure CONSUMER. You see this in FTP PUT file transfers,
>>>> as an example. The sign bit reverses this relationship.
>>>>
>>>> -0.0 denotes the special case, when there are no appbytes seen.
>>>>
>>>> In the new argus-clients that I'll put up later today, you can print this out using:
>>>>
>>>> ra -r argus.data -s +abr
>>>>
>>>> You can also do operations using this metric, such as filtering and generating histograms.
>>>> Here is a run that I did to show how this may be used in an anomaly detection
>>>> application. Here is the simple frequency distribution for all the internal DNS
>>>> requests made to my local DNS server from a specific client, for all of 2013:
>>>>
>>>> thoth:06 carter$ pwd
>>>> /Volumes/Data/Archive/QoSient/192.168.0.68/2013
>>>> thoth:tmp carter$ rahisto -H abr 10:-1.0-1.0 -R . -s mean stddev - udp port domain and src pkts 1 and dst pkts 1
>>>> N = 1027764   mean = -0.726195   stddev = 0.140532   max = 0.000000   min = -0.909605
>>>>               median = -0.749129   95% = -0.292517
>>>>               mode = -0.782609
>>>> Class       Interval     Freq   Rel.Freq   Cum.Freq       Mean    StdDev
>>>>     1  -1.000000e+00   225379   21.9291%   21.9291%  -0.815238  0.000650
>>>>     2  -8.000000e-01   738148   71.8208%   93.7498%  -0.740887  0.043837
>>>>     3  -6.000000e-01    10553    1.0268%   94.7766%  -0.534672  0.048717
>>>>     4  -4.000000e-01    35511    3.4552%   98.2318%  -0.283067  0.021374
>>>>     5  -2.000000e-01      250    0.0243%   98.2561%  -0.162791  0.000051
>>>>     6   0.000000e+00    17923    1.7439%  100.0000%   0.000000  0.000000
>>>>     7   2.000000e-01        0    0.0000%  100.0000%
>>>>     8   4.000000e-01        0    0.0000%  100.0000%
>>>>     9   6.000000e-01        0    0.0000%  100.0000%
>>>>    10   8.000000e-01        0    0.0000%  100.0000%
>>>>
>>>>
>>>> OK, it should be very clear that my host is a net CONSUMER of DNS data, not a net PRODUCER,
>>>> because abr <= 0. The corollary holds true: the local DNS service is a net PRODUCER of
>>>> data, and not a net CONSUMER of data, from the perspective of this particular end system.
>>>> So testing filters like this:
>>>> ra -r daily.file - abr gt 0 and port domain and src pkts 1 and dst pkts 1
>>>>
>>>> Should reveal flows that deserve a closer look.
>>>>
>>>> OK, there were a lot of flows where ( abr == 0 ), which was surprising.
>>>> When DNS experiences a ServFail, the response is the same as the request, just with an error bit
>>>> set in the DNS header. QoSient had a big issue in January 2013, when 17923 DNS ServFail failures
>>>> occurred, so that is where the ( abr == 0 ) flows came from. It is important to know this when
>>>> evaluating DNS as a channel for CONSUMER to PRODUCER conversion.
>>>>
>>>> But for DNS health and operability, flows where ( sappbytes == dappbytes ) are
>>>> also a pretty interesting thing to look for.
>>>>
>>>> Hope this is helpful,
>>>>
>>>> Carter
>>>>
>>>> Carter Bullard
>>>> CEO/President
>>>> QoSient, LLC
>>>> 150 E. 57th Street Suite 12D
>>>> New York, New York 10022
>>>>
>>>> +1 212 588-9133 Phone
>>>> +1 212 588-9134 Fax
>>>>
>>>
>>
>