rapath
Carter Bullard
carter at qosient.com
Tue Apr 28 17:05:57 EDT 2009
Gentle people,
I have finished a large part of the path/topology work in this round
of argus, and I'd like to describe it here, to get your feedback/response.
One of the key features of argus() is that it tracks Type-P1-P2 flows.
This means that argus()'s bi-directional flow model allows it to match
packets (P1) of one type of flow (src addr, dst addr, proto, src port,
dst port) with packets from another type of flow (P2).
One benefit of this design is that argus() can match ICMP packets (P2)
to the flows that they relate to (P1). I use this basic function in all
types of analytics, such as reachability failure analysis, policy
verification, discovery detection and advanced path assurance measures,
just to name a few.
Because of this simple mechanism, argus() ends up capturing any traffic
that can function as a traceroute() in the network, regardless of the
protocol used. This is the basis of argus()'s Discovery Detection
Capabilities.
For the novice, traceroute() works by sending IP packets with restricted
Time-to-Live (TTL) values in the IP header. By limiting the TTL,
traceroute() attempts to get intermediate nodes to generate ICMP Time
Exceeded messages, thus revealing that they were on the path the packet
took. The reality is that any kind of packet that can cause an
intermediate node to generate any ICMP message can act as a traceroute().
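To make the TTL mechanism concrete, here is a minimal Python sketch that simulates it without sending any packets. The function names `probe` and `discover` and the router list are illustrative assumptions (the addresses are borrowed from the first hops shown later in this message); this is not how traceroute() or argus() is implemented.

```python
# Simulate TTL-limited probes along a fixed list of routers.
# No packets are sent; this only models the TTL decrement rule.

def probe(path, dst, ttl):
    """Return (responder, icmp_type) for a probe sent with the given TTL.

    Each router decrements the TTL; the router that takes it to zero
    drops the packet and answers with ICMP Time Exceeded (type 11).
    """
    for router in path:
        ttl -= 1
        if ttl == 0:
            return router, 11      # ICMP Time Exceeded
    return dst, None               # probe reached the destination


def discover(path, dst, max_ttl=30):
    """Raise the TTL one hop at a time, recording who answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, icmp_type = probe(path, dst, ttl)
        if icmp_type is None:      # destination reached, stop probing
            break
        hops.append((ttl, responder))
    return hops


# Hypothetical three-router path toward the destination.
routers = ["192.168.0.1", "10.22.32.1", "208.59.246.1"]
for ttl, node in discover(routers, "193.0.14.129"):
    print(ttl, node)
```

Each (TTL, responder) pair corresponds to what the message calls an intermediate node at a given hop distance.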
Argus "maps" ICMP messages to their parent flow, and stores relevant ICMP
information in the parent flow. Flows that contain mapped ICMP
information are referred to as "icmpmap". These flows have an 'I' in the
flags field. A lot of information is stored, including the address of
the router/host that generated the ICMP message (the "inode", for
intermediate node). You can print, filter, aggregate, sort, etc. on the
"inode" field, and with this info and argus() TTL information, programs
like rapath() can extract network path information from an argus data
set.
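The mapping is possible because an ICMP error message quotes the IP header and the first bytes of the packet that triggered it. The following Python sketch shows one way to recover the parent flow's 5-tuple from that quoted data. The function names, the hypothetical source host 192.168.0.68, the port numbers and the dictionary used as a flow table are all assumptions for illustration; argus() itself has its own record format and code.

```python
import socket
import struct


def parent_flow_key(icmp_message):
    """Recover the 5-tuple of the packet quoted inside an ICMP error."""
    quoted = icmp_message[8:]                 # skip the 8-byte ICMP header
    ihl = (quoted[0] & 0x0F) * 4              # quoted IP header length, bytes
    proto = quoted[9]
    src = socket.inet_ntoa(quoted[12:16])
    dst = socket.inet_ntoa(quoted[16:20])
    # TCP and UDP headers both begin with source port, destination port.
    sport, dport = struct.unpack("!HH", quoted[ihl:ihl + 4])
    return (src, dst, proto, sport, dport)


def build_example():
    """Build a hypothetical Time Exceeded message quoting a UDP probe."""
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 28, 0, 0,        # version/IHL, TOS, total length, id, frag
        1, 17, 0,                 # TTL, protocol (17 = UDP), checksum unset
        socket.inet_aton("192.168.0.68"),
        socket.inet_aton("193.0.14.129"),
    )
    udp = struct.pack("!HHHH", 40000, 33434, 8, 0)
    icmp = struct.pack("!BBHI", 11, 0, 0, 0)  # type 11 = Time Exceeded
    return icmp + ip + udp


def map_icmp(flows, icmp_source, icmp_message):
    """Attach the ICMP sender (the inode) to the flow it refers to."""
    key = parent_flow_key(icmp_message)
    record = flows.setdefault(key, {"flags": "", "inode": None})
    record["flags"] = "I"          # mark the flow as carrying mapped ICMP
    record["inode"] = icmp_source  # router/host that generated the ICMP
    return key


flows = {}
key = map_icmp(flows, "192.168.0.1", build_example())
print(key, flows[key])
```

The sketch handles only TCP/UDP-style quoted packets, where the first four bytes after the IP header are the two port numbers.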
You can run programs like racluster() and get some interesting
information. This example will give you the list of intermediate nodes
that are discoverable using traceroute-like technology.
racluster -m inode sttl -r file -w /tmp/inodes.out - icmpmap
and read the data like this:
rasort -m sttl trans -s stime dur inode sttl avgdur stddev mindur trans -r /tmp/inodes.out
This will give you the list of intermediate nodes that are discoverable
from your network traffic, sorted by hop distance. You will want to
filter out anything that is not suited to what you want to do with the
info.
Here are just a few entries in one of my files:
rasort -m sttl trans -s inode sttl avgdur stddev mindur trans -r /tmp/ralabel.test
Inode            sTtl  AvgDur    StdDev    MinDur    Trans
192.168.0.1      1     0.000628  0.000196  0.000308  935
10.22.32.1       2     0.008108  0.005356  0.005194  969
208.59.246.1     3     0.016620  0.033358  0.006429  969
207.172.19.110   4     0.017827  0.035872  0.006419  729
207.172.19.100   4     0.065243  0.084057  0.006799  141
207.172.19.106   4     0.037227  0.064243  0.007110  99
4.71.190.9       5     0.013927  0.020931  0.006442  354
207.172.15.90    5     0.009277  0.002668  0.006919  60
207.172.9.74     5     0.010054  0.004764  0.007315  52
198.32.160.53    5     0.010531  0.006510  0.006918  51
207.172.15.74    5     0.009078  0.002404  0.006923  48
198.32.160.185   5     0.020439  0.044126  0.007081  36
So we've got a number of routers. I traceroute a lot, so these are the
first 5 hops for a diverse set of paths that my traffic takes to get to
the Internet.
With different options to racluster(), you can get data sets that will
tell you which IP addresses discovered these intermediate node addresses.
The mindur is the most interesting metric here, as it is closest to
the actual RTT.
The average duration is interesting, but not very useful.
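The numbers above hint at why: for several hops the StdDev is larger than the AvgDur, which suggests a few slow responses are pulling the average up. A common explanation is that routers generate ICMP replies at low priority, so individual probes can be delayed well beyond the path's round trip. The short Python sketch below uses made-up sample durations (not taken from any argus file) to show the effect.

```python
from statistics import mean, pstdev

# Hypothetical probe durations in seconds for one hop: mostly about
# 6.5 ms, plus two probes that the router answered late.
samples = [0.0064, 0.0066, 0.0065, 0.0067, 0.0910, 0.0064, 0.1200]

print("min  %.6f" % min(samples))     # least-delayed probe, closest to RTT
print("avg  %.6f" % mean(samples))    # inflated by the two late replies
print("std  %.6f" % pstdev(samples))  # larger than the mean here
```

With these samples the minimum stays at the typical value while the average is several times larger, which is the pattern visible in the table.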
Using these features and basic aggregation principles, rapath() is an
example program that uses the ICMP information argus() captures to find
the traceroute sessions in an argus data file. It outputs the path that
it can assemble from the data. In conjunction with the GeoIP database,
you can also get the AS information for the path.
OK, so let's get to it. rapath() is designed to give you a graph of a
single path, and a legend of the nodes in the path. The legend has the
RTT of the traceroute, so you could draw the graph to scale if you liked.
Running rapath() to find a path to k.root-servers.net, against data that
has been labeled with the originAS numbers from the GeoIP database using
ralabel(), I can get output like this:
../bin/rapath -r /tmp/test.out -nn - dst host 193.0.14.129
A -> B -> AS6079:[C -> D -> (E,F) -> G -> H -> I -> J] ->
(AS2914:K,AS6079:L) -> AS6461:[(M,N) -> (O,P) -> Q]
Node  SrcAddr      Dir  DstAddr       Inode            iAS   sTtl  AvgDur    StdDev    MaxDur    MinDur    Trans
A     192.168.0.0  ->   193.0.14.129  192.168.0.1            1     0.000512  0.000091  0.000708  0.000419  15
B     192.168.0.0  ->   193.0.14.129  10.22.32.1             2     0.007507  0.000717  0.008441  0.005866  15
C     192.168.0.0  ->   193.0.14.129  208.59.246.1     6079  3     0.012698  0.009256  0.038924  0.007430  15
D     192.168.0.0  ->   193.0.14.129  207.172.19.100   6079  4     0.009786  0.003162  0.019378  0.007429  15
E     192.168.0.0  ->   193.0.14.129  207.172.19.214   6079  5     0.008651  0.000806  0.010263  0.006946  12
F     192.168.0.0  ->   193.0.14.129  207.172.19.99    6079  5     0.011959  0.005769  0.020116  0.007748  3
G     192.168.0.0  ->   193.0.14.129  207.172.19.6     6079  6     0.011071  0.001721  0.014924  0.008430  15
H     192.168.0.0  ->   193.0.14.129  207.172.19.9     6079  7     0.014991  0.002864  0.022407  0.011930  15
I     192.168.0.0  ->   193.0.14.129  207.172.19.192   6079  8     0.046470  0.066420  0.206710  0.011913  15
J     192.168.0.0  ->   193.0.14.129  207.172.19.205   6079  9     0.025824  0.027325  0.122671  0.014935  14
K     192.168.0.0  ->   193.0.14.129  206.223.115.86   2914  10    0.015746  0.001164  0.017947  0.013613  12
L     192.168.0.0  ->   193.0.14.129  207.172.9.66     6079  10    0.020273  0.006145  0.028944  0.015435  3
M     192.168.0.0  ->   193.0.14.129  64.125.26.237    6461  11    0.016422  0.001180  0.018782  0.014858  12
N     192.168.0.0  ->   193.0.14.129  64.125.26.241    6461  11    0.019062  0.002344  0.022377  0.017368  3
O     192.168.0.0  ->   193.0.14.129  64.125.26.173    6461  12    0.016568  0.001133  0.019412  0.014915  12
P     192.168.0.0  ->   193.0.14.129  64.125.27.166    6461  12    0.090722  0.001558  0.092909  0.089403  3
Q     192.168.0.0  ->   193.0.14.129  64.125.31.185    6461  13    0.092419  0.008442  0.117515  0.087780  12
So what have we got?
A -> B -> AS6079:[C -> D -> (E,F) -> G -> H -> I -> J] ->
(AS2914:K,AS6079:L) -> AS6461:[(M,N) -> (O,P) -> Q]
The path starts with "A -> B", but these nodes are private, so there are
no originAS numbers for them. The next hops are
"B -> AS6079:[C -> D -> (E,F) -> G -> H -> I -> J]". So, B forwards its
data to AS6079, which uses C, D, E, F, G, H, I, and J to forward the
traffic. At hop 5, the ISP used either node E or F. At hop 10, data
either left AS6079 to go to node K, in AS2914, or it stayed in AS6079,
going to node L. By hop 11, all the traffic had found its way to AS6461,
which is where the destination sits (at least this destination).
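For readers who want to experiment with the notation, here is a small Python sketch that rebuilds the path line from the node labels, AS numbers and hop distances in the legend. It is my reading of the notation only, with hypothetical names (`NODES`, `tag`, `render_path`); it is not rapath()'s actual algorithm, which may group nodes differently.

```python
from itertools import groupby

# (label, origin AS or None, hop distance), copied from the legend.
NODES = [
    ("A", None, 1), ("B", None, 2), ("C", 6079, 3), ("D", 6079, 4),
    ("E", 6079, 5), ("F", 6079, 5), ("G", 6079, 6), ("H", 6079, 7),
    ("I", 6079, 8), ("J", 6079, 9), ("K", 2914, 10), ("L", 6079, 10),
    ("M", 6461, 11), ("N", 6461, 11), ("O", 6461, 12), ("P", 6461, 12),
    ("Q", 6461, 13),
]


def tag(label, asn):
    """Prefix a label with its AS, or leave it bare if it has none."""
    return label if asn is None else "AS%d:%s" % (asn, label)


def render_path(nodes):
    def by_ttl(node):
        return node[2]

    # Step 1: collect the alternative nodes seen at each hop distance.
    hops = []
    for _, members in groupby(sorted(nodes, key=by_ttl), key=by_ttl):
        members = list(members)
        labels = [m[0] for m in members]
        asns = {m[1] for m in members}
        if len(asns) == 1:
            # Every node at this hop is in the same AS (or has none).
            text = labels[0] if len(labels) == 1 else "(%s)" % ",".join(labels)
            hops.append((asns.pop(), text))
        else:
            # Mixed hop: tag each alternative with its own AS.
            text = "(%s)" % ",".join(tag(m[0], m[1]) for m in members)
            hops.append(("mixed", text))

    # Step 2: bracket runs of consecutive hops that share one AS.
    parts = []
    for asn, run in groupby(hops, key=lambda hop: hop[0]):
        texts = [hop[1] for hop in run]
        if asn is None or asn == "mixed":
            parts.extend(texts)
        else:
            parts.append("AS%d:[%s]" % (asn, " -> ".join(texts)))
    return " -> ".join(parts)


print(render_path(NODES))
```

Run against the legend data, this reproduces the path line shown above, which at least confirms the reading of "(E,F)" as alternatives at one hop and "AS6079:[...]" as a run of hops inside one AS.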
OK, there is a lot more to this, and there are bugs, so, if you find
this interesting, please send email to the list. I'll try to assemble a
manpage for rapath() soon, but I wanted to get some feedback first, to
see if there was interest.
Hope all is most excellent,
Carter