ArgusEncode32() accepts little endian?

Matt Brown matthewbrown at gmail.com
Wed Jul 17 10:43:20 EDT 2013


Thanks for your reply again.

If the afp.c definition for "AFP: DSI OpenSession detected." is as noted
previously, then the full ArgusEncode32() "output" string would be derived
as:
where data are assigned:
byte offset:    00 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11
data:         00 04 00 01 00       00 00             00 00 00 00 04 01


I think I assumed it to be the opposite placement (from the right;
subnetting got me), but I assume this was just a matter of not
understanding the order in which ArgusEncode32() generates the string.

Does the above look good?


Can you also describe how the string "encrypted" is used and when to use it?

Also, what would be a reasonable ("n") weight to give definitions from nDPI
or other protocol identification classes?  Or, let me ask that
differently... how is the ("n") weight used?  Does raservices() simply
consider the weight relative to other lines of the same "service" in the
.conf, or is weight considered via some threshold that considers the
algorithm used by rauserdata()?
I'm "randomly" guessing, if raservices() can say that a byte pattern that
is pulled from nDPI is matched, then with about _90% certainty_ it is this
service.  How can I express _90% certainty_ with a given "n" value?

Including derived definitions of byte patterns from protocol identification
classes plus the machine learning algo of rauserdata (and the user
tweaking) will make raservices() much more useful, in my opinion.


Looking forward to getting cracking on this.
I may also look into whether I can generate an raservices() conf file from
libprotoident (particularly after reading this study:
http://vbn.aau.dk/files/78068418/report.pdf).


Thanks Carter,

Matt

On Jul 16, 2013, at 7:01 PM, Carter Bullard <carter at qosient.com> wrote:

Hey Matt,
So, understand that your efforts can be described as trying to add
to the /usr/local/argus/std.sig file that we provide in the clients
distribution.

The std.sig file has real signatures for a number of common protocols.
To build your own signatures by hand, you need to understand what
the patterns actually mean.  Let's use one of the signatures for "imap"
as an example:

Service: imap   tcp port 143   n = 48745 src = "444F4E450D0A                    "
                                         dst = "        204F4B2049444C4520636F6D"

So the signature provides the service label, in this case "imap".
We expect the signature to be seen in tcp traffic going to port 143.
We processed 48745 imap connections, and we analyzed the first
16 bytes of the user buffers and found that the source presented a
bit pattern of 0x444F4E450D0A as the first bits in the sampled payload
of the connection; this pattern is ASCII "DONE\r\n". Bytes 7-16 were
variable, and so are not in the signature.

The destination in this case had sent payloads where the first 4 bytes
were variable, and bytes 5-16 were " OK IDLE com".

This is the most frequent pattern in the 32 imap payload signatures
that we have, representing about 60% of all user buffers captured for
imap traffic.
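
The way such a signature lines up against a payload can be sketched in C.
This is only an illustration, not the raservices() implementation: the
helper name sig_matches() is hypothetical, and it assumes that a pair of
blanks in the signature marks a variable byte, as the description above
suggests.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the argus clients.
 * Returns 1 if the leading bytes of `payload` agree with `sig`, a string
 * of hex digit pairs in which two blanks stand for a variable byte.
 * Assumes `sig` has an even length. */
int sig_matches(const unsigned char *payload, size_t len, const char *sig)
{
    size_t nbytes = strlen(sig) / 2;

    for (size_t i = 0; i < nbytes; i++) {
        const char *pair = sig + 2 * i;
        unsigned int want;

        if (pair[0] == ' ' && pair[1] == ' ')
            continue;               /* variable byte: anything matches */
        if (i >= len || sscanf(pair, "%2x", &want) != 1)
            return 0;               /* payload too short, or bad digits */
        if (payload[i] != want)
            return 0;               /* fixed byte differs */
    }
    return 1;
}
```

With the imap signature above, a source payload beginning "DONE\r\n" would
match the src string, and a destination payload with any 4 leading bytes
followed by " OK IDLE com" would match the dst string.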

These signatures are normally generated from argus data of the
service streams of interest, using the program rauserdata().  So one
of the best strategies is to run rauserdata() against your argus logs,
so that it can generate a starter signature file, and then by hand,
improve the signatures until you're happy.

The signatures that raservices() uses are rather special patterns that
represent the persistent bits seen in the user payload samples that argus
captures.  The best results are seen from signatures built from the first
16-32 bytes of the entire flow, but there is a great deal of benefit from
analyzing and comparing the samples of payload data that are captured
in the status records.

Remember, all data on the wire should be in network order, unless it's
unstructured, and then you should treat it as a bit stream, so there isn't
any endian-ness.
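
A small C sketch of what network order means for the values in question.
The function name wire_hex16() is made up for this illustration, and
htons() comes from the POSIX header <arpa/inet.h>. Whatever the host byte
order, htons() stores the high byte first in memory, so a value compared
against htons(0x0104) appears on the wire as the bytes 01 04.

```c
#include <arpa/inet.h>   /* htons(), POSIX */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the in-memory (wire order) bytes of
 * htons(v) as four hex digits into `out`, which must hold 5 chars. */
void wire_hex16(uint16_t v, char out[5])
{
    uint16_t net = htons(v);      /* host order -> network order */
    unsigned char b[2];

    memcpy(b, &net, sizeof b);    /* look at the bytes as stored */
    snprintf(out, 5, "%02X%02X", b[0], b[1]);
}
```

Calling wire_hex16(0x0004, buf) leaves "0004" in buf on both little endian
and big endian hosts, which is why the hex string in a signature can be
read straight off the htons()/htonl() constants.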

So for your example I would start with something like this:
   src = "0004000100              "


Carter


On Jul 16, 2013, at 10:17 AM, Matt Brown <matthewbrown at gmail.com> wrote:

Thanks for the reply, Carter.

Can you provide any assistance in relation to "translating" the values
given in nDPI classes to the character based hex strings needed for "src ="
and "dst ="?


For instance, if I take an example from afp.c (
https://svn.ntop.org/svn/ntop/trunk/nDPI/src/lib/protocols/afp.c), the
following qualifies "AFP: DSI OpenSession detected."

//from ndpi_protocols.h
https://svn.ntop.org/svn/ntop/trunk/nDPI/src/include/ndpi_protocols.h
#define get_u_int16_t(X,O)  (*(u_int16_t *)(((u_int8_t *)X) + O))
#define get_u_int32_t(X,O)  (*(u_int32_t *)(((u_int8_t *)X) + O))

// if the 16 bits starting at byte-offset 0 (meaning, bits 0 through 15) of
// the payload equal the 16 bit little endian "0x0004" and...
get_u_int16_t(packet->payload, 0) == htons(0x0004) &&
// if the 16 bits starting at byte-offset 2 (meaning, bits 16 through 31) of
// the payload equal the 16 bit little endian "0x0001" and...
get_u_int16_t(packet->payload, 2) == htons(0x0001) &&
// if the 32 bits starting at byte-offset 4 (meaning, bits 32-63) of the
// payload equal 0 and...
get_u_int32_t(packet->payload, 4) == 0 &&
// if the 32 bits at byte-offset 8 (meaning, bits 64-95) are the same as a
// 32-bit little endian value equal to the size of the packet minus 16
// [must be a check of sorts] and...
get_u_int32_t(packet->payload, 8) == htonl(packet->payload_packet_len - 16) &&
// if the 32 bits at byte-offset 12 (bits 96-127) equal 0 and...
get_u_int32_t(packet->payload, 12) == 0 &&
// if the 16 bits at byte-offset 16 (bits 128-143) equal htons(0x0104)
get_u_int16_t(packet->payload, 16) == htons(0x0104))


I've commented what I can see as the byte offsets of the given data.

So, I'd simply like to generate the "src = " and "dst = " from this
conditional.
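
One way to go from such a conditional to a signature string, sketched in C.
Nothing here is taken from the argus or nDPI sources: make_sig() and
afp_opensession_sig() are hypothetical names, the 16 byte signature width
is an assumption, and so is the use of blanks for bytes that vary (the
length field at offset 8). The 16 bit value at offset 16 falls outside a
16 byte signature and is left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: renders `n` bytes as hex pairs into `out`, which
 * must hold 2*n + 1 chars. Bytes whose `known` flag is 0 are written as
 * two blanks (assumed here to mean "variable" in a signature). */
void make_sig(const unsigned char *bytes, const unsigned char *known,
              size_t n, char *out)
{
    for (size_t i = 0; i < n; i++) {
        if (known[i])
            snprintf(out + 2 * i, 3, "%02X", bytes[i]);
        else
            memcpy(out + 2 * i, "  ", 2);
    }
    out[2 * n] = '\0';
}

/* First 16 payload bytes implied by the afp.c test, in wire order. */
void afp_opensession_sig(char out[33])
{
    const unsigned char bytes[16] = {
        0x00, 0x04,     /* htons(0x0004) at offset 0 */
        0x00, 0x01,     /* htons(0x0001) at offset 2 */
        0, 0, 0, 0,     /* 32 bit zero at offset 4 */
        0, 0, 0, 0,     /* length field at offset 8: varies per packet */
        0, 0, 0, 0      /* 32 bit zero at offset 12 */
    };
    const unsigned char known[16] = {1,1,1,1, 1,1,1,1, 0,0,0,0, 1,1,1,1};

    make_sig(bytes, known, 16, out);
}
```

The resulting string has 32 characters: the hex digits for offsets 0-7,
eight blanks for the variable length field, then the digits for offsets
12-15.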


I had some assistance reviewing ArgusEncode32() and it was explained that
it looks at a ptr for binary data and "outputs" that data in a string of
hex.
Knowing that 0x0004, as it is expressed in the nDPI class, is little
endian...
- I believe that if I were to execute ArgusEncode32() with a pointer to
data that can be expressed as hex 0x0004, it would output the string
"00000004".
- I could then use this to build an effective "src = " line for an
raservices.conf file.
Are these two assumptions correct?

With this technique, do you think it's reasonable to generate an
raservices.conf from all the conditionals included in the nDPI classes?


Thanks,

Matt



On Jul 15, 2013, at 7:18 PM, Carter Bullard <carter at qosient.com> wrote:

The ArgusEncode32() printer works on a character basis, so there
isn't any notion of big endian or little endian.

The "n=" is how many samples were used to generate the signature.
We rank them by "n", as a weight for the probability of encountering
that particular pattern.
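
As a rough C illustration of ranking by "n": the struct and function names
below are invented for the example and are not how raservices() stores its
signatures. The sketch sorts signatures so the largest sample count comes
first, and computes each signature's share of all samples for a service.

```c
#include <stdlib.h>

/* Hypothetical record: one signature line and its sample count. */
struct sig {
    const char *service;
    unsigned long n;
};

/* qsort comparator: larger n sorts first. */
static int by_n_desc(const void *a, const void *b)
{
    const struct sig *x = a, *y = b;
    return (x->n < y->n) - (x->n > y->n);
}

/* Orders the signatures from most to least frequently observed. */
void rank_sigs(struct sig *s, size_t count)
{
    qsort(s, count, sizeof *s, by_n_desc);
}

/* Fraction of all samples accounted for by signature i (0.0 if none). */
double sig_share(const struct sig *s, size_t count, size_t i)
{
    unsigned long total = 0;

    for (size_t k = 0; k < count; k++)
        total += s[k].n;
    return total ? (double)s[i].n / (double)total : 0.0;
}
```

Under this reading, "n" is relative to the other signatures of the same
service: a signature built from 600 of 1000 samples would carry a share of
about 0.6.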

Carter

On Jul 15, 2013, at 1:15 PM, Matt Brown <matthewbrown at gmail.com> wrote:

Carter,


Hope all is well.  Last Thursday I started to look into reversing the
nDPI classes and creating an raservices() conf file from the byte
pattern classification definitions therein.

I struggled to understand the C notation, etc., but have arrived at the
question of whether or not ArgusEncode32() takes a little endian data
value as input and "outputs" this data expressed as a string made up
of its value in hex.

For instance, if I take a value from afp.c (within nDPI) and see
htons(0x0004), I can assume that when converted with ArgusEncode32(),
the "output" will be "00000004".

Out of this, I can then generate the "src=" or "dst=" portions of a
line for an raservices() conf file.

Is this correct?

Additionally, as for the syntax of the raservices() conf file, what
does the "n=" value mean?

Thanks,

Matt