Feature request: grep hex strings with -e

Dave Edelman dedelman at iname.com
Fri Oct 12 17:26:03 EDT 2012


The PCRE library is available and it comes with several wrappers, one of
which is a drop-in replacement for the POSIX regex functions.

Off-list, I exchanged emails with Carter, and as an experiment I tried it. I
had very little trouble getting this to work with the latest set of clients.
A few things took a bit of adjustment and one thing requires more work, but
examples show it best.

I am using ra(1) from argus-clients-3.0.7.3, but all the clients that support
-e use the same code. The version of PCRE is 8.31. I have this running on
FC14 and FC17; I will try it on the Mac over the weekend.

The simple case was a test that nothing got broken. Finding a plain ASCII
string was no problem:

ra -M printer="hex"  -r argus.out -s suser:60 -e 'DevMgmt'
 
 
      0x0000     4745 5420 2f44 6576 4d67 6d74 2f50 726f        GET./DevMgmt/Pro
      0x0010     6475 6374 5374 6174 7573 4479 6e2e 786d        ductStatusDyn.xm
      0x0020     6c20 4854 5450 2f31 2e31 0d0a 486f 7374        l.HTTP/1.1..Host
      0x0030     3a20 6c6f 6361 6c68 6f73 740d                  :.localhost.
 
 
Next I checked how well it handled escaped hex values. Again, no problem.

ra -M printer="hex"  -r argus.out -s suser:60 -e 'DevMgmt.*\x78\x6d'
 
 
      0x0000     4745 5420 2f44 6576 4d67 6d74 2f50 726f        GET./DevMgmt/Pro
      0x0010     6475 6374 5374 6174 7573 4479 6e2e 786d        ductStatusDyn.xm
      0x0020     6c20 4854 5450 2f31 2e31 0d0a 486f 7374        l.HTTP/1.1..Host
      0x0030     3a20 6c6f 6361 6c68 6f73 740d                  :.localhost.
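
The same expression can be sanity-checked outside ra. This is only a sketch:
it uses Python's re module on bytes (not PCRE and not the argus code), and the
payload is copied by hand from the dump above.

```python
import re

# User-data bytes copied from the hex dump above (60 bytes).
payload = bytes.fromhex(
    "4745 5420 2f44 6576 4d67 6d74 2f50 726f"
    "6475 6374 5374 6174 7573 4479 6e2e 786d"
    "6c20 4854 5450 2f31 2e31 0d0a 486f 7374"
    "3a20 6c6f 6361 6c68 6f73 740d"
)

# Same expression as on the ra command line: \x78\x6d is the two bytes "xm".
pattern = re.compile(rb"DevMgmt.*\x78\x6d")
match = pattern.search(payload)

print(match.start(), match.group())
```

The match begins at offset 5, right after "GET /", and ends on the "xm" of
".xml".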
 

Getting a bit more ambitious, I attempted to cross the 7-bit to 8-bit value
line, and that wasn't so good.

This is in the file, and it includes hex values greater than 7F:
 
      0x0000     230f 06fb 0000 0000 0000 0000 0000 0000        #...............
      0x0010     0000 0000 0000 0000 0000 0000 0000 0000        ................
      0x0020     0000 0000 0000 0000 d422 b1ce 9a1d a800        ........."......
 
ra -M printer="hex"  -r argus.out -s suser:60 -e '\xd4\x22\xb1\xce\x9a\x1d\xa8\x00'

The search fails.
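
As a cross-check of the expression itself (not of PCRE), the same escapes can
be tried against the bytes of the dump with a byte-oriented matcher. A sketch
using Python's re module on bytes, with the payload copied by hand from the
dump above:

```python
import re

# The 48 bytes shown in the dump above.
payload = bytes.fromhex(
    "230f 06fb 0000 0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 d422 b1ce 9a1d a800"
)

# Same escapes as on the ra command line, including the trailing \x00.
pattern = re.compile(rb"\xd4\x22\xb1\xce\x9a\x1d\xa8\x00")
match = pattern.search(payload)

print(match.start(), match.end())
```

A byte-oriented search finds the sequence at offset 40 (0x28), which suggests
the expression is sound and the failure lies elsewhere.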

A bit of RTFM and I learned that when you build PCRE, you must pass the
--enable-utf option to ./configure. By this time my test file was archived,
so I had to look for something else.


ra -M printer="hex"  -r argus.out -s suser:1000 -e '\xff.*\x98'
 
 
      0x0000     b1f0 9bdb 4ab3 0c83 5980 8531 0271 c099        ....J...Y..1.q..
      0x0010     109c ff33 3bb8 db57 dad4 ff77 541d 9eec        ...3;..W...wT...
      0x0020     9e14 9027 263d c6aa 4c73 95cb 670b 29f0        ...'&=..Ls..g.).
      0x0030     066e e3c5 d3b5 5083 a945 317c e026 cb7d        .n....P..E1|.&.}
      0x0040     d88f bbfb 75fb e1b9 2ef3 4ab9 01a0 b4bc        ....u.....J.....
      0x0050     7063 0f20 4476 0c9f b231 08b6 9d74 aecb        pc..Dv...1...t..
      0x0060     2d09 69ed b4b5 6956 6e65 e9d7 5e3c 764f        -.i...iVne..^<vO
      0x0070     32fd 6961 0ecb 1e2c 16ad 38f9 4a95 86f9        2.ia...,..8.J...
      0x0080     8815 d697 d19f 3a49 df52 829c 98c5 adb9        ......:I.R......
      0x0090     0278 5bf0 8859 5de7 322e 18dc 7167 df9b        .x[..Y].2...qg..
      0x00a0     4fd0 308f f52c 8c22 68fe 748f 0f63 d215        O.0..,."h.t..c..
      0x00b0     9681 664b 4ecc bef6 f18e cc15 8dce 4caf        ..fKN.........L.
      0x00c0     e20f 8d07 7c67 31db 2759 f360 596e 1194        ....|g1.'Y.`Yn..
      0x00d0     1e2c 6909 b003 731e 6909 eedb 3bd6 92fe        .,i...s.i...;...
      0x00e0     8a18 6cd8 800c 8643 25af 2655 6c4f f0c9        ..l....C%.&UlO..
      0x00f0     ba76 b638 601e 2787 17a6 69d5 a773 5894        .v.8`.'...i..sX.
      0x0100     302a 4085 a1d8 4153 31e0 358a 0b9a 2f65        0*@...AS1.5.../e
      0x0110     933b 4bad ae63 d0a1 e7d6 70bf 2df8 15c2        .;K..c....p.-...
      0x0120     fcc5 fa4b 4ed3 5d56 aff0 424f 6e71 5901        ...KN.]V..BOnqY.
      0x0130     8368 e4f7 1b3f f7db 31d3 8c0e ccdd d734        .h...?..1......4
      0x0140     893b 2c0b 0b14 356b 315e e9bb d538 d013        .;,...5k1^...8..
      0x0150     8f8c d328 65ec 488a f516 abe3 9e2a 82c2        ...(e.H......*..
      0x0160     9460 e36f 1177 3609 e6bd f241 1a4b 3571        .`.o.w6....A.K5q
      0x0170     f7eb 1fc0 f938 28ac ac3e 56de b576 971e        .....8(..>V..v..
      0x0180     46a0 d101 c052 b98a 1fc2 0c0b 56af 71b1        F....R......V.q.
      0x0190     e8ac 1bc0 f6f5 42b0 a988 ce50 ff76 13b8        ......B....P.v..
      0x01a0     e2cf 436e e674 651b a587 b22f ae4d 0689        ..Cn.te..../.M..
      0x01b0     0788 da56 960e 38a7 bd3c 0d24 63ef 7c61        ...V..8..<.$c.|a

Now that it worked, I wanted to see something a bit more complex. The regex
in this case is a very accurate IPv4 address recognizer for a specific class
of addresses. In this case I was also testing multi-line searches: the
CACHE-CONTROL request header is separated from the IP address in the
LOCATION header by a \x0d\x0a indicating a new line. The regexp uses
alternation, grouping without capture, repeat counts, escaped characters and
more than a bit of stamina :)


ra -r anonargus.2012.10.12.17.00.01.0.gz -M printer="hex" -s + suser:1000\
 -e 'CACHE.*(?:(?:1[0-9][0-9]\.)|(?:[1-9][0-9]\.)|(?:[1-9]\.)|(?:20[0-9]|21[0-9]|22[0-3]\.))(?:(?:1[0-9][0-9]\.)|(?:[1-9][0-9]\.)|(?:[0-9]\.)|(?:25[0-5])|(?:2[0-4][0-9]\.)){2}(?:(?:1[0-9][0-9])|(?:[1-9][0-9])|(?:[1-9])|(?:25[0-4]|2[0-4][0-9]))'
Fri 2012-10-12 17:59:36.615097  e                   udp 1.0.2.13.23403                   ->          224.0.2.1.39787 1        428   INT

      0x0000     4e4f 5449 4659 202a 2048 5454 502f 312e        NOTIFY.*.HTTP/1.
      0x0010     310d 0a48 4f53 543a 2032 3339 2e32 3535        1..HOST:.239.255
      0x0020     2e32 3535 2e32 3530 3a31 3930 300d 0a43        .255.250:1900..C
      0x0030     4143 4845 2d43 4f4e 5452 4f4c 3a20 6d61        ACHE-CONTROL:.ma
      0x0040     782d 6167 653d 3130 300d 0a4c 4f43 4154        x-age=100..LOCAT
      0x0050     494f 4e3a 2068 7474 703a 2f2f 3136 392e        ION:.http://169.
      0x0060     3235 342e 362e 3636 3a34 3931 3532 2f6e        254.6.66:49152/n
      0x0070     6173 6465 7669 6365 2e78 6d6c 0d0a 4e54        asdevice.xml..NT
      0x0080     3a20 7572 6e3a 7363 6865 6d61 732d 6d69        :.urn:schemas-mi
      0x0090     6372 6f73 6f66 742d 636f 6d3a 7365 7276        crosoft-com:serv
      0x00a0     6963 653a 4e55 4c4c 3a31 0d0a 4e54 533a        ice:NULL:1..NTS:
      0x00b0     2073 7364 703a 616c 6976 650d 0a53 4552        .ssdp:alive..SER
      0x00c0     5645 523a 204c 696e 7578 2f32 2e36 2e33        VER:.Linux/2.6.3
      0x00d0     322e 3131 2d73 766e 3136 3630 362c 2055        2.11-svn16606,.U
      0x00e0     506e 502f 312e 302c 2050 6f72 7461 626c        PnP/1.0,.Portabl
      0x00f0     6520 5344 4b20 666f 7220 5550 6e50 2064        e.SDK.for.UPnP.d
      0x0100     6576 6963 6573 2f31 2e36 2e36 0d0a 582d        evices/1.6.6..X-
      0x0110     5573 6572 2d41 6765 6e74 3a20 7265 6473        User-Agent:.reds
      0x0120     6f6e 6963 0d0a 5553 4e3a 2075 7569 643a        onic..USN:.uuid:
      0x0130     3733 3635 3637 3631 2d37 3436 352d 3733        73656761-7465-73
      0x0140     3735 2d36 3336 622d 3030 3930 6139 6636        75-636b-0090a9f6
      0x0150     3263 3239 3a3a 7572 6e3a 7363 6865 6d61        2c29::urn:schema
      0x0160     732d 6d69 6372 6f73 6f66 742d 636f 6d3a        s-microsoft-com:
      0x0170     7365 7276 6963 653a 4e55 4c4c 3a31 0d0a        service:NULL:1..
      0x0180     0d0a                                           ..
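
For anyone who wants to experiment with the octet alternation away from the
command line, here is a tidied sketch. It is not the expression that produced
the output above: it accepts any 0-255 octet rather than the restricted
ranges, and it puts every `\.` outside the octet group (in the expression as
posted, some alternatives such as `25[0-5]` in the middle group carry no
`\.`). It uses Python's re module on bytes rather than PCRE, sets DOTALL so
that `.` crosses the \x0d\x0a, and the payload is a hand-typed copy of the
first four header lines from the dump.

```python
import re

# One octet, 0-255, with the longer alternatives listed first.
OCTET = rb"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"

# Four octets joined by literal dots. The lookarounds keep a match from
# starting or ending in the middle of a longer run of digits and dots.
IPV4 = rb"(?<![0-9.])" + OCTET + rb"(?:\." + OCTET + rb"){3}(?![0-9.])"

# DOTALL lets ".*?" cross the CR LF between the two headers.
pattern = re.compile(rb"CACHE-CONTROL.*?(" + IPV4 + rb")", re.DOTALL)

# First four header lines of the SSDP NOTIFY shown in the dump above.
payload = (
    b"NOTIFY * HTTP/1.1\r\n"
    b"HOST: 239.255.255.250:1900\r\n"
    b"CACHE-CONTROL: max-age=100\r\n"
    b"LOCATION: http://169.254.6.66:49152/nasdevice.xml\r\n"
)

match = pattern.search(payload)
print(match.group(1))
```

Keeping the dot outside the octet group means each alternative only has to
describe a number, which makes the ranges easier to check by eye.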

And now the bit that still needs some work. The PCRE POSIX wrapper uses
strlen() to find the end of the subject, so NUL bytes in the data stream are
not a good thing. I've looked at the wrapper code and at the code for
argus_grep.c, and given a choice, replacing the POSIX wrapper calls in
argus_grep.c with native PCRE calls seems to be a better idea than changing
the wrapper. This does present a problem for people who don't have PCRE and
who do not wish to install it. I don't think it makes sense to put both
alternatives in a single file with preprocessor conditionals. Does it make
sense to leave argus_grep.c as it is (including the enhanced option for the
Macintosh users) and create argus_pcre.c in parallel, with a build option to
select one or the other? That would make all of the PCRE options available.
The alternative is being able to search up to the first NUL and nothing more:

ra -r anonargus.2012.10.12.17.00.01.0.gz -M printer="hex" -s + suser:100 -e '\xe4\x05\x40'
                     StartTime      Flgs          Proto       TcpOpt SrcAddr                Sport   Dir            DstAddr                Dport TotPkts   TotBytes State                State
Fri 2012-10-12 17:59:06.466956  e                   udp 1.0.2.12.netbios-ns              ->           1.0.2.11.netbios-ns 3        330   REQ
      0x0000     e405 4000 0001 0000 0000 0001 2046 4845        ..@..........FHE
      0x0010     5046 4345 4c45 4846 4345 5046 4646 4143        PFCELEHFCEPFFFAC
      0x0020     4143 4143 4143 4143 4143 4141 4100 0020        ACACACACACAAA...
      0x0030     0001 c00c 0020 0001 0004 93e0 0006 e000        ................
      0x0040     0a01 011f e405 4000 0001 0000 0000 0001        ......@.........
      0x0050     2046 4845 5046 4345 4c45 4846 4345 5046        .FHEPFCELEHFCEPF
      0x0060     4646 4143                                      FFAC

[root@rodnel-new anondata]# ra -r anonargus.2012.10.12.17.00.01.0.gz -M printer="hex" -s + suser:100 -e '\xe4\x05\x40\x00'
Nothing found.
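
The two runs above are consistent with a matcher that only sees the subject
up to its first NUL. The effect can be imitated in a few lines. This is a
sketch only: Python's re module stands in for PCRE, c_string_view() is a
hypothetical helper that mimics what a strlen()-based caller would see, and
the payload is the first row of the dump above.

```python
import re

# First 16 bytes of the NetBIOS payload from the dump above.
payload = bytes.fromhex("e405 4000 0001 0000 0000 0001 2046 4845")


def c_string_view(buf: bytes) -> bytes:
    """Return the bytes before the first NUL, as strlen() would delimit them."""
    return buf.split(b"\x00", 1)[0]


without_nul = re.compile(rb"\xe4\x05\x40")
with_nul = re.compile(rb"\xe4\x05\x40\x00")

# Truncated view: only the expression that stops before the NUL can match.
seen = c_string_view(payload)
print(len(seen), bool(without_nul.search(seen)), bool(with_nul.search(seen)))

# Length-aware view: the whole buffer is searched, NUL bytes included.
print(len(payload), bool(with_nul.search(payload)))
```

The native pcre_exec() call takes the subject length as an explicit argument
instead of relying on a terminator, which is why calling it directly should
sidestep the truncation.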

Thoughts anyone?

--Dave

Apologies to everyone seeing this line-wrapped in a million ways; I really
hate to send HTML or even RTF email to a list. Ping me for a clean copy if
you want one.
