Feature request: grep hex strings with -e
Dave Edelman
dedelman at iname.com
Fri Oct 12 17:26:03 EDT 2012
The PCRE library is available and it comes with several wrappers, one of
which is a drop-in replacement for the POSIX regex functions.
Off-list, I exchanged emails with Carter and, as an experiment, I tried it.
I had very little trouble getting this to work with the latest set of clients.
A few things took a bit of adjustment and one thing requires more work, but
examples show it best.
I am using ra(1) from argus-clients-3.0.7.3, but all the clients that support
-e use the same code. The version of PCRE is 8.31. I have this running on
FC14 and FC17; I will try it on the Mac over the weekend.
The simple case was a test that nothing got broken. Finding a plain ASCII
string was no problem:
ra -M printer="hex" -r argus.out -s suser:60 -e 'DevMgmt'
0x0000 4745 5420 2f44 6576 4d67 6d74 2f50 726f
GET./DevMgmt/Pro
0x0010 6475 6374 5374 6174 7573 4479 6e2e 786d
ductStatusDyn.xm
0x0020 6c20 4854 5450 2f31 2e31 0d0a 486f 7374
l.HTTP/1.1..Host
0x0030 3a20 6c6f 6361 6c68 6f73 740d :.localhost.
Next was checking how well it handled escaped hex values. Again, no
problem:
ra -M printer="hex" -r argus.out -s suser:60 -e 'DevMgmt.*\x78\x6d'
0x0000 4745 5420 2f44 6576 4d67 6d74 2f50 726f
GET./DevMgmt/Pro
0x0010 6475 6374 5374 6174 7573 4479 6e2e 786d
ductStatusDyn.xm
0x0020 6c20 4854 5450 2f31 2e31 0d0a 486f 7374
l.HTTP/1.1..Host
0x0030 3a20 6c6f 6361 6c68 6f73 740d :.localhost.
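The two searches above can be reproduced outside ra as a sanity check. What
follows is only a minimal sketch using Python's re module on a bytes object
rebuilt from the hex dump. Python's engine is not PCRE and this is not how ra
does it; the variable names are made up for illustration.

```python
import re

# First 60 bytes of user data, rebuilt from the hex dump above.
payload = bytes.fromhex(
    "4745 5420 2f44 6576 4d67 6d74 2f50 726f"
    "6475 6374 5374 6174 7573 4479 6e2e 786d"
    "6c20 4854 5450 2f31 2e31 0d0a 486f 7374"
    "3a20 6c6f 6361 6c68 6f73 740d"
)

# Plain ASCII search, the equivalent of -e 'DevMgmt'.
plain = re.search(rb"DevMgmt", payload)

# Escaped hex search: \x78\x6d are simply the bytes for "xm".
escaped = re.search(rb"DevMgmt.*\x78\x6d", payload)

print(plain is not None)
print(escaped.group())
```

The escaped form and the literal form hit the same bytes, which is why the
second ra run prints the same record as the first.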
Getting a bit more ambitious, I attempted to cross the 7-bit to 8-bit value
line, and that wasn't so good.
This is in the file, and it includes hex values greater than 7F:
0x0000 230f 06fb 0000 0000 0000 0000 0000 0000
#...............
0x0010 0000 0000 0000 0000 0000 0000 0000 0000
................
0x0020 0000 0000 0000 0000 d422 b1ce 9a1d a800
........."......
ra -M printer="hex" -r argus.out -s suser:60 -e '\xd4\x22\xb1\xce\x9a\x1d\xa8\x00'
The search fails.
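For anyone wondering why values above 7F are a special case at all, here is a
small sketch of the byte versus character distinction, again in Python rather
than PCRE. It does not model PCRE's build options; it only shows that an
escape like \xd4 means one thing as a raw byte and another as a UTF-8 encoded
code point.

```python
import re

# 48 bytes of user data rebuilt from the hex dump above.
payload = bytes.fromhex(
    "230f 06fb 0000 0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 0000 0000 0000 0000"
    "0000 0000 0000 0000 d422 b1ce 9a1d a800"
)

# Byte-oriented pattern: each \xNN escape stands for exactly one byte.
byte_hit = re.search(rb"\xd4\x22\xb1\xce", payload)

# The same value read as a Unicode code point (U+00D4) becomes two bytes
# once it is encoded as UTF-8, so it is a different byte sequence.
as_utf8 = "\xd4".encode("utf-8")

print(byte_hit.start())        # offset of the match within the payload
print(as_utf8.hex())           # UTF-8 encoding of U+00D4
print(as_utf8 in payload)      # is that encoding present in the data?
```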
A bit of RTFM and I learned that when you build PCRE, you must pass the
--enable-utf option to ./configure. By this time my test file was archived,
so I had to look for something else.
ra -M printer="hex" -r argus.out -s suser:1000 -e '\xff.*\x98'
0x0000 b1f0 9bdb 4ab3 0c83 5980 8531 0271 c099
....J...Y..1.q..
0x0010 109c ff33 3bb8 db57 dad4 ff77 541d 9eec
...3;..W...wT...
0x0020 9e14 9027 263d c6aa 4c73 95cb 670b 29f0
...'&=..Ls..g.).
0x0030 066e e3c5 d3b5 5083 a945 317c e026 cb7d
.n....P..E1|.&.}
0x0040 d88f bbfb 75fb e1b9 2ef3 4ab9 01a0 b4bc
....u.....J.....
0x0050 7063 0f20 4476 0c9f b231 08b6 9d74 aecb
pc..Dv...1...t..
0x0060 2d09 69ed b4b5 6956 6e65 e9d7 5e3c 764f
-.i...iVne..^<vO
0x0070 32fd 6961 0ecb 1e2c 16ad 38f9 4a95 86f9
2.ia...,..8.J...
0x0080 8815 d697 d19f 3a49 df52 829c 98c5 adb9
......:I.R......
0x0090 0278 5bf0 8859 5de7 322e 18dc 7167 df9b
.x[..Y].2...qg..
0x00a0 4fd0 308f f52c 8c22 68fe 748f 0f63 d215
O.0..,."h.t..c..
0x00b0 9681 664b 4ecc bef6 f18e cc15 8dce 4caf
..fKN.........L.
0x00c0 e20f 8d07 7c67 31db 2759 f360 596e 1194
....|g1.'Y.`Yn..
0x00d0 1e2c 6909 b003 731e 6909 eedb 3bd6 92fe
.,i...s.i...;...
0x00e0 8a18 6cd8 800c 8643 25af 2655 6c4f f0c9
..l....C%.&UlO..
0x00f0 ba76 b638 601e 2787 17a6 69d5 a773 5894
.v.8`.'...i..sX.
0x0100 302a 4085 a1d8 4153 31e0 358a 0b9a 2f65
0*@...AS1.5.../e
0x0110 933b 4bad ae63 d0a1 e7d6 70bf 2df8 15c2
.;K..c....p.-...
0x0120 fcc5 fa4b 4ed3 5d56 aff0 424f 6e71 5901
...KN.]V..BOnqY.
0x0130 8368 e4f7 1b3f f7db 31d3 8c0e ccdd d734
.h...?..1......4
0x0140 893b 2c0b 0b14 356b 315e e9bb d538 d013
.;,...5k1^...8..
0x0150 8f8c d328 65ec 488a f516 abe3 9e2a 82c2
...(e.H......*..
0x0160 9460 e36f 1177 3609 e6bd f241 1a4b 3571
.`.o.w6....A.K5q
0x0170 f7eb 1fc0 f938 28ac ac3e 56de b576 971e
.....8(..>V..v..
0x0180 46a0 d101 c052 b98a 1fc2 0c0b 56af 71b1
F....R......V.q.
0x0190 e8ac 1bc0 f6f5 42b0 a988 ce50 ff76 13b8
......B....P.v..
0x01a0 e2cf 436e e674 651b a587 b22f ae4d 0689
..Cn.te..../.M..
0x01b0 0788 da56 960e 38a7 bd3c 0d24 63ef 7c61
...V..8..<.$c.|a
Now that it works, I wanted to see something a bit more complex. The regex
in this case is a very accurate IPv4 address recognizer for a specific class
of addresses. In this case I was also testing multi-line searches. The
CACHE-CONTROL request header is separated from the IP address in the
LOCATION header by a \x0d\x0a indicating a new line. The regexp uses
alternation, grouping without capture, repeat counts, escaped characters and
more than a bit of stamina :)
ra -r anonargus.2012.10.12.17.00.01.0.gz -M printer="hex" -s + suser:1000\
-e
'CACHE.*(?:(?:1[0-9][0-9]\.)|(?:[1-9][0-9]\.)|(?:[1-9]\.)|(?:20[0-9]|21[0-9]
|22[0-3]\.))(?:(?:1[0-9][0-9]\.)|(?:[1-9][0-9]\.)|(?:[0-9]\.)|(?:25[0-5])|(?
:2[0-4][0-9]\.)){2}
(?:(?:1[0-9][0-9])|(?:[1-9][0-9])|(?:[1-9])|(?:25[0-4]|2[0-4][0-9]))'
Fri 2012-10-12 17:59:36.615097 e udp
1.0.2.13.23403 -> 224.0.2.1.39787
1 428 INT
0x0000 4e4f 5449 4659 202a 2048 5454 502f 312e
NOTIFY.*.HTTP/1.
0x0010 310d 0a48 4f53 543a 2032 3339 2e32 3535
1..HOST:.239.255
0x0020 2e32 3535 2e32 3530 3a31 3930 300d 0a43
.255.250:1900..C
0x0030 4143 4845 2d43 4f4e 5452 4f4c 3a20 6d61
ACHE-CONTROL:.ma
0x0040 782d 6167 653d 3130 300d 0a4c 4f43 4154
x-age=100..LOCAT
0x0050 494f 4e3a 2068 7474 703a 2f2f 3136 392e
ION:.http://169.
0x0060 3235 342e 362e 3636 3a34 3931 3532 2f6e
254.6.66:49152/n
0x0070 6173 6465 7669 6365 2e78 6d6c 0d0a 4e54
asdevice.xml..NT
0x0080 3a20 7572 6e3a 7363 6865 6d61 732d 6d69
:.urn:schemas-mi
0x0090 6372 6f73 6f66 742d 636f 6d3a 7365 7276
crosoft-com:serv
0x00a0 6963 653a 4e55 4c4c 3a31 0d0a 4e54 533a
ice:NULL:1..NTS:
0x00b0 2073 7364 703a 616c 6976 650d 0a53 4552
.ssdp:alive..SER
0x00c0 5645 523a 204c 696e 7578 2f32 2e36 2e33
VER:.Linux/2.6.3
0x00d0 322e 3131 2d73 766e 3136 3630 362c 2055
2.11-svn16606,.U
0x00e0 506e 502f 312e 302c 2050 6f72 7461 626c
PnP/1.0,.Portabl
0x00f0 6520 5344 4b20 666f 7220 5550 6e50 2064
e.SDK.for.UPnP.d
0x0100 6576 6963 6573 2f31 2e36 2e36 0d0a 582d
evices/1.6.6..X-
0x0110 5573 6572 2d41 6765 6e74 3a20 7265 6473
User-Agent:.reds
0x0120 6f6e 6963 0d0a 5553 4e3a 2075 7569 643a
onic..USN:.uuid:
0x0130 3733 3635 3637 3631 2d37 3436 352d 3733
73656761-7465-73
0x0140 3735 2d36 3336 622d 3030 3930 6139 6636
75-636b-0090a9f6
0x0150 3263 3239 3a3a 7572 6e3a 7363 6865 6d61
2c29::urn:schema
0x0160 732d 6d69 6372 6f73 6f66 742d 636f 6d3a
s-microsoft-com:
0x0170 7365 7276 6963 653a 4e55 4c4c 3a31 0d0a
service:NULL:1..
0x0180 0d0a ..
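The same idea, cut down to something readable: a sketch in Python of a
dotted-quad match that has to cross the \x0d\x0a between two headers. The
OCTET expression is a generic 0-255 recognizer, not the class-specific one
used above, and re.DOTALL is my assumption for how "." gets across the line
break in this engine. PCRE and its POSIX wrapper have their own options for
that.

```python
import re

# Two header lines taken from the record above, CRLF separated.
payload = (
    b"CACHE-CONTROL: max-age=100\r\n"
    b"LOCATION: http://169.254.6.66:49152/nasdevice.xml\r\n"
)

# One octet, 0-255, using alternation and grouping without capture.
OCTET = rb"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"

# "CACHE", anything, then three "octet." groups and a final octet.
# The single capturing group is only there to pull the address out.
pattern = rb"CACHE.*?((?:" + OCTET + rb"\.){3}" + OCTET + rb")"

# With DOTALL, "." also matches the newline, so the search can span lines.
multi = re.search(pattern, payload, re.DOTALL)

# Without it, ".*?" stops at the \n and never reaches the LOCATION header.
single = re.search(pattern, payload)

print(multi.group(1))
print(single)
```

Note that "100" in max-age=100 is not mistaken for an address, because no dot
follows it.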
And now the bit that still needs some work. The PCRE POSIX wrapper uses
strlen() to find the end of the data it searches, so NUL bytes in the data
stream cut the search short. I've looked at the wrapper code and at the code
for argus_grep.c and, given a choice, replacing the POSIX wrapper calls in
argus_grep.c with native PCRE calls seems to be a better idea than changing
the wrapper. This does present a problem for people who don't have PCRE and
who do not wish to install it. I don't think it makes sense to create both
alternatives in a single file with preprocessor conditionals. Does it make
sense to leave argus_grep.c (including the enhanced option for the Macintosh
users) and create argus_pcre.c in parallel, providing a build option to select
one or the other? That would make all of the PCRE options available. The
alternative is being able to search up to the first NUL and nothing more:
ra -r anonargus.2012.10.12.17.00.01.0.gz -M printer="hex" -s + suser:100 -e
'\xe4\x05\x40'
StartTime Flgs Proto TcpOpt
SrcAddr Sport Dir DstAddr Dport
TotPkts TotBytes
State State
Fri 2012-10-12 17:59:06.466956 e udp
1.0.2.12.netbios-ns -> 1.0.2.11.netbios-ns
3 330
REQ
0x0000 e405 4000 0001 0000 0000 0001 2046 4845
..@..........FHE
0x0010 5046 4345 4c45 4846 4345 5046 4646 4143
PFCELEHFCEPFFFAC
0x0020 4143 4143 4143 4143 4143 4141 4100 0020
ACACACACACAAA...
0x0030 0001 c00c 0020 0001 0004 93e0 0006 e000
................
0x0040 0a01 011f e405 4000 0001 0000 0000 0001
......@.........
0x0050 2046 4845 5046 4345 4c45 4846 4345 5046
.FHEPFCELEHFCEPF
0x0060 4646 4143
FFAC
[root@rodnel-new anondata]# ra -r anonargus.2012.10.12.17.00.01.0.gz -M
printer="hex" -s + suser:100 -e '\xe4\x05\x40\x00'
Nothing found.
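The effect is easy to simulate. The sketch below is Python, not the argus or
PCRE code; c_string_view() is a made-up helper that mimics what a
strlen()-based caller sees, and the payload is the first 16 bytes of the
record above.

```python
import re

def c_string_view(buf: bytes) -> bytes:
    """Hypothetical helper: the bytes a strlen()-based caller would see,
    that is, everything before the first NUL."""
    return buf.split(b"\x00", 1)[0]

# First 16 bytes of user data from the hex dump above.
payload = bytes.fromhex("e405 4000 0001 0000 0000 0001 2046 4845")

short_pat = rb"\xe4\x05\x40"        # stops just before the first NUL
long_pat = rb"\xe4\x05\x40\x00"     # needs the NUL itself

truncated = c_string_view(payload)

print(len(truncated))                                  # bytes visible via strlen()
print(re.search(short_pat, truncated) is not None)     # found
print(re.search(long_pat, truncated) is not None)      # not found
print(re.search(long_pat, payload) is not None)        # found with the full length
```

Handing the matcher an explicit length instead of relying on a terminator is
what calling PCRE directly would allow.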
Thoughts anyone?
--Dave
Apologies to everyone seeing this line-wrapped in a million ways; I really
hate to send HTML or even RTF email to a list. Ping me for a clean copy if
you want one.