Problem with byte-swapped IP addresses

Mon Mar 8 21:57:02 EST 2010

On Mon, Mar 08, 2010 at 04:03:39PM +0100, Martijn van Oosterhout wrote:
> On Mon, Mar 8, 2010 at 4:29 AM, Peter Van Epp <vanepp at sfu.ca> wrote:
> >        I suspect we will find this is high load related in which case turning
> > off pf-ring likely causes enough packet loss that the load doesn't tickle
> > the bug :-). Have you given pf-ring lots of memory for buffers? As I recall
> > (its been a number of years since I last played with pf-ring) we had to boost
> > the kernel VM space limit to be able to get a couple of megs of buffer space
> > for pf-ring although that may have been a 2.4 series kernel rather than 2.6.
> 
> That's an idea. Unfortunately I don't see a simple way to determine if
> argus is dropping many packets or not. We've configured argus to have
> a ring-buffer of 16384 packets. You can make it bigger, but if argus
> isn't keeping up then it doesn't really matter how big you make it.
> argus isn't using 100% CPU, but maybe that's a lie.

	Check syslog for complaints from argus. If it is having trouble keeping
up (at least in the output task) it will complain about queue sizes to syslog.

> 
> In any case, I have a comparison between tcpdump and argus. Here is
> the output that argus dumped:
> 
> 14:22:43.610709 IP 192.168.20.2.139 > 192.168.7.168.1614: .
> 2854318090:2854319550(1460) ack 1379121316 win 64938
> 14:22:43.610744 IP 192.168.15.23.1126 > 192.168.20.121.139: P
> 2087784273:2087784391(118) ack 2513133511 win 65391
> 14:22:43.610762 IP 192.168.7.168.1614 > 192.168.20.2.139: . ack
> 4294954156 win 17520
> 14:22:43.610790 IP 192.168.20.2.139 > 192.168.7.168.1614: .
> 1460:2920(1460) ack 1 win 64938
> 14:22:43.702911 00:50:56:93:2a:15 Null > 00:0f:1f:66:11:bc Unknown
> DSAP 0x44 Information, send seq 85, rcv seq 0, Flags [Command], length
> 170
> 14:22:43.702936 IP truncated-ip - 54825 bytes missing! 43.20.168.192 >
> 46.20.168.192: tcp
> 14:22:43.702959 IP 192.168.20.43.110 > 192.168.20.120.4720: .
> 1721333935:1721335395(1460) ack 3142743201 win 64675
> 14:22:43.702980 IP 192.168.20.43.110 > 192.168.20.120.4720: .
> 1460:2920(1460) ack 1 win 64675
> 14:22:43.703000 IP 192.168.20.43.110 > 192.168.20.120.4720: .
> 2920:4380(1460) ack 1 win 64675

	This suggests to me that pf-ring over ran the circular buffer with
bad results. It looks to perhaps have gone back to the start (or screwed up
its pointers) and lost a buffer full of data, probably because argus wasn't
getting it out quick enough. What snap length are you using? Are you collecting
user data or just header info? If you are collecting user data backing down
to just header info (128 bytes or so) may help by reducing the load.

> 
> Note the jump in timestamps at the same moment. This is what tcpdump saw:
> 
> 14:22:43.610709 IP 192.168.20.2.139 > 192.168.7.168.1614: .
> 2854318090:2854319550(1460) ack 1379121316 win 64938
> 14:22:43.610744 IP 192.168.15.23.1126 > 192.168.20.121.139: P
> 2087784273:2087784391(118) ack 2513133511 win 65391
> 14:22:43.610762 IP 192.168.7.168.1614 > 192.168.20.2.139: . ack
> 4294954156 win 17520
> 14:22:43.610790 IP 192.168.20.2.139 > 192.168.7.168.1614: .
> 1460:2920(1460) ack 1 win 64938
> 
> ... 3846 other packets ...
> 
> 14:22:43.702911 IP 192.168.20.120.1287 > 192.168.20.95.139: P
> 2739085214:2739085344(130) ack 4097980446 win 63368
> 14:22:43.702936 IP 192.168.20.43.110 > 192.168.20.120.4720: .
> 1721332475:1721333935(1460) ack 3142743201 win 64675
> 14:22:43.702959 IP 192.168.20.43.110 > 192.168.20.120.4720: .
> 1460:2920(1460) ack 1 win 64675
> 14:22:43.702980 IP 192.168.20.43.110 > 192.168.20.120.4720: .
> 2920:4380(1460) ack 1 win 64675
> 14:22:43.703000 IP 192.168.20.43.110 > 192.168.20.120.4720: .
> 4380:5840(1460) ack 1 win 64675
> 
> So basically, at the moment the byte-swapping happens, there's also a
> huge amount of data lost (92ms worth).
> 
> The first corrupted packet:
> 
> 14:22:43.702911 00:50:56:93:2a:15 Null > 00:0f:1f:66:11:bc Unknown
> DSAP 0x44 Information, send seq 85, rcv seq 0, Flags [Command], length
> 170
>         0x0000:  000f 1f66 11bc 0050 5693 2a15 0008 4500  ...f...PV.*...E.
>         0x0010:  aa00 57d0 0040 8006 7fce 7814 a8c0 5f14  ..W.. at ....x..._.
>         0x0020:  a8c0 0507 008b a343 1f9e f442 381e 5018  .......C...B8.P.
>         0x0030:  f788 6006 0000 0000 007e                 ..`......~
> 
> Has byte-swappped: ethernet type, IP total length, fragment offset, src&dst IP,
> 
> The second corrupted packet:
> 
> 14:22:43.702936 IP truncated-ip - 54825 bytes missing! 43.20.168.192 >
> 46.20.168.192: tcp
>         0x0000:  0050 5693 2a15 001a a02c b60d 0800 4500  .PV.*....,....E.
>         0x0010:  dc05 372d 0040 8006 1df1 2b14 a8c0 2e14  ..7-. at ....+.....
>         0x0020:  a8c0 006e 1270 6699 72fb bb52 74a1 5010  ...n.pf.r..Rt.P.
>         0x0030:  fca3 9627 0000 7654 6c46                 ...'..vTlF
> 
> is the same except for the ethernet type, which is OK. The TCP headers
> are not swapped in either. Frankly, I'm totally confused.

	If the problem is the buffer being overwritten it may be that the 
kernel is filling the packet with new data (and thus some non swapped and
some swapped data) at the same time argus ir reading that data. The result
will be very confused as the data will be changing even as the argus is 
writing the buffer out to the file and decoding the headers (which will be
changing under it even as it tries to decode them likely).

> 
> I think there are options in pcap-ringbuffer to dump extra tracing
> info, so I'm going to check that. Finally, is it easy to disabled the
> inplace switching, just to see if that make a difference (sure,
> everything will be backwards, but if it's all consistently backwards
> we've found the culprit)?

	I suspect that won't be that easy but will leave it for Carter to 
answer definitively. It may be easier to snag a copy of the pcap buffer
and write it to the argus pcap file early (if that isn't already done) and
see if the contents change when the normal write to the file occurs.
> 
> Thanks in advance,
> -- 
> Martijn van Oosterhout <kleptog at gmail.com> http://svana.org/kleptog/

     Some more configuration questions: is the argus machine also archiving
the data to disk locally or is it writing to a socket (with no or very little
other than update local disk traffic)? At high volumes (about 30 to 50 megabits
per second on older machines, don't know about current gen) packet loss occurs
if you are writing to local disk so the two machine layout works better. 

Peter Van Epp