Detect packet drops

elof2 at sentor.se
Thu Jan 26 05:37:27 EST 2012


On Wed, 25 Jan 2012, Peter Van Epp wrote:
> On Wed, Jan 25, 2012 at 02:02:08PM +0100, elof2 at sentor.se wrote:
>> Any more thoughts or progress with this?
>>
>> I just realised that I can't even rely on Wireshark for an estimate
>> of dropped packets, since Wireshark's Expert Info tags out-of-order
>> FIN packets as "ACKed lost segment".
>>
>> What I'm looking for is not a 100% accurate system to count every
>> missing packet (which is impossible to determine), but a flag on
>> each session that argus knows is missing one or more packets.
>> Just like the flag for retransmission doesn't say how many
>> retransmissions there were in a tcp flow.
>
> 	Checking the pcap reported loss rate (it's in the man records, which
> you have to enable to see these days) will give you an indication. It covers
> only one of the several ways your sensor can be losing packets, but it is one
> good indication of how your sensor is doing. There is an explanation of a
> number of the possible (and usually invisible) loss points in a sensor on
> Carter's web site at http://www.qosient.com/argus/sensorPerformance.shtml as
> well.


Hi Peter.
Thanks for your input.

Ah, didn't know about the hidden pcap drop counters. I will take a look at 
it.

However... Even though I can see the pcap drop count, I still think it 
would be nice if argus could tag individual flows where it has detected gaps.
The tag would give us argus users a notification that not all traffic is 
monitored 100%. An informative tag just like the out-of-order tag or ECN 
tag.
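
To make the idea of a per-flow gap tag concrete, here is a minimal sketch in Python. It is not how argus works internally; the Segment record and the flows_with_gaps function are hypothetical names made up for illustration. It only remembers the next expected TCP sequence number per flow direction and flags the flow when a segment starts beyond it, so it will also flag plain reordering as a gap.

```python
from collections import namedtuple

# Hypothetical record for one observed TCP segment (not an argus structure).
# syn and fin are 0 or 1; payload_len is the TCP payload size in bytes.
Segment = namedtuple("Segment", "src dst sport dport seq payload_len syn fin")

SEQ_MOD = 2 ** 32   # TCP sequence numbers wrap at 2**32
HALF = 2 ** 31      # anything less than this far ahead counts as "ahead"


def flows_with_gaps(segments):
    """Return the one-directional flow keys where a sequence gap was seen."""
    next_seq = {}    # flow key -> next expected sequence number
    gapped = set()
    for s in segments:
        key = (s.src, s.sport, s.dst, s.dport)
        # SYN and FIN each consume one sequence number.
        end = (s.seq + s.payload_len + s.syn + s.fin) % SEQ_MOD
        expected = next_seq.get(key)
        if expected is None:
            # First segment seen for this direction: nothing to compare with.
            next_seq[key] = end
            continue
        ahead = (s.seq - expected) % SEQ_MOD
        if 0 < ahead < HALF:
            # The segment starts after the byte we expected next.
            gapped.add(key)
        if (end - expected) % SEQ_MOD < HALF:
            # Only move forward; retransmissions of old data leave it alone.
            next_seq[key] = end
    return gapped
```

The result is a yes/no flag per flow direction, not a count of missing packets, which matches the kind of tag asked for above.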

I now realise that my suggestion of having tags like "dropped externally" 
and "dropped internally" is not feasible, since there's no way to 
correlate the pcap drop counter to specific flows, so ignore this.

Apart from simply being informed that the monitored traffic is not 100%, I 
would also very much like to be able to determine if the drops occur 
outside of the sensor, i.e. the switch drops lots of packets while the 
sensor drops nothing.
With the tag above, and a pcap-drop-counter in the argus man-records, it 
should be easier to spot that external drops occur.
Naturally, if you have both external drops and internal drops, it will be 
hard to investigate, but that's always the case. If I'm sure I have 0 
drops within my sniffing machine, then all flows tagged with gaps must be 
due to drops in the external switch or tap (or faulty DAG/DAC drivers 
that don't report their own drop count, but that is a completely 
different matter).
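
The reasoning above can be written down as a small decision function. This is only a sketch: loss_location and its two inputs (a pcap drop count and a count of gap-tagged flows) are hypothetical, and it assumes the capture driver reports its own drops honestly.

```python
def loss_location(pcap_dropped, gapped_flows):
    """Guess where packets are being lost.

    pcap_dropped: drops reported by the local capture (assumed trustworthy).
    gapped_flows: number of flows tagged as having sequence gaps.
    """
    if gapped_flows == 0:
        if pcap_dropped == 0:
            return "no loss observed"
        return "internal drops reported, no gaps tagged"
    if pcap_dropped == 0:
        # Gaps exist but the sensor dropped nothing: blame the switch or tap.
        return "external"
    # Both kinds of evidence at once: cannot tell them apart from counts alone.
    return "undetermined"
```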


> 	Comparing the RMON traffic counts reported by the switch feeding your
> sensor against the argus counts is another way, although synchronizing the two
> counts can be exciting :-). Both of these only indicate loss of data that makes
> it as far as your sensor, of course, and aren't an indication of loss elsewhere
> in the path, but that's a start ...

Hehe, this is not possible since in many cases the SPAN port is not 
managed by me. I just manage the sensor receiving the mirrored traffic, 
but it is someone else who has set up the SPAN configuration.
So diffing the reported drop numbers is not feasible in practice.

> 	Using something like tcpreplay from a pcap file with suitable
> hardware (which can get very hard at high speed of course :-)) feeding into
> your sensor can give you a known input traffic pattern to estimate sensor loss
> as well.

Now you're rather talking about detecting local sensor loss. What I'm 
primarily asking for is a way to easily detect that there is external 
packet loss.

Currently I'm sniffing e.g. 100 000 packets with tcpdump, making sure 
nothing is dropped locally. In this case it took 3 seconds to gather 100 
000 packets. I scp the pcap file to a machine running Wireshark. I open up 
the "Expert Info Composite" and look at "ACKed lost segment" and "Previous 
segment lost".
In an environment where the traffic is mirrored correctly, these two 
counters give me an estimate as to how many gaps there are in the tcp 
flows in the pcap file (disregarding a couple of false positives at capture 
startup).
...that is, I can see if the people feeding me mirrored traffic have 
problems at their end.
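
The Wireshark step of this procedure could probably be scripted with tshark instead of the GUI. The sketch below is an assumption, not something tested on the sensors discussed here: it relies on tshark being installed, on the display filter fields tcp.analysis.ack_lost_segment and tcp.analysis.lost_segment corresponding to the two Expert Info entries, and on the -Y option for display filters (older tshark releases used -R). The helper names are made up.

```python
import subprocess

# Expert Info entry -> tshark display filter assumed to match it.
FILTERS = {
    "ACKed lost segment": "tcp.analysis.ack_lost_segment",
    "Previous segment lost": "tcp.analysis.lost_segment",
}


def tshark_command(pcap_path, display_filter):
    """Build the tshark command line that prints one line per matching packet."""
    return ["tshark", "-r", pcap_path, "-Y", display_filter]


def count_matches(pcap_path, display_filter):
    """Run tshark and count the packets matching the display filter."""
    result = subprocess.run(
        tshark_command(pcap_path, display_filter),
        capture_output=True, text=True, check=True,
    )
    return sum(1 for line in result.stdout.splitlines() if line.strip())


def gap_estimate(pcap_path):
    """Return {Expert Info name: packet count} for both gap indicators."""
    return {name: count_matches(pcap_path, flt) for name, flt in FILTERS.items()}
```

Calling gap_estimate on the captured file would give the same two counters without copying the pcap to a desktop machine, with the same false positives at capture startup.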

This procedure is quite tiresome. Also, it is unreliable when the 
mirrored packets are received out of order (common in 
redundant/load-balanced environments), because Wireshark will then tag 
packets as lost even though they exist.
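
One way around the out-of-order problem, sketched here under simplifying assumptions, is to decide about gaps only after the whole flow has been seen: collect the byte ranges that were actually captured, sort them, and report only the holes that were never filled. The unfilled_holes function is hypothetical, and it takes offsets relative to the first byte seen, so sequence number wraparound is ignored.

```python
def unfilled_holes(ranges):
    """Return the byte ranges of one flow direction that were never captured.

    ranges: iterable of (start, end) byte offsets, end exclusive, in any
    arrival order. Coverage is measured from the lowest start seen, so a
    capture that begins mid-flow does not count as a hole.
    """
    holes = []
    covered_to = None
    for start, end in sorted(ranges):
        if covered_to is None:
            covered_to = end
            continue
        if start > covered_to:
            # Nothing captured between covered_to and start.
            holes.append((covered_to, start))
        covered_to = max(covered_to, end)
    return holes
```

With this approach a segment that arrives late simply fills its hole, so reordering on the mirror path no longer looks like loss. The price is that the verdict is only available at the end of the flow.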

/Elof


