Detect packet drops
elof2 at sentor.se
Thu Jan 26 05:37:27 EST 2012
On Wed, 25 Jan 2012, Peter Van Epp wrote:
> On Wed, Jan 25, 2012 at 02:02:08PM +0100, elof2 at sentor.se wrote:
>> Any more thoughts or progress with this?
>>
>> I just realised that I can't even rely on Wireshark for an estimate
>> of dropped packets, since Wireshark's Expert Info also tags
>> out-of-order FIN packets as "ACKed lost segment".
>>
>> What I'm looking for is not a 100% accurate system to count every
>> missing packet (which is impossible to determine), but a flag on
>> each session that argus knows is missing one or more packets.
>> Just like the flag for retransmission doesn't say how many
>> retransmissions there were in a tcp flow.
>
> Checking the pcap-reported loss rate (it's in the man records, which
> you have to enable to see these days) will give you an indication. Although
> it covers only one of the several ways your sensor can be losing packets, it
> is one good indication of how your sensor is doing. There is also an
> explanation of a number of the possible (and usually invisible) loss points
> in a sensor on Carter's web site at
> http://www.qosient.com/argus/sensorPerformance.shtml.
Hi Peter.
Thanks for your input.
Ah, didn't know about the hidden pcap drop counters. I will take a look at
it.
However... Even though I can see the pcap drop count, I still think it
would be nice if argus could tag individual flows where it has detected gaps.
The tag would give us argus users a notification that not all traffic is
monitored 100%. An informative tag just like the out-of-order tag or ECN
tag.
I now realise that my suggestion of having tags like "dropped externally"
and "dropped internally" is not feasible, since there's no way to
correlate the pcap drop counter to specific flows, so ignore this.
Apart from simply being informed that the monitored traffic is incomplete, I
would also very much like to be able to determine if the drops occur
outside of the sensor, i.e. the switch drops lots of packets while the
sensor drops nothing.
With the tag above, and a pcap-drop-counter in the argus man-records it
should be easier to spot that external drops occur.
(Naturally, if you have both external drops and internal drops, it will be
hard to investigate, but that's always the case. If I'm sure I have 0
drops within my sniffing machine, then all flows tagged with gaps must be
due to drops in the external switch or tap, or to faulty DAG/DAC drivers
that don't report their own drop count, but that is a completely
different matter.)
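
As a rough illustration of the reasoning above, here is a minimal Python
sketch. It assumes two inputs: local_drops, the pcap drop count reported by
the sensor, and flows_with_gaps, the number of flows carrying the proposed
gap tag. The gap tag does not exist in argus today, and the function name and
the result strings are made up for this example.

```python
def locate_loss(local_drops: int, flows_with_gaps: int) -> str:
    """Guess where packets are being lost.

    local_drops: drop count reported by pcap on the sensor.
    flows_with_gaps: flows carrying the (hypothetical) gap tag.
    """
    if flows_with_gaps == 0:
        # No flow shows a hole; any loss is only visible in the local counter.
        return "local" if local_drops > 0 else "none"
    if local_drops == 0:
        # Gaps exist but the sensor dropped nothing itself, so the loss
        # must have happened before the traffic reached the sensor.
        return "external"
    # Both counters are non-zero: the two causes cannot be separated.
    return "ambiguous"


if __name__ == "__main__":
    print(locate_loss(local_drops=0, flows_with_gaps=12))
```

The "external" result is only as trustworthy as the local drop counter, which
is why drivers that do not report their own drops would break it.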
> Comparing the RMON traffic counts reported by the switch feeding your
> sensor against the argus counts is another way, although synchronizing the two
> counts can be exciting :-). Both of these only indicate loss of data that makes
> it as far as your sensor, of course, and aren't an indication of loss elsewhere
> in the path, but that's a start ...
Hehe, this is not possible since in many cases the SPAN port is not
managed by me. I just manage the sensor receiving the mirrored traffic,
but it is someone else who has set up the SPAN configuration.
So diffing the reported drop numbers is not feasible in practice.
> Using something like tcpreplay from a pcap file with suitable
> hardware (which can get very hard at high speed of course :-)) feeding into
> your sensor can also give you a known input traffic pattern to estimate sensor
> loss.
Now you're rather talking about detecting local sensor loss. What I'm
primarily asking for is a way to easily detect that there is external
packet loss.
Currently I'm sniffing e.g. 100 000 packets with tcpdump, making sure
nothing is dropped locally. In this case it took 3 seconds to gather
100 000 packets. I scp the pcap file to a machine running Wireshark. I open
the "Expert Info Composite" and look at "ACKed lost segment" and "Previous
segment lost".
In an environment where the traffic is mirrored correctly, these two
counters give me an estimate as to how many gaps there are in the tcp
flows in the pcap file (disregarding a couple of false positives at capture
startup).
...that is, I can see if the people feeding me mirrored traffic have
problems at their end.
This procedure is quite tiresome. It is also unreliable when the
mirrored packets are received out of order (common in
redundant/load-balanced environments), because Wireshark will then tag
packets as lost even though they exist.
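
To make the kind of per-flow gap check I have in mind concrete, here is a
minimal Python sketch. It is not how argus or Wireshark work internally, just
an illustration. It assumes the caller has already extracted the data segments
of one direction of one TCP flow as (sequence number, payload length) pairs,
and it ignores sequence number wraparound. The name find_gaps is made up for
this example. Because the segments are sorted before they are checked, packets
that merely arrive out of order are not reported as lost.

```python
from typing import Iterable, List, Tuple

Segment = Tuple[int, int]  # (sequence number, payload length in bytes)


def find_gaps(segments: Iterable[Segment]) -> List[Tuple[int, int]]:
    """Return the [start, end) byte ranges never covered by any segment.

    Only holes between the lowest and highest byte seen are found;
    loss before the first or after the last captured segment is invisible.
    """
    # Sorting by sequence number makes the arrival order irrelevant.
    spans = sorted((seq, seq + length) for seq, length in segments if length > 0)
    gaps: List[Tuple[int, int]] = []
    covered_to = None  # end of the contiguous byte range seen so far
    for start, end in spans:
        if covered_to is not None and start > covered_to:
            # Nothing captured covers the bytes from covered_to up to start.
            gaps.append((covered_to, start))
        covered_to = end if covered_to is None else max(covered_to, end)
    return gaps


if __name__ == "__main__":
    # The second segment (bytes 1100-1199) never reached the sensor.
    print(find_gaps([(1000, 100), (1200, 100)]))
```

A flow would get the gap flag whenever the returned list is non-empty once the
flow has ended, without trying to count how many packets were lost.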
/Elof