Duplicate packets

Wed Oct 9 12:04:06 EDT 2013

Retransmissions have a new IP-id, since they are new packets.
Po != Pr

We seem to have different views on what a dupe is. I have thought of it as
a 100% identical packet: same VLAN, same MPLS, same TTL, same IP-id,
same L2-header, etc.

The question is then what argus considers a packet. Is it the whole
ethernet frame (as I think it is), or could it be just the part above the
L2-header?
I understand that the logic can be expensive with your definition of dupe. 
:-)
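
A minimal sketch of the strict definition above (a dupe is a byte-for-byte identical frame that arrives right after the original). This is only an illustration, not how argus works: the class `ExactDupeDetector`, its `is_dupe` method and the one-microsecond `window` are hypothetical names and values, and `frame` is assumed to be the complete captured ethernet frame.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExactDupeDetector:
    """Hypothetical helper: flags a frame as a dupe when it is byte-identical
    to the previous frame and arrived within `window` seconds of it."""
    window: float = 1e-6                  # assumed threshold: one microsecond
    _last_frame: Optional[bytes] = None   # previous frame, L2 header included
    _last_ts: float = 0.0                 # capture timestamp of previous frame

    def is_dupe(self, ts: float, frame: bytes) -> bool:
        dupe = (
            self._last_frame is not None
            and frame == self._last_frame             # 100% identical bytes
            and (ts - self._last_ts) <= self.window   # back-to-back arrival
        )
        # Remember this frame for the next comparison, dupe or not.
        self._last_frame, self._last_ts = frame, ts
        return dupe
```

Only one frame of state is kept, so the cost per packet is a single comparison. The trade-off is that a dupe separated from its original by any other packet would be missed.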

1. the traffic path goes past the same observation point multiple times
      the *same* packet goes by multiple times.
      This is not the same as monitoring a one-legged router where we see
      an incoming packet, and then see it again after it was routed. The
      routed packet is a "new" packet with an updated TTL, i.e. Po != Pr.
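
The difference between the *same* packet and a routed copy could be sketched like this. It is an assumption-laden illustration: `classify_pair` is a hypothetical function, the inputs are assumed to be raw IPv4 packets with the L2 header already stripped, and a router is assumed to rewrite only the TTL and the IPv4 header checksum.

```python
def classify_pair(first: bytes, second: bytes) -> str:
    """Compare two raw IPv4 packets (L2 header stripped; an assumption)."""
    if first == second:
        return "same packet"              # Po == Pr, a dupe candidate
    if len(first) != len(second) or len(first) < 20:
        return "different"                # cannot be a routed copy

    def masked(pkt: bytes) -> bytes:
        # Zero the fields a router rewrites: TTL (byte 8) and the
        # IPv4 header checksum (bytes 10-11).
        b = bytearray(pkt)
        b[8] = 0
        b[10:12] = b"\x00\x00"
        return bytes(b)

    if masked(first) == masked(second):
        return "routed copy"              # Po != Pr, differs only in TTL/checksum
    return "different"
```

With the L2 header included, the routed copy would also differ in its ethernet addresses, so a comparison of whole frames would report the two as different without any masking.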

In my world, scenario #3 is the most common one: a faulty SPAN setup that
generates doubled traffic. This is easy to spot manually if *all* of the
traffic is duplicated, but sometimes there's a mix, where some
networks/vlans are mirrored fine (one copy in each direction) while
others are duplicated. In these cases, a spot test can easily miss the
bad SPAN configuration. That's my main reason for wanting argus to
handle dupes.
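
To illustrate the mixed case, here is a rough sketch of a per-VLAN duplicate ratio. The function `dupe_ratio_per_vlan` and its input format, an iterable of `(vlan_id, frame_bytes)` pairs, are assumptions made for the example and not an argus feature. It counts identical frames over the whole sample with no time window, so legitimately repeated frames would inflate the ratio.

```python
from collections import Counter, defaultdict


def dupe_ratio_per_vlan(frames):
    """Return {vlan_id: fraction of frames that repeat an earlier,
    byte-identical frame seen in the same VLAN}."""
    seen = defaultdict(Counter)
    for vlan, frame in frames:
        seen[vlan][frame] += 1

    ratios = {}
    for vlan, counts in seen.items():
        total = sum(counts.values())      # all frames seen in this VLAN
        repeats = total - len(counts)     # frames beyond the first copy
        ratios[vlan] = repeats / total
    return ratios


sample = [
    (10, b"a"), (10, b"b"),                          # mirrored correctly
    (20, b"a"), (20, b"a"), (20, b"b"), (20, b"b"),  # every frame doubled
]
print(dupe_ratio_per_vlan(sample))
```

A VLAN whose ratio sits near 0.5 while the others sit near 0 would point at a SPAN session that doubles only that VLAN, which is the case a spot test tends to miss.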

/Elof

On Wed, 9 Oct 2013, Carter Bullard wrote:
> There is a lot more to this than your response would indicate.
>
> There isn't anything in a packet that distinguishes a
> retransmission from the original.
>
> Same can apply to dups, but
> generally dups are different.  They have different L2 identifiers,
> or they are (or can be) in different VLANs, or they are in
> different MPLS or GRE tunnels, etc...
>
> The content of a packet retransmission is identical in every
> way to the original packet.  As long as the network treats the
> original and the retransmission in the same way (path, priority),
> the Po will be identical to Pr.  Po == Pr where Po is the
> original packet and Pr is the retransmitted packet.
>
> Retransmissions occur only because the original sender decides to
> send a packet again.  For protocols like TCP, this requires a full
> round-trip time to occur before the sender can realize that the
> packet didn't get to the far side.  So the time between the original
> packet and the retransmission must be greater than the round-trip time
> of the network connection.
>
> Dups, however, generally appear due to 3 reasons.
>
>   1. the traffic path goes past the same observation point multiple times
>        the same packet goes by multiple times.
>
>   2. the network duplicates a packet, e.g. for reliability or multicasting
>        two or more copies of the same packet exist in the network at the same time
>
>   3. the collection infrastructure generates multiple copies of a single packet
>        one packet in the network, but port mirroring generates multiple copies
>
> In some situations, it's easy to distinguish the dups, especially in case 1.
> The IP time to live field may have changed if a router is involved, or
> new source and/or destination ethernet addresses are in the header, or the
> packet is on the same wire twice, but in different services, like VLANs
> or tunnels.  Argus can discriminate these types of duplicates, through
> modification of the flow keys (5-TUPLE+L2+VLAN+MPLS).
>
> Well anyway, this is just the start of the description.  It can be much
> more complicated than this.
>
>
> Now with regard to gaps…. Gaps are where argus doesn't see all the packets
> in a flow.  This happens when there is loss in the collection system,
> packet was on the wire, but it didn't get to argus for some reason,
> OR when there is striping, or load balancing and your argus only sees
> 50%, 33%, or 25% of the packets in a flow.  TCP indicates that there
> were 10000 bytes transferred, but you only observed 5000 bytes.
>
> This is important, and we get it for free, because we're trying to
> figure out the loss rate.
>
> Carter
>
>
> On Oct 9, 2013, at 10:20 AM, elof2 at sentor.se wrote:
>
>> On Tue, 1 Oct 2013, Carter Bullard wrote:
>>
>>> Well, hmmmmmmm…. Everyone else wants to do de-duping of the packet stream.
>>> Why would you want to be different from everyone else ?????   ;O)
>>
>> I'm curious. What exactly is it that everyone wants (or doesn't want)?
>>
>>
>>
>>> The strategy is to differentiate loss, retrans and dups, and report
>>> them as independent metrics, with loss being observable loss, retrans
>>> being observable duplicates, and dups (for TCP) being retrans arriving
>>> in less than an RTT.
>>
>> I don't fully agree about the dups.
>> A dupe is, in my opinion, an *exact* copy of the original packet.
>> A retransmission is not a dupe, it is a new packet, crafted because the original supposedly got lost.
>>
>> Therefore, the logic need not be so expensive. If the very next packet is identical to the last packet and it was received within a microsecond from the last packet then it is a dupe.
>>
>> Taking the RTT into consideration seems a bit excessive for the simple task of dupe recognition. Also, an RTT is not always possible to calculate if the flow only consists of a single SYN, or is just unidirectional UDP traffic, etc., but the packets of the flow are still duplicated.
>>
>>
>>
>>
>>> We still will need to derive gaps, which are lost packets that were not
>>> retransmitted.
>>
>> Oooh, are you talking about distinguishing between external loss and internal loss?
>> When argus sees a gap in tcp sequence numbers you know there has been a drop, but not where it occurred.
>> If argus then sees a tcp retransmission for that gap, we know the drop was external, otherwise it was probably internal.
>> That kind of logic seems expensive. If it *is* really expensive, I would say don't do it. Only do the first gap detection (as you already do today) and leave the task of understanding where the drops occur to the user.
>>
>> /Elof
>>
>>
>
>