Duplicate packets

Wed Oct 9 12:54:02 EDT 2013

On Oct 9, 2013, at 12:04 PM, elof2 at sentor.se wrote:

> 
> Retransmissions have a new IP-id, since they are new packets.
> Po != Pr
> 

Hmmmmm, well, yes and no.  Many kernels set ipid to zero, except when
there are fragments.  So ipid can't be used in a general algorithm.
Now, we can use it if it's there, but what do you do when it's not?

So the problem of generic 5-tuple flow modelers is that, by definition,
you only have L3/L4 identifiers to identify network activity.
Which means that for the purposes of a flow monitor, the IP header
and Transport headers are the only thing in the packet.

Argus is/has always been different, because we've identified that you need
more information to understand what is really going on.

The WAN guys have recognized for a while that many dups are really
"flow collisions", where the 5-tuple is the same, but the context of
the packets is different.  In some cases, the flows are the same flow,
but in some cases, they are different customers, using the same IP
addresses, but in different MPLS tunnels.

Argus's 5-tuple is not simply what comes after L2.  Argus uses the uppermost
5-tuple as the key, as that is the best we can do to find the
end-to-end flow descriptor.  Argus does have, however, most of the
underlying tunnel identifiers available (the local L2 identifiers,
the next-level tunnel IDs, etc.) if you want to add them to the 5-tuple
key.
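
As a rough illustration of adding tunnel context to the key, here is a
minimal Python sketch.  It is not argus code: FlowKey, make_key, and the
packet dictionary fields (vlan_id, mpls_labels) are hypothetical names
chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FlowKey:
    # The end-to-end 5-tuple.
    src_ip: str
    dst_ip: str
    proto: int
    src_port: int
    dst_port: int
    # Optional encapsulation context (hypothetical field names).
    vlan_id: Optional[int] = None
    mpls_labels: Tuple[int, ...] = ()


def make_key(pkt: dict, use_context: bool = False) -> FlowKey:
    """Build a flow key from a parsed-packet dict (hypothetical layout)."""
    vlan_id = pkt.get("vlan_id") if use_context else None
    mpls = tuple(pkt.get("mpls_labels", ())) if use_context else ()
    return FlowKey(pkt["src_ip"], pkt["dst_ip"], pkt["proto"],
                   pkt["src_port"], pkt["dst_port"], vlan_id, mpls)
```

With use_context left off, two customers using the same addresses in
different MPLS tunnels collapse into one key; with it on, they stay
separate.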

> 
> 
> We seem to have different views on what a dupe is. I have thought of it as a 100% identical packet, same VLAN, same MPLS, same TTL, same IP-id, same L2-header, etc.
> 
> The question is then what argus considers a packet. Is it the whole ethernet frame (as I think it is), or could it be just the part above the L2-header?
> I understand that the logic can be expensive with your definition of dupe. :-)
> 
> 
> 
> 1. the traffic path goes past the same observation point multiple times
>     the *same* packet goes by multiple times.
>     This is not the same as monitoring a one-legged router where we see
>     an incoming packet, and then see it again after it was routed. The
>     routed packet is a "new" packet with an updated TTL, i.e. Po != Pr.

Well TTL is different only if a router processed the packet.  For tunneled
traffic, or switched traffic, TTL stays the same.

So, we'll have to store a lot of data per flow, and update that data on
each packet, to be able to make your identical packet test.  Pretty
expensive to test something that you shouldn't ever have to test.
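
To make the cost concrete, here is a minimal Python sketch of that kind
of identical-packet test.  It is an assumption-laden illustration, not
how argus works: ExactDupDetector, its window_s parameter, and the choice
of a SHA-1 digest over the whole frame are all hypothetical.

```python
import hashlib
from typing import Dict, Hashable, Tuple


class ExactDupDetector:
    """Flags a frame as a dup when it is byte-identical to the previous
    frame seen on the same flow and arrives within window_s seconds."""

    def __init__(self, window_s: float):
        self.window_s = window_s
        # Per-flow state: digest and timestamp of the last frame seen.
        self._last: Dict[Hashable, Tuple[bytes, float]] = {}

    def is_dup(self, flow_key: Hashable, frame: bytes, ts: float) -> bool:
        digest = hashlib.sha1(frame).digest()   # work done on every packet
        prev = self._last.get(flow_key)
        self._last[flow_key] = (digest, ts)     # state updated on every packet
        if prev is None:
            return False
        prev_digest, prev_ts = prev
        return prev_digest == digest and (ts - prev_ts) <= self.window_s
```

Every packet pays for a digest and a per-flow state update, whether or
not any duplicates are ever present.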

> 
> 
> In my world, scenario #3 is the most common one: a faulty SPAN setup that generates doubled traffic. This is easy to manually spot if *all* of the traffic is duplicated, but sometimes there's a mix, where some networks/vlans are mirrored fine (one copy in each direction) while others are duplicated. In these cases, a spot test can easily miss the bad SPAN configuration. That's my main reason for wanting argus to handle dupes.

So, let's discuss your situation, where one of the VLAN mirrors is screwed up, and
let's imagine that it's messed up in only one direction, and it's a Tivo DVR, so ipids
are not available, and the mirror device is a switch, so no TTL changes.

We need an algorithm that at least describes what is on
the wire, so the ra* clients can figure out that there is a bad VLAN
mirror.  You don't want argus to make that call (now that would be very
complicated).

The goal is to not overcount flow metrics, to generate data that
reflects what is really going on on the wire.  So we need
a real correction mechanism that doesn't skew the data.

Not sure that testing for duplicate packets within a 1 millisecond time frame is good
enough.  RTT can be shorter than 1 ms in small workgroups, so
legitimate retransmissions may trip the test.  RTT is good, in this
case, as it gives you a real value for the test.
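
A minimal Python sketch of the RTT-based test, under the assumption that
a repeated packet has already been matched to its original and that a
per-flow RTT estimate may or may not exist.  The function name and the
"unknown" outcome are hypothetical, for illustration only.

```python
from typing import Optional


def classify_repeat(delta_s: float, rtt_s: Optional[float]) -> str:
    """Classify a repeated packet by the time since the original.

    delta_s: seconds between the original and the repeat.
    rtt_s:   the flow's round-trip time estimate, or None if it could
             not be measured (e.g. a lone SYN or unidirectional UDP).
    """
    if rtt_s is None:
        return "unknown"
    # A sender needs at least one RTT to decide to retransmit, so a
    # repeat that arrives sooner than that is treated as a dup.
    return "dup" if delta_s < rtt_s else "retrans"
```

For example, with an RTT of 0.4 ms, a repeat seen 0.6 ms after the
original is classified as a retransmission, even though it falls inside
a fixed 1 ms window.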

Carter

> 
> /Elof
> 
> 
> On Wed, 9 Oct 2013, Carter Bullard wrote:
>> There is a lot more to this than your response would indicate.
>> 
>> There isn't anything in a packet that distinguishes a
>> retransmission from the original.
>> 
>> Same can apply to dups, but
>> generally dups are different.  They have different L2 identifiers,
>> or they are (or can be) in different VLANs, or they are in
>> different MPLS or GRE tunnels, etc...
>> 
>> The content of a packet retransmission is identical in every
>> way to the original packet.  As long as the network treats the
>> original and the retransmission in the same way (path, priority),
>> the Po will be identical to Pr.  Po == Pr where Po is the
>> original packet and Pr is the retransmitted packet.
>> 
>> Retransmissions occur only because the original sender decides to
>> send a packet again.  For protocols like TCP, this requires a full
>> round-trip time to occur before the sender can realize that the
>> packet didn't get to the far side.  So the time between the original
>> packet and the retransmission must be greater than the round-trip time
>> of the network connection.
>> 
>> Dups, however, generally appear due to 3 reasons.
>> 
>>  1. the traffic path goes past the same observation point multiple times
>>       the same packet goes by multiple times.
>> 
>>  2. the network duplicates a packet, for reliability or multicasting
>>       two or more copies of the same packet exist in the network at the same time
>> 
>>  3. the collection infrastructure generates multiple copies of a single packet
>>       one packet in the network, but port mirroring generates multiple copies
>> 
>> In some situations, it's easy to distinguish the dups, especially in case 1.
>> The IP time to live field may have changed if a router is involved, or
>> new source and/or destination ethernet addresses are in the header, or the
>> packet is on the same wire twice, but in different services, like VLANs
>> or tunnels.  Argus can discriminate these types of duplicates, through
>> modification of the flow keys (5-TUPLE+L2+VLAN+MPLS).
>> 
>> Well anyway, this is just the start of the description.  It can be much
>> more complicated than this.
>> 
>> 
>> Now with regard to gaps…. Gaps are where argus doesn't see all the packets
>> in a flow.  This happens when there is loss in the collection system
>> (the packet was on the wire, but it didn't get to argus for some reason),
>> OR when there is striping or load balancing, and your argus only sees
>> 50%, 33%, or 25% of the packets in a flow.  TCP indicates that there
>> were 10000 bytes transferred, but you only observed 5000 bytes.
>> 
>> This is important, and we get it for free, because we're trying to
>> figure out the loss rate.
>> 
>> Carter
>> 
>> 
>> On Oct 9, 2013, at 10:20 AM, elof2 at sentor.se wrote:
>> 
>>> On Tue, 1 Oct 2013, Carter Bullard wrote:
>>> 
>>>> Well, hmmmmmmm…. Everyone else wants to do de-duping of the packet stream.
>>>> Why would you want to be different from everyone else ?????   ;O)
>>> 
>>> I'm curious. What exactly is it that everyone wants (or doesn't want)?
>>> 
>>> 
>>> 
>>>> The strategy is to differentiate loss, retrans and dups, and report
>>>> them as independent metrics, with loss being observable loss, retrans being observable duplicates, and dups (for TCP) being retrans arriving in less than an RTT.
>>> 
>>> I don't fully agree about the dups.
>>> A dupe is, in my opinion, an *exact* copy of the original packet.
>>> A retransmission is not a dupe, it is a new packet, crafted because the original supposedly got lost.
>>> 
>>> Therefore, the logic need not be so expensive. If the very next packet is identical to the last packet and it was received within a microsecond from the last packet then it is a dupe.
>>> 
>>> Taking the RTT into consideration seems a bit excessive for the simple task of dupe recognition. Also, an RTT is not always possible to calculate if the flow only consists of a single SYN, or is just unidirectional UDP traffic, etc, but the packets of the flow are still duplicated.
>>> 
>>> 
>>> 
>>> 
>>>> We still will need to derive gaps, which are lost packets that were not
>>>> retransmitted.
>>> 
>>> Oooh, are you talking about distinguishing between external loss and internal loss?
>>> When argus sees a gap in TCP sequence numbers you know there has been a drop, but not where it occurred.
>>> If argus then sees a TCP retransmission for that gap, we know the drop was external, otherwise it was probably internal.
>>> That kind of logic seems expensive. If it *is* really expensive, I would say don't do it. Only do the first gap detection (as you already do today) and leave the task of understanding where the drops occur to the user.
>>> 
>>> /Elof
>>> 
>>> 
>> 
