A couple troubleshooting questions...
Carter Bullard
carter at qosient.com
Wed Jul 23 18:22:38 EDT 2014
Hey Craig,
Here are a few suggestions on what to look for. If you do find something
please send your observations to the list.
A couple of things first. Are you basing these observations on primitive
argus data (data straight from argus) or on processed data (aggregated
argus flows)?
If these observations are coming from primitive Argus data:
the ?’s and ‘g’aps can be indications that your Argus is either not
getting all the packets from the wire, or that there is asymmetric routing,
such that not all the packets come down the wire/interface that you’re
monitoring.
Argus management records have the argus packet drop rate in them. If argus
isn’t getting all the packets, and the libpcap interface is dropping packets
then the ‘man’ record will show this. When you print the man records using
xml, it will show the number of dropped packets during the reporting interval.
ra -S argus.source -M xml - man
ra -r repository.file(s) -M xml - man
If the ‘PktsDropped’ number is greater than 0, then argus is having problems
keeping up with the captured load, and the packet loss is between the libpcap
interface and argus reading packets from the interface. This is the only place
where we can directly report on packet capture infrastructure loss. If the
packets are lost in the switch that is port mirroring packets, or if they are
dropped by the sensor’s capture ethernet interface, there isn’t any way that we
can “know” that they were dropped.
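If you want to check a lot of management records at once, something like the
following Python sketch could total the drop counts from the xml output of the
ra commands above. It assumes the counter shows up as an XML attribute written
like PktsDropped="12" (optionally with spaces around the "="); that layout is
an assumption here, so adjust the pattern to whatever your ra actually prints.
The function names are only illustrative.

```python
import re

# Assumption: each management record carries the counter as an XML attribute
# named PktsDropped, e.g. PktsDropped="12" or PktsDropped = "12".
PKTS_DROPPED = re.compile(r'PktsDropped\s*=\s*"\s*(\d+)\s*"')


def dropped_counts(xml_text):
    """Return every PktsDropped value found in the text, in order."""
    return [int(m.group(1)) for m in PKTS_DROPPED.finditer(xml_text)]


def summarize(xml_text):
    """Summarize how many records report drops and the total dropped."""
    counts = dropped_counts(xml_text)
    return {
        "records": len(counts),
        "records_with_drops": sum(1 for c in counts if c > 0),
        "total_dropped": sum(counts),
    }
```

Save the xml output to a file, read it into a string, and pass it to
summarize(). A non-zero "records_with_drops" points at loss between libpcap
and argus. A zero says nothing about loss upstream of the sensor.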
The ‘g’ap tracking is our way of indicating that we are seeing gaps, which means
we didn’t see all the packets for this flow. You can print the size of the gaps
for the TCP records with “-s +sgap +dgap” in order to understand how much we
missed, which can help in your understanding of the problem.
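To get a feel for how much is missing overall, a rough Python sketch like this
could total the gap columns. It assumes you have printed the records as
comma-separated text with a header line, and that the gap columns are labeled
SrcGap and DstGap. Both the CSV layout and those labels are assumptions, so
pass your own column names if your output differs.

```python
import csv
import io


def _to_int(value):
    """Treat a missing or blank field as 0."""
    text = (value or "").strip()
    return int(text) if text else 0


def gap_summary(csv_text, src_col="SrcGap", dst_col="DstGap"):
    """Count flows with gaps and total the gap sizes in each direction."""
    flows = flows_with_gaps = src_total = dst_total = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        sgap = _to_int(row.get(src_col))
        dgap = _to_int(row.get(dst_col))
        flows += 1
        if sgap > 0 or dgap > 0:
            flows_with_gaps += 1
        src_total += sgap
        dst_total += dgap
    return {
        "flows": flows,
        "flows_with_gaps": flows_with_gaps,
        "src_gap_total": src_total,
        "dst_gap_total": dst_total,
    }
```

Comparing "flows_with_gaps" against "flows" per port or per host can show
whether the loss is concentrated in one kind of traffic.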
Because some TCP flow idle times do exceed the Argus default TCP idle time,
there will be TCP status flow records that have the ‘?’ in them. To understand
if this is the case, you have to look earlier in your archive to see if you
saw this flow before. If so, there is your answer. If not, we’re back to
thinking that we aren’t seeing all the packets.
All of that can help in figuring out how bad the issue is and where it might be.
Packet loss in the collection infrastructure is expected above 1G. If you are
port mirroring, it can be expected at any speed, depending on how the mirroring
is being implemented.
If these observations are coming from aggregated Argus data:
the 0 TCP port number and ‘g’aps can be expected when using non-default
aggregation rules. If so, we have a somewhat long conversation ahead, but
it is important to email about it, again if this is the case.
Carter
On Jul 23, 2014, at 5:41 PM, Craig Merchant <craig.merchant at oracle.com> wrote:
> I’ve been trying to troubleshoot why Argus is having a tough time determining the direction of flows (approximately 40% of flows). We also seem to be seeing a fairly high number of flows with gaps (approximately 15%). Although oddly enough, only about 20% of flows with questionable direction have gaps in them.
>
> What I am seeing is that the overwhelming majority of traffic with gaps in the sequence numbers have either TCP 0 or TCP 25 as the source port or TCP 25 as the destination. After doing a little reading (http://www.lovemytool.com/blog/2013/08/the-strange-history-of-port-0-by-jim-macleod.html), TCP 0 doesn’t seem to mean that the source port was defined as 0, but that it means a Layer 4 header wasn’t included in the packet. This article implies that packet fragmentation is often a cause of this, but I’m not seeing TCP flags indicating any kind of fragmentation.
>
> What does a packet with TCP 0 as a source port mean in Argus?
>
> Is there anything special about SMTP that might generate a higher volume of gaps than other types of traffic? We’re an ESP, so we send and receive a ton of email on behalf of our customers. But I’m also not seeing gaps in other types of traffic (like HTTPS) between us and the Internet.
>
> Thanks.
>
> Craig