[ARGUS] Massively incorrect packet/byte count on one connection
Carter Bullard
carter at qosient.com
Tue Jun 22 10:55:29 EDT 2021
Hey Gavin,
The way that it is supposed to work in argus-3.0.8.2+ is that argus uses long longs (64-bit) to track packet and byte counts (3.0.7 used ints), and when it's time to write the records out to the wire or to disk, it chooses a type that is appropriate for the values in this record (bytes for values < 256, shorts for …) … the argus record structure for the metrics has bits that indicate what size the values are, so that the receiver can blow the values back up to long longs for processing. Looks like someone isn’t doing the right thing …
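As a rough illustration of that scheme, here is a minimal Python sketch. It is not the argus record format: the width codes, the little-endian layout, and the names `encode_counter` and `decode_counter` are all made up for the example. It also shows one general way a size mix-up could yield absurd 64-bit numbers (two 32-bit fields read as a single 64-bit one), which is only an example of the failure class, not a diagnosis of this record.

```python
import struct

# Hypothetical width codes (NOT the real argus bits): (max value, code, struct format)
WIDTHS = [
    (0xFF, 0, "<B"),                # fits in 1 byte
    (0xFFFF, 1, "<H"),              # fits in 2 bytes
    (0xFFFFFFFF, 2, "<I"),          # fits in 4 bytes
    (0xFFFFFFFFFFFFFFFF, 3, "<Q"),  # needs the full 8 bytes
]
FORMATS = {code: fmt for _, code, fmt in WIDTHS}


def encode_counter(value):
    """Pick the smallest width that holds value; return (width code, packed bytes)."""
    for limit, code, fmt in WIDTHS:
        if 0 <= value <= limit:
            return code, struct.pack(fmt, value)
    raise ValueError("counter does not fit in 64 bits")


def decode_counter(code, payload):
    """Blow a packed counter back up to a full-width integer using its width code."""
    (value,) = struct.unpack(FORMATS[code], payload)
    return value


# Round trip with counters from the last (sane) record in the report.
pkts_code, pkts_raw = encode_counter(95298)
bytes_code, bytes_raw = encode_counter(1408335847)
roundtrip = (decode_counter(pkts_code, pkts_raw), decode_counter(bytes_code, bytes_raw))

# A reader that wrongly treats the two 4-byte fields as one 8-byte counter.
misread = decode_counter(3, pkts_raw + bytes_raw)
```

With the right width codes the values round-trip unchanged; with the wrong one, `misread` is far larger than either original counter.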
So double checking … all argus components in the data pipeline are the same version ???
If none of the argus components modify the record in the pipeline, the record written to disk will be the same bits that argus generated, so we may be able to figure out what is going on ...
If you can share the binary records that show the error, and a little context, I should be able to figure it out …
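To pick out which records are worth sharing, a plausibility check along these lines might help. It is a sketch only: the function name `implausible` is hypothetical, the counters and durations are copied by hand from the report quoted below, and the 20 Gbps ceiling is taken from that report as an assumed upper bound on what one interval could carry.

```python
def implausible(pkts, nbytes, seconds, link_bps=20e9):
    """Return the reasons a record's counters cannot be right (empty list if none)."""
    reasons = []
    # A packet cannot be smaller than one byte.
    if pkts > 0 and nbytes < pkts:
        reasons.append("less than one byte per packet")
    # The flow cannot carry more bits than the link could in the interval
    # (link_bps is an assumed bound, not something argus reports).
    if nbytes * 8 > link_bps * seconds:
        reasons.append("more bytes than the link can carry in the interval")
    return reasons


# 05:02:16.166290 -> 05:02:21.168418, the impossible record
bad = implausible(5376178237460198223, 4395522059571235538, 5.002128)
# 05:02:21.168654 -> 05:02:26.167308, the record that follows it
good = implausible(95298, 1408335847, 4.998654)
```

Here `bad` lists both reasons and `good` is empty, so only the one record is flagged.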
Carter
> On Jun 22, 2021, at 10:37 AM, Gavin Atkinson <gavin.atkinson at gmail.com> wrote:
>
> Hi,
>
> We have argus 3.0.8.2 running on a 20Gbps interface for traffic stats. Recently it logged this impossible entry:
>
> StartTime LastTime Flgs Proto SrcAddr Sport Dir DstAddr Dport TotPkts TotBytes State
> 05:02:01.159856 05:02:06.160526 * * t tcp NN.NN.NN.75.60319 -> NN.NN.NN.162.micro* 471763 960793012 CON
> 05:02:06.160536 05:02:11.159955 * * t tcp NN.NN.NN.75.60319 -> NN.NN.NN.162.micro* 426692 980154740 CON
> 05:02:11.165013 05:02:16.164662 * * t tcp NN.NN.NN.75.60319 -> NN.NN.NN.162.micro* 247629 3102131695 CON
> 05:02:16.166290 05:02:21.168418 * * t tcp NN.NN.NN.75.60319 -> NN.NN.NN.162.micro* 5376178237460198223 4395522059571235538 CON
> 05:02:21.168654 05:02:26.167308 * * t tcp NN.NN.NN.75.60319 -> NN.NN.NN.162.micro* 95298 1408335847 CON
>
> (The other log entries are included for context; they are usually all of the same order of magnitude in packets and bytes.) Those numbers are obviously wrong: this interface has never passed that many bytes or packets, and it works out at an average of less than one byte per packet :)
>
> The strange thing is that this interface is on an optical fibre tap, and identical copies of the stream are fed to two copies of Argus 3.0.8.2 running on two physically separate machines (running different OSes), and both machines logged exactly the same numbers. So it's not something that happened on one machine (hardware failure etc.); rather, some aspect of the traffic appears to have tickled a bug?
>
> Argus is run with "-i bond1 -d -P 561 -U15", with rasplit then writing to files split on five minute boundaries. The only uncommented options in /etc/argus.conf are:
> ARGUS_FLOW_TYPE="Bidirectional"
> ARGUS_FLOW_KEY="CLASSIC_5_TUPLE"
> ARGUS_FLOW_STATUS_INTERVAL=5
> ARGUS_MAR_STATUS_INTERVAL=60
>
> Has anybody seen similar before? I'm assuming there isn't enough data in the saved ra files to reconstruct how this could have happened, but can provide cut down copies of them if useful. FWIW I'm not aware of it ever happening to us before today.
>
> Thanks,
>
> Gavin