[Argus] Re: Packet Loss with racluster

Carter Bullard carter at qosient.com
Mon Mar 17 23:02:52 EDT 2008


Hey Guys,
A lot of things can affect the "distribution" of numbers on a time series
graph when you are using flow data.  Flows are not fixed-length samples of
network activity, so you have to do some statistical modifications to make
the data generally useful.  Programs like rasplit() and rabins() are
critical for distributing load, rate, packet counts, loss counts, jitter,
interpacket arrival times, etc. correctly into timed bins.  Without either
rasplit() or rabins(), which are split/aggregate tools, you can end up with
flows that are longer than the time interval they are supposed to represent,
which skews the data in weird ways and can generate bins with no data in
them.

Loss doesn't have to be constant, so the dropouts may actually be real.
There are also no guarantees that there are any TCP connections during
those intervals (no TCP, no loss), so we have to look at the data to see
if anything is wrong.

Remember, flows from argus() can be as long as the
ARGUS_FAR_STATUS_INTERVAL.  A flow that starts at 1:59:59.999999 will be
tallied in the 1:58:00 - 2:00:00 bin, even though its duration could extend
well into the 2:00:00 - 2:02:00 interval.
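
To make that concrete, here is a small Python sketch of start-time tallying.
It is only an illustration, not Argus code: the start_bin() helper, the
two-minute bin width, and the 60-second flow duration are assumptions chosen
to match the example above.

```python
BIN_SECONDS = 120  # assumed two-minute bins, as in the example above


def start_bin(start, bin_seconds=BIN_SECONDS):
    """Return (begin, end) of the fixed bin that contains a flow's start time.

    Times are seconds since midnight. This is a hypothetical helper that
    mimics tallying a whole flow record by its start time only.
    """
    begin = int(start // bin_seconds) * bin_seconds
    return begin, begin + bin_seconds


# Hypothetical flow: starts at 1:59:59.999999 and lasts 60 seconds.
start = 1 * 3600 + 59 * 60 + 59.999999
duration = 60.0

begin, end = start_bin(start)

# Time the flow actually spends in the bin it is tallied in, and in the next.
seconds_in_own_bin = end - start
seconds_in_next_bin = duration - seconds_in_own_bin

print("tallied in bin:", begin, "-", end)
print("seconds in that bin:", seconds_in_own_bin)
print("seconds in the next bin:", seconds_in_next_bin)
```

Nearly all of this flow's activity falls in the bin after the one it is
counted in, which is the skew described above.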

The trick is to split the data into strict time slots and then aggregate
those slots.  rabins() is very good at this, which is why it's at the heart
of ragraph().
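
As a rough illustration of split-then-aggregate, the Python sketch below
spreads one flow's loss count across fixed bins in proportion to the time the
flow spends in each.  This is a sketch under stated assumptions, not how
rabins() is implemented: split_into_bins() is a hypothetical helper, the flow
values are made up, and it assumes the metric is spread uniformly over the
flow's duration.

```python
def split_into_bins(start, end, value, bin_seconds=120):
    """Spread `value` over fixed-width bins in proportion to time overlap.

    start/end are seconds since midnight. Assumes (for illustration only)
    that the metric is uniformly distributed over the flow's duration.
    Returns a dict mapping bin start time -> share of `value`.
    """
    first_bin = int(start // bin_seconds) * bin_seconds
    if end <= start:
        # Zero-duration record: everything belongs to the start bin.
        return {first_bin: float(value)}

    duration = end - start
    shares = {}
    bin_start = first_bin
    while bin_start < end:
        bin_end = bin_start + bin_seconds
        overlap = min(end, bin_end) - max(start, bin_start)
        shares[bin_start] = value * overlap / duration
        bin_start = bin_end
    return shares


# Hypothetical flow: 1:59:00 to 2:01:00 with 100 retransmitted packets.
shares = split_into_bins(7140, 7260, 100)
for bin_start, share in sorted(shares.items()):
    print(bin_start, share)
```

Summing the per-bin shares from many flows then gives the aggregated value
for each time slot, instead of crediting each flow entirely to the bin where
it started.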

If I can get some of the data used to generate the graph in the email, I
can see whether using rabins() would remove the dropouts.

Carter



On Mar 17, 2008, at 8:40 PM, Stewart Gray wrote:

> I just feed the values into cacti, it's a base metric I can use for  
> spotting anomalies. Even if it's not 100% accurate, the accuracy  
> should be pretty consistent even if argus inflates/deflates the  
> figure slightly on files which have been sliced up.
>
> I'm running this argus instance on a busy section of our network and  
> there is a constant flow of between 80-140mb/s. I ran the  
> rate/load/loss command and got:
>
> 17949.785637 94528448 0
>
> You can see the blips this morning. The file is actually split every  
> 2mins on this particular box.
>
> <Outlook.jpg>
>
> It's a bit unusual, if I run 'ra -m proto -s loss -r argus.arg -  
> tcp' there are quite a number of losses/retransmits. Might be an  
> issue with how racluster is aggregating these?
>
> Stew
>
> From: Nick Diel [mailto:ndiel at engr.colostate.edu]
> Sent: Tuesday, 18 March 2008 12:10 p.m.
> To: Stewart Gray
> Cc: Argus
> Subject: [Argus] Re: Packet Loss with racluster
>
> Stew,
>
> I think the first question is what are you using this number for.   
> If you are just using it as an indicator of congestion or other  
> network problems then the 5 minute boundary will most likely not be  
> a problem.
>
> I believe Argus just counts the number of retransmitted packets to  
> get a loss/drop count, I don't think it is doing any triple  
> duplicate ack or tcp timeout checks (if I am wrong, someone please  
> say so).  Since retransmissions will occur in a time window of a few  
> seconds, you should capture most retransmitted packets in your 5  
> minute boundaries.  So even if a flow crosses that boundary, you still  
> have a good chance of counting retransmitted packets correctly.
>
> For cases where you are receiving a count of 0, I would look at packet  
> rate and bit rate; it is possible the link just doesn't have much  
> traffic on it at that time:
>
> racluster -m proto -s rate load loss -r argus.arg - tcp
>
> Though I did notice something unusual on my end.  The command I gave  
> you, should be a strong estimate, but doesn't account for  
> retransmitted packets over status flow boundaries within the file  
> (though same argument above applies).  So to get an exact count on  
> the file (assuming racluster reanalyzes the status flow records for  
> retransmissions) you would need something like: racluster -r  
> argus.arg -w - - tcp | racluster -m proto -s loss -r - (first merge  
> status flow records, then count retransmitted packets).  Though this  
> is the output I get:
>
> racluster -m proto -s loss -r argus.out - tcp
>      62521
> racluster -r argus.out -w - - tcp | racluster -m proto -s loss -r -
>      60047
>
> At a minimum I would expect the numbers to stay the same (if no  
> retransmitted packets crossed any status flows, or if racluster doesn't  
> try to find any new retransmitted packets).  The number going down  
> doesn't make any sense to me.  Maybe someone can explain to me what is  
> going on.
>
> Nick
>
>
>
>
> Stewart Gray wrote:
>>
>> Hey Guys,
>>
>> How does racluster handle argus files which have been periodically  
>> split, when producing packet loss statistics? My monitoring machine  
>> rotates the argus file every 5 minutes. When using the following  
>> command, how skewed are the figures going to be as a result of  
>> having an incomplete argus file (i.e. connections that were current  
>> when the log file was rotated)?
>>
>> I've also noticed that sometimes the resulting figure is 0. It only  
>> seems to do this in about 1 in 10 of the argus files I run the  
>> command against.
>>
>> racluster -m proto -s loss -r argus.arg - tcp
>> 0
>>
>> racluster -m proto -s loss -r argus.arg - tcp
>> 33036
>> Any ideas?
>>
>> Cheers,
>>
>> Stew
>>
>> From: Nick Diel [mailto:ndiel at engr.colostate.edu]
>> Sent: Wednesday, 12 March 2008 10:24 a.m.
>> To: Stewart Gray
>> Cc: Argus
>> Subject: Re: [ARGUS] Cheat sheet premiere
>>
>> How about:
>> racluster -m proto -s loss -r argus.arg - tcp
>>
>> This should merge all records based on protocol (in this case only  
>> tcp because of the filter) and then print the loss column of all  
>> merged records.
>>
>> Nick
>>
>> Stewart Gray wrote:
>>>
>>> Awesome, that's a really good start. I've already been playing  
>>> with a few of the options I hadn't toyed with before :)
>>>
>>> Is there an easy way to generate a raw count of packets  
>>> lost/retransmitted rather than having it graphed?
>>>
>>> I figure we start with:
>>>
>>> racluster -s loss -r argus.arg -w -
>>>
>>> How are the figures totaled? Do we pipe it to rasort or ra?
>>>
>>> Thanks,
>>>
>>> Stewart
>>>
>>> From: Stéphane Peters [mailto:stephane.peters at forem.be]
>>> Sent: Saturday, 8 March 2008 11:06 a.m.
>>> To: Carter Bullard
>>> Cc: Stewart Gray; Argus
>>> Subject: Re: Re: [ARGUS] Cheat sheet premiere
>>>
>>> Hi Carter,
>>>
>>> I would love to see such a sheet in the distribution,
>>> and I also was hoping that you could check,
>>> if those examples made sense or were appropriate.
>>> So please go on !
>>>
>>>
>>> Some cosmetic work could be done too;
>>> for example to use everywhere some "standard" parameters like this  
>>> one :
>>>     file=argus-eth1.out
>>>     ra -r $file
>>> so it is easy to paste the line "as is".
>>> without forgetting the shell escapes ( \$srcid) like in:
>>>     rasplit -S $argushost  -M 1d -w /path/argus-\$srcid.%Y.%m.%d.log
>>>
>>> By the way, as another example for the list, here are 3  
>>> scripts I use.
>>> Setting PATH on the command line gives a nicer ps(1) output.
>>>
>>> start-argus
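>>>> #!/bin/sh
>>>> interf=eth1
>>>> # Bring the interface up if it is not already up.
>>>> PATH=/sbin ifconfig $interf | grep UP || PATH=/sbin ifconfig $interf up
>>>> # Start argus in daemon mode (-d), writing to argus-eth1.out.
>>>> PATH=/usr/local/sbin argus -d -i $interf -e `hostname` -P 561 -U128 -mRS 30 -w argus-eth1.out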
>>>
>>> rotate:
>>>> #!/bin/sh
>>>>
>>>> # Rotates server log files, without affecting users who may be
>>>> # connected to the server.
>>>>
>>>> # This can be run as a cron script
>>>>
>>>> DATE=`date +%Y-%m%d-%H%M`
>>>> LOGS='argus-eth1.out'
>>>>
>>>>  for i in $LOGS; do
>>>>    if [ -f $i ]; then
>>>>      mv $i $i.$DATE
>>>>      gzip -9 $i.$DATE
>>>>    fi
>>>>  done
>>>
>>> rotate-daily
>>>> #!/bin/sh
>>>> ./rotate
>>>> sleep 60 # sometimes the preceding command finishes too early
>>>> echo ./rotate-daily | at 0000 > /tmp/rotate-daily.log
>>>
>>> I use at(1) instead of cron(8) to cut the files closer to midnight,
>>> but rastream(1)'s extended "-w" option seems promising.
>>> A better solution could be to use argus(8) to preprocess the flows,
>>> and rastream(1) to write, "rotate" and compress the files.
>>> Another thread, perhaps.
>>>
>>>
>>> Carter Bullard wrote :
>>>>
>>>> Hey Stephane,
>>>> This is great!!!!  I'll put this in the distribution, if you  
>>>> don't mind!!!!
>>>> And I'll also go through it to make sure that any changes in the
>>>> code actually don't break this, and I can add some of the ones
>>>> that I do.
>>>>
>>>> So Russell is asking for a wiki, and we already have one at:
>>>>
>>>> http://www.vorant.com/nsmwiki/index.php?title=Argus
>>>>
>>>>
>>>> Carter
>>>>
>>>>
>>>>
>>>>
>>>> On Mar 7, 2008, at 2:24 PM, Stéphane Peters wrote:
>>>>
>>>>> Hi Stewart,
>>>>>
>>>>> I also think that a cheat sheet would be nice !
>>>>> Here is a good occasion to show mine...
>>>>>
>>>>> Please note, most of the stuff has been collected right from  
>>>>> this argus list,
>>>>> so hopefully you won't have to browse all the (numerous) past  
>>>>> messages.
>>>>>
>>>>> Any suggestions ?
>>>>>
>>>>> flow filtering on certain port range:
>>>>>    ra -r file - dst port \( gt 1024 and lt 2048 \)
>>>>> (...)
>>>>>
>>>>>
>>>>> Stewart Gray wrote:
>>>>>>
>>>>>> awesome, that's more like what I was after :) Thanks for your  
>>>>>> help
>>>>>> again.
>>>>>>
>>>>>> As I mentioned earlier, I reckon it'd be neat to have some sort  
>>>>>> of cheat
>>>>>> sheet for doing common tasks. I bet there's lots of stuff you  
>>>>>> know that
>>>>>> others don't, having written the application yourself. I don't  
>>>>>> know what
>>>>>> I don't know!
>>>>>>
>>>>>
>>>>> Regards,
>>>>> -- 
>>>>> Stephane.Peters at forem.be, Postmaster at forem.be
>>>>
>>>
>>> Regards,
>>> -- 
>>> Stephane.Peters at forem.be
>>> #####################################################################################
>>> Important: This electronic message and attachments (if any) are  
>>> confidential and may be legally privileged. If you are not the  
>>> intended recipient do not copy, disclose or use the contents in  
>>> any way. Please let us know by return e-mail immediately and then  
>>> destroy this message.
>>> #####################################################################################
>>
>

Carter Bullard
CEO/President
QoSient, LLC
150 E. 57th Street Suite 12D
New York, New York 10022

+1 212 588-9133 Phone
+1 212 588-9134 Fax


