Problem with argus under load not reopening output file.

Carter Bullard carter at qosient.com
Sat Jan 10 11:01:13 EST 2009


Hey Martijn,
Sorry for the delayed response.  I've gone over the code a bit, and I don't
see how this jump can occur.  But that is the nature of bugs, sometimes.

There are a few things that I need in order to proceed.  What platform and
OS, and is it 64-bit?  What are we connecting to: 1Gbps?  10Gbps?  And what
is the processor load for argus?  And the interrupt rate; any sense as to
how many packets per second?

When this occurs, are we getting any packets at all?  (Your strace should
show packet reading, since we do a select().)

My suspicion is that if all is as it should be, but we all of a sudden get a
leap in our global time, argus may be so loaded that it is getting behind.
But that is a very preliminary guess.
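
For reference, here is a minimal sketch of the arithmetic on the gdb snapshot
quoted below.  It assumes the reopen check is exactly the tv_sec comparison
quoted in Martijn's message; the helper names are hypothetical and are not
taken from the argus source.

```c
#include <sys/time.h>

/* Hypothetical helper (not from the argus source): the quoted test,
 * which compares whole seconds only and ignores tv_usec. */
int reopen_test_passes(const struct timeval *lastwrite,
                       const struct timeval *global_time)
{
    return lastwrite->tv_sec < global_time->tv_sec;
}

/* Whole seconds by which lastwrite leads the global time.
 * A positive result means lastwrite lies in the future. */
long lastwrite_lead_seconds(const struct timeval *lastwrite,
                            const struct timeval *global_time)
{
    return (long)(lastwrite->tv_sec - global_time->tv_sec);
}

/* Same calculation using the values copied from the gdb snapshot:
 * sock->lastwrite and ArgusModel->ArgusGlobalTime. */
long snapshot_lead_seconds(void)
{
    struct timeval lastwrite   = { .tv_sec = 1231412781, .tv_usec = 651709 };
    struct timeval global_time = { .tv_sec = 1231410233, .tv_usec = 328823 };

    return lastwrite_lead_seconds(&lastwrite, &global_time);
}
```

With the snapshot values the lead comes to 2548 seconds, which is the roughly
42 minutes reported, so the test cannot become true until ArgusGlobalTime
passes that lastwrite value.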


Carter

On Jan 8, 2009, at 5:41 AM, Martijn van Oosterhout wrote:

> I have some more info after managing to attach gdb to argus while it's
> in this state (see below). And have at least an explanation for why
> it's not reopening the file: the test
>
> asock->lastwrite.tv_sec < output->ArgusModel->ArgusGlobalTime.tv_sec
>
> Never becomes true because ArgusClientData->sock->lastwrite is in the
> future! About 42 minutes in the snapshot below.
>
> Which as far as I can tell is actually impossible. So it may not be a
> problem with argus directly, but something to do with timekeeping on
> the machine?
>
> Have a nice day,
>
> (gdb) p *ArgusOutputTask
> $16 = {status = 0, ArgusSrc = 0x4e6ef008, ArgusModel = 0x15bdf1b0,
> ArgusWfileList = 0x0, ArgusOutputList = 0x15bdfc80,
>  ArgusClients = 0x15bdf6c0, ArgusInitMar = 0x0, ArgusTotalRecords =
> 36439879, ArgusLastRecords = 36410503, ArgusWriteStdOut = 0,
>  ArgusOutputSequence = 36439879, ArgusPortNum = 0, ArgusLfd = {-1,
> -1, -1, -1, -1}, ArgusListens = 0, nflag = 0,
>  ArgusBindAddr = 0x0, ArgusGlobalTime = {tv_sec = 1231410233, tv_usec
> = 322693}, ArgusStartTime = {tv_sec = 1231330382,
>    tv_usec = 632431}, ArgusReportTime = {tv_sec = 1231410242, tv_usec
> = 0}, ArgusLastMarUpdateTime = {tv_sec = 1231410182,
>    tv_usec = 122701}, ArgusMarReportInterval = {tv_sec = 60, tv_usec  
> = 0}}
>
> (gdb) p *((struct ArgusClientData
> *)(ArgusOutputTask->ArgusClients->start))->sock
> $17 = {ArgusSocketList = 0x15bdfc28, fd = 5, status = 16, cnt = 0,
> expectedSize = 0, errornum = 0, ArgusLastRecord = 0,
>  ArgusReadState = 0, lastwrite = {tv_sec = 1231412781, tv_usec =
> 651709}, rec = 0x0, length = 0, writen = 0, sock = {
>    sa_family = 0, sa_data = '\0' <repeats 13 times>}, filename =
> 0x15bdfd80 "/var/log/argus/bridge0/argus.out", obj = 0x0,
>  ptr = 0x0,
>  buf = "\023 \000"...}
>
> (gdb) p *ArgusOutputTask->ArgusModel
> $18 = {state = 0, ArgusSrc = 0x4e6ef008, ArgusStatusQueue =
> 0x15bdf530, ArgusTimeOutQueues = 0x15bdf578, ArgusTimeOutQueue = {0x0,
>    0x0, 0x0, 0x0, 0x0, 0x15be03a8, 0x0, 0x0, 0x0, 0x0, 0x15be03f0,
> 0x0 <repeats 19 times>, 0x15be0360, 0x0 <repeats 29 times>},
>  ArgusOutputList = 0x15bdf478, ArgusHashTable = 0x15bdf4a8,
> ArgusThisFlow = 0x15bdf5c0, hstruct = 0x15bdf4c0,
>  ArgusTransactionNum = 27638973, ArgusThisInterface = 0,
> ArgusThisEncaps = 2, ArgusThisNetworkFlowType = 2048,
>  ArgusThisLLC = 0x15b3fbcc, ArgusThisAppFlowType = 0,
> ArgusThisMplsLabelIndex = 0, ArgusThisMplsLabel = 0,
>  ArgusThisPacket8021QEncaps = 0, ArgusFlowType = 32 ' ', ArgusFlowKey
> = 1 '\001', ArgusOptionIndicator = 0, ArgusInProtocol = 1,
>  ArgusThisDir = 0, ArgusThisStats = 0x370c742c, ArgusThisEpHdr =
> 0x4e6927c2, ArgusThisIpHdr = 0x4e6927d0, ArgusThisIpv6Frag = 0x0,
>  ArgusThisNetworkHdr = 0x0,
>  ArgusThisUpHdr = 0x4e6927f8,
>  ArgusThisSnapEnd = 0x4e692820 "\001", ArgusControlMonitor = 0,
> ArgusSnapLength = 40, ArgusGenerateTime = 0,
>  ArgusGeneratePacketSize = 0, ArgusThisLength = 1380, ArgusThisBytes
> = 1434, ArgusTotalPacket = 1139871256, ArgusTotalFrags = 0,
>  ArgusTotalIPPkts = 0, ArgusLastIPPkts = 0, ArgusTotalNonIPPkts = 0,
> ArgusLastNonIPPkts = 0, ArgusTotalNewFlows = 32432394,
>  ArgusLastNewFlows = 32416044, ArgusTotalClosedFlows = 26850027,
> ArgusLastClosedFlows = 0, ArgusTotalIPFlows = 32390005,
>  ArgusLastIPFlows = 0, ArgusTotalNonIPFlows = 42389,
> ArgusLastNonIPFlows = 0, ArgusTotalCacheHits = 1107438862,
>  ArgusTotalRecords = 0, ArgusTotalSends = 36439881,
> ArgusTotalBadSends = 2686099, ArgusLastRecords = 0,
>  ArgusTotalUpdates = 1137764966, ArgusLastUpdates = 0,
> ArgusGlobalTime = {tv_sec = 1231410233, tv_usec = 328823},
>  ArgusStartTime = {tv_sec = 0, tv_usec = 0}, ArgusNowTime = {tv_sec =
> 0, tv_usec = 0}, ArgusUpdateInterval = {tv_sec = 0,
>    tv_usec = 200000}, ArgusUpdateTimer = {tv_sec = 1231410233,
> tv_usec = 522602}, ArgusLastPacketTimer = {tv_sec = 0,
>    tv_usec = 0}, ArgusAdjustedTimer = {tv_sec = 0, tv_usec = 0},
> ArgusMajorVersion = 3, ArgusMinorVersion = 0, ArgusSnapLen = 96,
>  ArgusUserDataLen = 0, ArgusAflag = 0, ArgusTCPflag = 1, Argusmflag =
> 1, ArgusIPTimeout = 30, ArgusTCPTimeout = 30,
>  ArgusICMPTimeout = 5, ArgusIGMPTimeout = 30, ArgusFRAGTimeout = 5,
> ArgusIBTimeout = 0, ArgusReportAllTime = 1,
>  ArgusResponseStatus = 0, ArgusFarReportInterval = {tv_sec = 5,
> tv_usec = 0}, ArgusQueueInterval = {tv_sec = 0, tv_usec = 50000},
>  ArgusListenInterval = {tv_sec = 0, tv_usec = 250000}, ArgusID = 18,
> ArgusIDType = 32, ArgusSeqNum = 36439882, ArgusLocalNet = 0,
>  ArgusNetMask = 0, ArgusLink = 0}
> (gdb) cont
>
> -- 
> Martijn van Oosterhout <kleptog at gmail.com> http://svana.org/kleptog/
>

Carter Bullard
CEO/President
QoSient, LLC
150 E 57th Street Suite 12D
New York, New York  10022

+1 212 588-9133 Phone
+1 212 588-9134 Fax





