argus crash problem

Carter Bullard carter at qosient.com
Wed Jan 30 17:53:14 EST 2008


Hey Lei,
You should print both spkts and dpkts, so you can see if you're  
getting full-duplex
flow monitoring.  Running with debug information is probably not going  
to
shed any light on the problem you ran into.   Run without the option  
to see
how argus performs on your system.  Depending on the type of dag card,  
you
could saturate the BUS around 2Gbps.  If you have the the PCI express  
cards,
you still may not get  10Gbps, depending on the motherboard you are  
using.

Just run without the "-D" option for a while to see if you get the bug  
again.
Also try modifying the flow model using the /etc/argus.conf file to  
see if
you can get the pcu utilization down.

Carter


On Jan 30, 2008, at 5:25 PM, Lei Wei wrote:

> You are right. I took a look at the flow. I could see some full tcp  
> connections,i.e.from REQ,CON to FIN. But it seems that many  
> connections have unknown directions like below. Is that normal?
>
> 15:58:19.299491  e            tcp     208.107.16.125.2740     <? 
> >     152.23.105.115.6881          2        121   CON
>  15:58:19.299491  e         udp     128.109.230.21.10002     - 
> >       233.4.200.18.10002         2        136   INT
>  15:58:19.299511  e         udp     155.101.127.30.10002     - 
> >       233.4.200.18.10002         3        204   INT
>  15:58:19.299539  e s       tcp     207.251.97.202.54174    <? 
> >        152.2.2.149.smtp         26      22585   CON
>  15:58:19.299551  e         udp     90.196.179.159.5331     <->      
> 152.23.124.201.9965          7       4216   CON
>  15:58:19.299616  e         udp        152.2.122.3.1179      ->       
> 118.170.8.216.64371         9        563   INT
>  15:58:19.299618  e         tcp      81.52.130.186.http     <?>      
> 152.23.110.254.2068         10       9275   CON
>  15:58:19.299634  e         tcp     152.23.211.119.52335    <?>       
> 59.134.50.151.3830          3       1412   CON
>  15:58:19.299663  e s       tcp     76.182.114.156.2093     <? 
> >       152.2.210.22.rdp           6        558   CON
>  15:58:19.299676  e         tcp      66.249.93.109.imaps    <? 
> >       152.23.88.57.1055          3        263   CON
>  15:58:19.299677  e s       tcp        74.125.1.81.http     <? 
> >       152.2.176.13.3282         63      64848   CON
>  15:58:19.299690  e         tcp       152.2.62.113.here-l   <? 
> >       204.10.66.76.http         12       7402   CON
>  15:58:19.299690  e         tcp      90.198.223.64.2684      ?>       
> 152.23.79.176.16328         2       3028   CON
>  15:58:19.299739  e         tcp     63.146.183.217.http     <?>      
> 204.85.193.203.16244        43      37784   CON
>  15:58:19.299778  e         tcp      64.233.179.91.http      ?>       
> 152.23.93.163.4708          2       2968   CON
>  15:58:19.299804  e         tcp         152.2.1.95.smtp     <? 
> >        62.43.82.68.50100         6       2096   CON
>  15:58:19.299823  e         tcp      81.52.130.152.http      ?>       
> 152.23.69.105.1232          1         60   CON
>  15:58:19.299862  e         tcp     195.241.129.86.14816    <? 
> >        152.2.32.11.53759        67      25519   CON
>  15:58:19.299891  e         tcp     81.154.227.107.51157    <?>      
> 152.23.126.198.12215         3       1834   CON
>  15:58:19.299892  e d       tcp     208.253.91.250.40767    <? 
> >        152.2.63.68.rtsp         17       7464   CON
>  15:58:19.299893  e         tcp      84.67.187.146.52319    <?>       
> 152.23.63.139.2077          5       3206   CON
>  15:58:19.299893  e         tcp    124.121.160.139.16800    <?>      
> 152.23.204.249.4173          2       1560   CON
>  15:58:19.299928  e         udp      152.2.255.242.10000    <- 
> >       76.182.85.17.10000       114      94428   CON
>  15:58:19.299938  e         tcp       152.23.69.88.3038     <?>       
> 206.33.52.123.http         68      68056   CON
>
>
>
>
> Quoting Carter Bullard <carter at qosient.com>:
>
>> One important question is whether the flows look reasonable.
>> Full tcp connections, bidirectional etc...
>> Carter
>>
>> On Jan 30, 2008, at 3:57 PM, Lei Wei wrote:
>>
>>> Argus didn't stop since the file is still growing and the memory   
>>> usage showing in top is about 493mb. Since you mentioned that  
>>> argus  slows down when it's in debug mode, does it mean that it  
>>> might drop  some packets when running?
>>>
>>> It's a little bit wierd because when I asked it to output the  
>>> debug  info to the stdout, it outputs many just like you did.  But  
>>> when I  tried to redirect it to write into a file using "argus -i  
>>> dag0 -w  data.out -D 2 > error 2>&1 ", it only outputs several  
>>> lines into the  file:
>>>
>>> argus[574]: 28 Jan 08 01:14:13.680343 ArgusNewModeler() returning   
>>> 0x816f000
>>> argus[574]: 28 Jan 08 01:14:13.686915 ArgusNewSource() returning   
>>> 0x81b2000
>>> argus[574]: 28 Jan 08 01:14:13.686963 ArgusNewOutput() returning   
>>> retn 0x81b1100
>>> argus[574]: 28 Jan 08 01:14:13.708239 setArgusID(0x816f000,   
>>> 0x9802899f) done
>>> argus[574]: 28 Jan 08 01:14:13.708311 setArgusPortNum(561) returning
>>> argus[574]: 28 Jan 08 01:14:13.708512 setArgusInterfaceStatus(1)
>>>
>>> I guess maybe I should run it again and just let it output the  
>>> debug  info on the screen and see what it'll happen.
>>>
>>> Lei
>>>
>>>
>>> Quoting Peter Van Epp <vanepp at sfu.ca>:
>>>
>>>> On Wed, Jan 30, 2008 at 01:45:17PM -0500, Lei Wei wrote:
>>>>> Hello Peter and Carter,
>>>>>
>>>>> I've been trying to repeat the crash problem again recently. I ran
>>>>> argus with the debug option enabled from Sunday night until now.  
>>>>> It
>>>>> never crashed. The only debug information it outputed was  
>>>>> several  lines
>>>>> when argus started to run and nothing more. The CPU usage right   
>>>>> now is
>>>>> till about 99% but the output data file grows well and now is  
>>>>> about
>>>>> 144G already. I'm not sure why I can't repeat the crash again  
>>>>> and  there
>>>>> shouldn't be too much difference in the amount of traffic  
>>>>> between  this
>>>>> and last week. By the way, the link I'm monitoring is the  
>>>>> campus  border
>>>>> link to the ISP and the traffic load is more than 1G per  
>>>>> second.  I
>>>>> wonder if there's any difference running in debug mode that could
>>>>> possibly play a role in it...  Anyway, I'll keep monitoring it  
>>>>> and  see
>>>>> what happens.
>>>>>
>>>>> Lei
>>>>>
>>>>
>>>> 	Hmmm, thats too bad. Debug mode does slow argus down a bit, its
>>>> possible that its slowing it down enough that the problem isn't   
>>>> happening.
>>>> What does memory usage look like? I'd be suspicious that what is   
>>>> happening
>>>> is that argus is running out of memory and stopping (although  
>>>> that  should
>>>> show up in syslog). That said I just noticed I'm not getting any   
>>>> syslog
>>>> messages from argus (and I think I should be). I just restarted  
>>>> one  of my
>>>> test ones with -D2:
>>>>
>>>> # argus -J -P 560 -i eth2 -i eth3 -m -U700 -D2  -- host   
>>>> 142.58.101.100
>>>> argus[4637]: 30 Jan 08 12:32:05.555408 ArgusNewModeler()  
>>>> returning  0x10181010
>>>> argus[4637]: 30 Jan 08 12:32:05.555871 ArgusNewSource()  
>>>> returning  0x10201f70
>>>> argus[4637]: 30 Jan 08 12:32:05.555903 ArgusNewOutput()  
>>>> returning  retn 0x102016b0
>>>> argus[4637]: 30 Jan 08 12:32:05.555985 setArgusPortNum(560)  
>>>> returning
>>>> argus[4637]: 30 Jan 08 12:32:05.556023 setArgusInterfaceStatus(1)
>>>> argus[4637]: 30 Jan 08 12:32:05.585055 ArgusInitSource()   
>>>> pcap_open_live(eth3) returned 0x10252d60
>>>> argus[4637]: 30 Jan 08 12:32:05.585119   
>>>> ArgusOpenInterface(0x10201f70, 'eth3') returning
>>>> argus[4637]: 30 Jan 08 12:32:05.615008 ArgusInitSource()   
>>>> pcap_open_live(eth2) returned 0x10253370
>>>> argus[4637]: 30 Jan 08 12:32:05.615071   
>>>> ArgusOpenInterface(0x10201f70, 'eth2') returning
>>>> argus[4637]: 30 Jan 08 12:32:05.615119 ArgusCopyArgv(0xff8811f8)   
>>>> returning 0x10253980
>>>> argus[4637]: 30 Jan 08 12:32:05.615298 ArgusInitSource() returning
>>>> argus[4637]: 30 Jan 08 12:32:05.615357 ArgusEstablishListen(560,   
>>>> (null), 0xff880a18)
>>>> argus[4637]: 30 Jan 08 12:32:05.615526 ArgusEstablishListen(560,   
>>>> 0xff880a18) binding: any:560
>>>> argus[4637]: 30 Jan 08 12:32:05.615607 ArgusEstablishListen(560,   
>>>> 0xff880a18) returning 5
>>>> argus[4637]: 30 Jan 08 12:32:05.615633 ArgusInitOutput() done
>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.615670 started
>>>> argus[4637]: 30 Jan 08 12:32:05.615711 ArgusInitModeler() done
>>>> argus[4637]: 30 Jan 08 12:32:05.615974 setArgusInterfaceStatus(1)
>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.616010   
>>>> ArgusGetInterfaceStatus: interface eth3 is up
>>>> argus[4637]: 30 Jan 08 12:32:05.616046 setArgusInterfaceStatus(1)
>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.616102   
>>>> ArgusGetInterfaceStatus: interface eth2 is up
>>>> argus[4637]: 30 Jan 08 12:32:05.616133 setArgusInterfaceStatus(1)
>>>> argus[4637]: 30 Jan 08 12:32:05.616159 setArgusInterfaceStatus(1)
>>>> argus[4637]: 30 Jan 08 12:32:59.824738 ArgusScheduleShutDown(2)
>>>>
>>>> argus[4637]: 30 Jan 08 12:32:59.824838 ArgusShutDown(Normal  
>>>> Shutdown)
>>>>
>>>> argus[4637]: 30 Jan 08 12:32:59.824856  
>>>> ArgusCloseSource(0x10201f70)  starting
>>>> argus[4637]: 30 Jan 08 12:32:59.824874  
>>>> ArgusCloseSource(0x10201f70)  deleting source
>>>> argus[4637]: 30 Jan 08 12:32:59.825600   
>>>> ArgusCloseModeler(0x10181010) pushing close record 0x1028f2b0
>>>> argus[4637]: 30 Jan 08 12:32:59.825625  
>>>> ArgusCloseModeler(0x10181010)
>>>> argus[4637]: 30 Jan 08 12:32:59.825642 ArgusCloseOutput()   
>>>> scheduling closure after writing records
>>>> argus[4637]: 30 Jan 08 12:32:59.825698 ArgusOutputProcess()   
>>>> received stop record 0 records on the list
>>>> argus[4637]: 30 Jan 08 12:32:59.825716  
>>>> ArgusCloseOutput(0x102016b0)  done
>>>> argus[4637]: 30 Jan 08 12:32:59.825759 ArgusShutDown()
>>>>
>>>> # tail /var/log/messages
>>>> ...
>>>> Jan 30 12:32:05 sniffer1 kernel: RING: succesfully allocated 0  
>>>> KB  [tot_mem=12664896][order=12]
>>>> Jan 30 12:32:05 sniffer1 kernel: RING: allocated 10851 slots   
>>>> [slot_len=1546][tot_mem=16777216]
>>>> Jan 30 12:32:05 sniffer1 kernel: RING: succesfully allocated 0  
>>>> KB  [tot_mem=12664896][order=12]
>>>> Jan 30 12:32:05 sniffer1 kernel: RING: allocated 10851 slots   
>>>> [slot_len=1546][tot_mem=16777216]
>>>>
>>>> 	and all that shows in syslog is the ring buffer starting as the
>>>> interfaces come up. I think argus used to syslog the interfaces   
>>>> opening to
>>>> syslog which may mean there is a bug in syslogging.
>>>>
>>>> Peter Van Epp / Operations and Technical Support
>>>> Simon Fraser University, Burnaby, B.C. Canada
>>>>
>>>
>>>
>>>
>>
>
>
>



More information about the argus mailing list