argus crash problem

Lei Wei lwei at cs.unc.edu
Fri Feb 1 12:01:27 EST 2008


Hi Carter,

Here's the result:

(gdb) where
#0  0x0804e0a1 in ArgusProcessGreHdr (model=0x816f000, ip=0x2ccef828,   
  length=41) at ArgusModeler.c:694
#1  0x0804dee4 in ArgusProcessPacketHdrs (model=0x816f000,    
p=0x2ccef828 <Address 0x2ccef828 out of bounds>, length=61, type=2048)
    at ArgusModeler.c:606
#2  0x0804f041 in ArgusProcessPacket (src=0x81b2000,    p=0x2ccef81a 
<Address 0x2ccef81a out of bounds>, length=75, tvp=0x82073d0,    
type=2048) at ArgusModeler.c:1220
#3  0x08056622 in ArgusEtherPacket (user=0x81b2000 "", h=0x82073d0,    
p=0x2ccef81a <Address 0x2ccef81a out of bounds>) at ArgusSource.c:700
#4  0x08058be5 in ArgusGetPackets (src=0x81b2000) at ArgusSource.c:1834
#5  0x0804b7b9 in main (argc=5, argv=0xbfbfec68) at argus.c:567

(gdb) print bp
$1 = 0x2f1f5012 <Address 0x2f1f5012 out of bounds>
(gdb) print grelen
$2 = 35290742
(gdb) print flags
$3 = 25872
(gdb) print model->ArgusSnapLength
$4 = 36
(gdb) print model->ArgusThisLength
$5 = 41

Please let me know anything else I could help with.

Thanks.
Lei




Quoting Carter Bullard <carter at qosient.com>:

> Hey Lei,
> So, type this in gdb()>
>
>    (gdb) where
>    (gdb) print bp
>    (gdb) print grelen
>    (gdb) print flags
>    (gdb) print model->ArgusSnapLength
>    (gdb) print model->ArgusThisLength
>
> That will help a lot.
> Carter
>
>
> On Jan 31, 2008, at 11:48 PM, Lei Wei wrote:
>
>> Hi Carter,
>>
>> Here's after I run gdb on the core dump file:
>>
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and 
>>  you are
>> welcome to change it and/or distribute copies of it under certain  
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB.  Type "show warranty" for  details.
>> This GDB was configured as "i386-marcel-freebsd"...
>> Core was generated by `argus'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /usr/lib/libwrap.so.3...done.
>> Loaded symbols for /usr/lib/libwrap.so.3
>> Reading symbols from /lib/libm.so.3...done.
>> Loaded symbols for /lib/libm.so.3
>> Reading symbols from /lib/libc.so.5...done.
>> Loaded symbols for /lib/libc.so.5
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0  0x0804e0a1 in ArgusProcessGreHdr (model=0x816f000,  
>> ip=0x2ccef828,    length=41) at ArgusModeler.c:694
>> 694                    af = EXTRACT_16BITS(bp);
>> (gdb)
>>
>>
>> Thanks.
>> Lei
>>
>>
>>
>>
>>
>> Quoting Carter Bullard <carter at qosient.com>:
>>
>>> If you compiled argus with the correct options, you can
>>> run gdb() against the binary and the core file and it will
>>> tell you what the problem was.
>>>
>>> What happens when you run:
>>>   gdb argus argus.core
>>>
>>> ?
>>>
>>> Carter
>>>
>>>
>>> On Jan 31, 2008, at 7:47 PM, Lei Wei wrote:
>>>
>>>> Hey Carter,
>>>>
>>>> I ran argus again without the -D option and it finally core  
>>>> dumped.  The argus.core is about 350mb. Any comments on what we  
>>>> could do  about this?
>>>>
>>>> Thanks.
>>>> Lei
>>>>
>>>>
>>>> Quoting Carter Bullard <carter at qosient.com>:
>>>>
>>>>> Hey Lei,
>>>>> You should print both spkts and dpkts, so you can see if you're   
>>>>>  getting full-duplex
>>>>> flow monitoring.  Running with debug information is probably not  
>>>>>  going  to
>>>>> shed any light on the problem you ran into.   Run without the   
>>>>> option  to see
>>>>> how argus performs on your system.  Depending on the type of dag  
>>>>>  card,  you
>>>>> could saturate the BUS around 2Gbps.  If you have the the PCI   
>>>>> express  cards,
>>>>> you still may not get  10Gbps, depending on the motherboard you   
>>>>> are  using.
>>>>>
>>>>> Just run without the "-D" option for a while to see if you get  
>>>>> the  bug  again.
>>>>> Also try modifying the flow model using the /etc/argus.conf file  
>>>>>  to  see if
>>>>> you can get the pcu utilization down.
>>>>>
>>>>> Carter
>>>>>
>>>>>
>>>>> On Jan 30, 2008, at 5:25 PM, Lei Wei wrote:
>>>>>
>>>>>> You are right. I took a look at the flow. I could see some full  
>>>>>>  tcp  connections,i.e.from REQ,CON to FIN. But it seems that  
>>>>>> many   connections have unknown directions like below. Is that  
>>>>>> normal?
>>>>>>
>>>>>> 15:58:19.299491  e            tcp     208.107.16.125.2740      
>>>>>> <?  >     152.23.105.115.6881          2        121   CON
>>>>>> 15:58:19.299491  e         udp     128.109.230.21.10002     -   
>>>>>> >       233.4.200.18.10002         2        136   INT
>>>>>> 15:58:19.299511  e         udp     155.101.127.30.10002     -   
>>>>>> >       233.4.200.18.10002         3        204   INT
>>>>>> 15:58:19.299539  e s       tcp     207.251.97.202.54174    <?   
>>>>>> >        152.2.2.149.smtp         26      22585   CON
>>>>>> 15:58:19.299551  e         udp     90.196.179.159.5331     <-  > 
>>>>>>      152.23.124.201.9965          7       4216   CON
>>>>>> 15:58:19.299616  e         udp        152.2.122.3.1179      -  > 
>>>>>>       118.170.8.216.64371         9        563   INT
>>>>>> 15:58:19.299618  e         tcp      81.52.130.186.http     <?  > 
>>>>>>      152.23.110.254.2068         10       9275   CON
>>>>>> 15:58:19.299634  e         tcp     152.23.211.119.52335    <?  > 
>>>>>>       59.134.50.151.3830          3       1412   CON
>>>>>> 15:58:19.299663  e s       tcp     76.182.114.156.2093     <?   
>>>>>> >       152.2.210.22.rdp           6        558   CON
>>>>>> 15:58:19.299676  e         tcp      66.249.93.109.imaps    <?   
>>>>>> >       152.23.88.57.1055          3        263   CON
>>>>>> 15:58:19.299677  e s       tcp        74.125.1.81.http     <?   
>>>>>> >       152.2.176.13.3282         63      64848   CON
>>>>>> 15:58:19.299690  e         tcp       152.2.62.113.here-l   <?   
>>>>>> >       204.10.66.76.http         12       7402   CON
>>>>>> 15:58:19.299690  e         tcp      90.198.223.64.2684      ?  > 
>>>>>>       152.23.79.176.16328         2       3028   CON
>>>>>> 15:58:19.299739  e         tcp     63.146.183.217.http     <?  > 
>>>>>>      204.85.193.203.16244        43      37784   CON
>>>>>> 15:58:19.299778  e         tcp      64.233.179.91.http      ?  > 
>>>>>>       152.23.93.163.4708          2       2968   CON
>>>>>> 15:58:19.299804  e         tcp         152.2.1.95.smtp     <?   
>>>>>> >        62.43.82.68.50100         6       2096   CON
>>>>>> 15:58:19.299823  e         tcp      81.52.130.152.http      ?  > 
>>>>>>       152.23.69.105.1232          1         60   CON
>>>>>> 15:58:19.299862  e         tcp     195.241.129.86.14816    <?   
>>>>>> >        152.2.32.11.53759        67      25519   CON
>>>>>> 15:58:19.299891  e         tcp     81.154.227.107.51157    <?  > 
>>>>>>      152.23.126.198.12215         3       1834   CON
>>>>>> 15:58:19.299892  e d       tcp     208.253.91.250.40767    <?   
>>>>>> >        152.2.63.68.rtsp         17       7464   CON
>>>>>> 15:58:19.299893  e         tcp      84.67.187.146.52319    <?  > 
>>>>>>       152.23.63.139.2077          5       3206   CON
>>>>>> 15:58:19.299893  e         tcp    124.121.160.139.16800    <?  > 
>>>>>>      152.23.204.249.4173          2       1560   CON
>>>>>> 15:58:19.299928  e         udp      152.2.255.242.10000    <-   
>>>>>> >       76.182.85.17.10000       114      94428   CON
>>>>>> 15:58:19.299938  e         tcp       152.23.69.88.3038     <?  > 
>>>>>>       206.33.52.123.http         68      68056   CON
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Quoting Carter Bullard <carter at qosient.com>:
>>>>>>
>>>>>>> One important question is whether the flows look reasonable.
>>>>>>> Full tcp connections, bidirectional etc...
>>>>>>> Carter
>>>>>>>
>>>>>>> On Jan 30, 2008, at 3:57 PM, Lei Wei wrote:
>>>>>>>
>>>>>>>> Argus didn't stop since the file is still growing and the   
>>>>>>>> memory   usage showing in top is about 493mb. Since you   
>>>>>>>> mentioned that  argus  slows down when it's in debug mode,  
>>>>>>>> does  it mean that it  might drop  some packets when running?
>>>>>>>>
>>>>>>>> It's a little bit wierd because when I asked it to output  the 
>>>>>>>>   debug  info to the stdout, it outputs many just like you  
>>>>>>>> did.   But  when I  tried to redirect it to write into a file  
>>>>>>>> using  "argus -i  dag0 -w  data.out -D 2 > error 2>&1 ", it  
>>>>>>>> only  outputs several  lines into the  file:
>>>>>>>>
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.680343 ArgusNewModeler()   
>>>>>>>> returning   0x816f000
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.686915 ArgusNewSource()   
>>>>>>>> returning   0x81b2000
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.686963 ArgusNewOutput()   
>>>>>>>> returning   retn 0x81b1100
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.708239 setArgusID(0x816f000,    
>>>>>>>>  0x9802899f) done
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.708311 setArgusPortNum(561)   returning
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.708512 setArgusInterfaceStatus(1)
>>>>>>>>
>>>>>>>> I guess maybe I should run it again and just let it output  
>>>>>>>> the   debug  info on the screen and see what it'll happen.
>>>>>>>>
>>>>>>>> Lei
>>>>>>>>
>>>>>>>>
>>>>>>>> Quoting Peter Van Epp <vanepp at sfu.ca>:
>>>>>>>>
>>>>>>>>> On Wed, Jan 30, 2008 at 01:45:17PM -0500, Lei Wei wrote:
>>>>>>>>>> Hello Peter and Carter,
>>>>>>>>>>
>>>>>>>>>> I've been trying to repeat the crash problem again recently.  I  ran
>>>>>>>>>> argus with the debug option enabled from Sunday night until  
>>>>>>>>>>  now.  It
>>>>>>>>>> never crashed. The only debug information it outputed was    
>>>>>>>>>> several  lines
>>>>>>>>>> when argus started to run and nothing more. The CPU usage   
>>>>>>>>>> right   now is
>>>>>>>>>> till about 99% but the output data file grows well and now  
>>>>>>>>>> is   about
>>>>>>>>>> 144G already. I'm not sure why I can't repeat the crash  
>>>>>>>>>> again   and  there
>>>>>>>>>> shouldn't be too much difference in the amount of traffic    
>>>>>>>>>> between  this
>>>>>>>>>> and last week. By the way, the link I'm monitoring is the    
>>>>>>>>>> campus  border
>>>>>>>>>> link to the ISP and the traffic load is more than 1G per    
>>>>>>>>>> second.  I
>>>>>>>>>> wonder if there's any difference running in debug mode that   could
>>>>>>>>>> possibly play a role in it...  Anyway, I'll keep monitoring  
>>>>>>>>>>  it  and  see
>>>>>>>>>> what happens.
>>>>>>>>>>
>>>>>>>>>> Lei
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 	Hmmm, thats too bad. Debug mode does slow argus down a bit,  its
>>>>>>>>> possible that its slowing it down enough that the problem   
>>>>>>>>> isn't   happening.
>>>>>>>>> What does memory usage look like? I'd be suspicious that  
>>>>>>>>> what  is   happening
>>>>>>>>> is that argus is running out of memory and stopping  
>>>>>>>>> (although   that  should
>>>>>>>>> show up in syslog). That said I just noticed I'm not getting  
>>>>>>>>>  any   syslog
>>>>>>>>> messages from argus (and I think I should be). I just   
>>>>>>>>> restarted  one  of my
>>>>>>>>> test ones with -D2:
>>>>>>>>>
>>>>>>>>> # argus -J -P 560 -i eth2 -i eth3 -m -U700 -D2  -- host     
>>>>>>>>> 142.58.101.100
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555408 ArgusNewModeler()    
>>>>>>>>> returning  0x10181010
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555871 ArgusNewSource()    
>>>>>>>>> returning  0x10201f70
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555903 ArgusNewOutput()    
>>>>>>>>> returning  retn 0x102016b0
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555985 setArgusPortNum(560)   
>>>>>>>>>  returning
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.556023  setArgusInterfaceStatus(1)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.585055 ArgusInitSource()     
>>>>>>>>> pcap_open_live(eth3) returned 0x10252d60
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.585119     
>>>>>>>>> ArgusOpenInterface(0x10201f70, 'eth3') returning
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615008 ArgusInitSource()     
>>>>>>>>> pcap_open_live(eth2) returned 0x10253370
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615071     
>>>>>>>>> ArgusOpenInterface(0x10201f70, 'eth2') returning
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615119   
>>>>>>>>> ArgusCopyArgv(0xff8811f8)   returning 0x10253980
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615298 ArgusInitSource()   returning
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615357   
>>>>>>>>> ArgusEstablishListen(560,   (null), 0xff880a18)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615526   
>>>>>>>>> ArgusEstablishListen(560,   0xff880a18) binding: any:560
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615607   
>>>>>>>>> ArgusEstablishListen(560,   0xff880a18) returning 5
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615633 ArgusInitOutput() done
>>>>>>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.615670 started
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615711 ArgusInitModeler() done
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615974  setArgusInterfaceStatus(1)
>>>>>>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.616010     
>>>>>>>>> ArgusGetInterfaceStatus: interface eth3 is up
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.616046  setArgusInterfaceStatus(1)
>>>>>>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.616102     
>>>>>>>>> ArgusGetInterfaceStatus: interface eth2 is up
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.616133  setArgusInterfaceStatus(1)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.616159  setArgusInterfaceStatus(1)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824738 ArgusScheduleShutDown(2)
>>>>>>>>>
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824838 ArgusShutDown(Normal   
>>>>>>>>>  Shutdown)
>>>>>>>>>
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824856    
>>>>>>>>> ArgusCloseSource(0x10201f70)  starting
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824874    
>>>>>>>>> ArgusCloseSource(0x10201f70)  deleting source
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825600     
>>>>>>>>> ArgusCloseModeler(0x10181010) pushing close record 0x1028f2b0
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825625    
>>>>>>>>> ArgusCloseModeler(0x10181010)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825642 ArgusCloseOutput()     
>>>>>>>>> scheduling closure after writing records
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825698  ArgusOutputProcess()  
>>>>>>>>>   received stop record 0 records on the  list
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825716    
>>>>>>>>> ArgusCloseOutput(0x102016b0)  done
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825759 ArgusShutDown()
>>>>>>>>>
>>>>>>>>> # tail /var/log/messages
>>>>>>>>> ...
>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: succesfully allocated  
>>>>>>>>> 0   KB  [tot_mem=12664896][order=12]
>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: allocated 10851  slots 
>>>>>>>>>    [slot_len=1546][tot_mem=16777216]
>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: succesfully allocated  
>>>>>>>>> 0   KB  [tot_mem=12664896][order=12]
>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: allocated 10851  slots 
>>>>>>>>>    [slot_len=1546][tot_mem=16777216]
>>>>>>>>>
>>>>>>>>> 	and all that shows in syslog is the ring buffer starting as  the
>>>>>>>>> interfaces come up. I think argus used to syslog the   
>>>>>>>>> interfaces   opening to
>>>>>>>>> syslog which may mean there is a bug in syslogging.
>>>>>>>>>
>>>>>>>>> Peter Van Epp / Operations and Technical Support
>>>>>>>>> Simon Fraser University, Burnaby, B.C. Canada
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>





More information about the argus mailing list