argus crash problem
Lei Wei
lwei at cs.unc.edu
Fri Feb 1 12:01:27 EST 2008
Hi Carter,
Here's the result:
(gdb) where
#0 0x0804e0a1 in ArgusProcessGreHdr (model=0x816f000, ip=0x2ccef828,
length=41) at ArgusModeler.c:694
#1 0x0804dee4 in ArgusProcessPacketHdrs (model=0x816f000,
p=0x2ccef828 <Address 0x2ccef828 out of bounds>, length=61, type=2048)
at ArgusModeler.c:606
#2 0x0804f041 in ArgusProcessPacket (src=0x81b2000, p=0x2ccef81a
<Address 0x2ccef81a out of bounds>, length=75, tvp=0x82073d0,
type=2048) at ArgusModeler.c:1220
#3 0x08056622 in ArgusEtherPacket (user=0x81b2000 "", h=0x82073d0,
p=0x2ccef81a <Address 0x2ccef81a out of bounds>) at ArgusSource.c:700
#4 0x08058be5 in ArgusGetPackets (src=0x81b2000) at ArgusSource.c:1834
#5 0x0804b7b9 in main (argc=5, argv=0xbfbfec68) at argus.c:567
(gdb) print bp
$1 = 0x2f1f5012 <Address 0x2f1f5012 out of bounds>
(gdb) print grelen
$2 = 35290742
(gdb) print flags
$3 = 25872
(gdb) print model->ArgusSnapLength
$4 = 36
(gdb) print model->ArgusThisLength
$5 = 41
Please let me know anything else I could help with.
Thanks.
Lei
Quoting Carter Bullard <carter at qosient.com>:
> Hey Lei,
> So, type this in gdb()>
>
> (gdb) where
> (gdb) print bp
> (gdb) print grelen
> (gdb) print flags
> (gdb) print model->ArgusSnapLength
> (gdb) print model->ArgusThisLength
>
> That will help a lot.
> Carter
>
>
> On Jan 31, 2008, at 11:48 PM, Lei Wei wrote:
>
>> Hi Carter,
>>
>> Here's after I run gdb on the core dump file:
>>
>> GNU gdb 6.1.1 [FreeBSD]
>> Copyright 2004 Free Software Foundation, Inc.
>> GDB is free software, covered by the GNU General Public License, and
>> you are
>> welcome to change it and/or distribute copies of it under certain
>> conditions.
>> Type "show copying" to see the conditions.
>> There is absolutely no warranty for GDB. Type "show warranty" for details.
>> This GDB was configured as "i386-marcel-freebsd"...
>> Core was generated by `argus'.
>> Program terminated with signal 11, Segmentation fault.
>> Reading symbols from /usr/lib/libwrap.so.3...done.
>> Loaded symbols for /usr/lib/libwrap.so.3
>> Reading symbols from /lib/libm.so.3...done.
>> Loaded symbols for /lib/libm.so.3
>> Reading symbols from /lib/libc.so.5...done.
>> Loaded symbols for /lib/libc.so.5
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0 0x0804e0a1 in ArgusProcessGreHdr (model=0x816f000,
>> ip=0x2ccef828, length=41) at ArgusModeler.c:694
>> 694 af = EXTRACT_16BITS(bp);
>> (gdb)
>>
>>
>> Thanks.
>> Lei
>>
>>
>>
>>
>>
>> Quoting Carter Bullard <carter at qosient.com>:
>>
>>> If you compiled argus with the correct options, you can
>>> run gdb() against the binary and the core file and it will
>>> tell you what the problem was.
>>>
>>> What happens when you run:
>>> gdb argus argus.core
>>>
>>> ?
>>>
>>> Carter
>>>
>>>
>>> On Jan 31, 2008, at 7:47 PM, Lei Wei wrote:
>>>
>>>> Hey Carter,
>>>>
>>>> I ran argus again without the -D option and it finally core
>>>> dumped. The argus.core is about 350mb. Any comments on what we
>>>> could do about this?
>>>>
>>>> Thanks.
>>>> Lei
>>>>
>>>>
>>>> Quoting Carter Bullard <carter at qosient.com>:
>>>>
>>>>> Hey Lei,
>>>>> You should print both spkts and dpkts, so you can see if you're
>>>>> getting full-duplex
>>>>> flow monitoring. Running with debug information is probably not
>>>>> going to
>>>>> shed any light on the problem you ran into. Run without the
>>>>> option to see
>>>>> how argus performs on your system. Depending on the type of dag
>>>>> card, you
>>>>> could saturate the BUS around 2Gbps. If you have the the PCI
>>>>> express cards,
>>>>> you still may not get 10Gbps, depending on the motherboard you
>>>>> are using.
>>>>>
>>>>> Just run without the "-D" option for a while to see if you get
>>>>> the bug again.
>>>>> Also try modifying the flow model using the /etc/argus.conf file
>>>>> to see if
>>>>> you can get the pcu utilization down.
>>>>>
>>>>> Carter
>>>>>
>>>>>
>>>>> On Jan 30, 2008, at 5:25 PM, Lei Wei wrote:
>>>>>
>>>>>> You are right. I took a look at the flow. I could see some full
>>>>>> tcp connections,i.e.from REQ,CON to FIN. But it seems that
>>>>>> many connections have unknown directions like below. Is that
>>>>>> normal?
>>>>>>
>>>>>> 15:58:19.299491 e tcp 208.107.16.125.2740
>>>>>> <? > 152.23.105.115.6881 2 121 CON
>>>>>> 15:58:19.299491 e udp 128.109.230.21.10002 -
>>>>>> > 233.4.200.18.10002 2 136 INT
>>>>>> 15:58:19.299511 e udp 155.101.127.30.10002 -
>>>>>> > 233.4.200.18.10002 3 204 INT
>>>>>> 15:58:19.299539 e s tcp 207.251.97.202.54174 <?
>>>>>> > 152.2.2.149.smtp 26 22585 CON
>>>>>> 15:58:19.299551 e udp 90.196.179.159.5331 <- >
>>>>>> 152.23.124.201.9965 7 4216 CON
>>>>>> 15:58:19.299616 e udp 152.2.122.3.1179 - >
>>>>>> 118.170.8.216.64371 9 563 INT
>>>>>> 15:58:19.299618 e tcp 81.52.130.186.http <? >
>>>>>> 152.23.110.254.2068 10 9275 CON
>>>>>> 15:58:19.299634 e tcp 152.23.211.119.52335 <? >
>>>>>> 59.134.50.151.3830 3 1412 CON
>>>>>> 15:58:19.299663 e s tcp 76.182.114.156.2093 <?
>>>>>> > 152.2.210.22.rdp 6 558 CON
>>>>>> 15:58:19.299676 e tcp 66.249.93.109.imaps <?
>>>>>> > 152.23.88.57.1055 3 263 CON
>>>>>> 15:58:19.299677 e s tcp 74.125.1.81.http <?
>>>>>> > 152.2.176.13.3282 63 64848 CON
>>>>>> 15:58:19.299690 e tcp 152.2.62.113.here-l <?
>>>>>> > 204.10.66.76.http 12 7402 CON
>>>>>> 15:58:19.299690 e tcp 90.198.223.64.2684 ? >
>>>>>> 152.23.79.176.16328 2 3028 CON
>>>>>> 15:58:19.299739 e tcp 63.146.183.217.http <? >
>>>>>> 204.85.193.203.16244 43 37784 CON
>>>>>> 15:58:19.299778 e tcp 64.233.179.91.http ? >
>>>>>> 152.23.93.163.4708 2 2968 CON
>>>>>> 15:58:19.299804 e tcp 152.2.1.95.smtp <?
>>>>>> > 62.43.82.68.50100 6 2096 CON
>>>>>> 15:58:19.299823 e tcp 81.52.130.152.http ? >
>>>>>> 152.23.69.105.1232 1 60 CON
>>>>>> 15:58:19.299862 e tcp 195.241.129.86.14816 <?
>>>>>> > 152.2.32.11.53759 67 25519 CON
>>>>>> 15:58:19.299891 e tcp 81.154.227.107.51157 <? >
>>>>>> 152.23.126.198.12215 3 1834 CON
>>>>>> 15:58:19.299892 e d tcp 208.253.91.250.40767 <?
>>>>>> > 152.2.63.68.rtsp 17 7464 CON
>>>>>> 15:58:19.299893 e tcp 84.67.187.146.52319 <? >
>>>>>> 152.23.63.139.2077 5 3206 CON
>>>>>> 15:58:19.299893 e tcp 124.121.160.139.16800 <? >
>>>>>> 152.23.204.249.4173 2 1560 CON
>>>>>> 15:58:19.299928 e udp 152.2.255.242.10000 <-
>>>>>> > 76.182.85.17.10000 114 94428 CON
>>>>>> 15:58:19.299938 e tcp 152.23.69.88.3038 <? >
>>>>>> 206.33.52.123.http 68 68056 CON
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Quoting Carter Bullard <carter at qosient.com>:
>>>>>>
>>>>>>> One important question is whether the flows look reasonable.
>>>>>>> Full tcp connections, bidirectional etc...
>>>>>>> Carter
>>>>>>>
>>>>>>> On Jan 30, 2008, at 3:57 PM, Lei Wei wrote:
>>>>>>>
>>>>>>>> Argus didn't stop since the file is still growing and the
>>>>>>>> memory usage showing in top is about 493mb. Since you
>>>>>>>> mentioned that argus slows down when it's in debug mode,
>>>>>>>> does it mean that it might drop some packets when running?
>>>>>>>>
>>>>>>>> It's a little bit wierd because when I asked it to output the
>>>>>>>> debug info to the stdout, it outputs many just like you
>>>>>>>> did. But when I tried to redirect it to write into a file
>>>>>>>> using "argus -i dag0 -w data.out -D 2 > error 2>&1 ", it
>>>>>>>> only outputs several lines into the file:
>>>>>>>>
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.680343 ArgusNewModeler()
>>>>>>>> returning 0x816f000
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.686915 ArgusNewSource()
>>>>>>>> returning 0x81b2000
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.686963 ArgusNewOutput()
>>>>>>>> returning retn 0x81b1100
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.708239 setArgusID(0x816f000,
>>>>>>>> 0x9802899f) done
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.708311 setArgusPortNum(561) returning
>>>>>>>> argus[574]: 28 Jan 08 01:14:13.708512 setArgusInterfaceStatus(1)
>>>>>>>>
>>>>>>>> I guess maybe I should run it again and just let it output
>>>>>>>> the debug info on the screen and see what it'll happen.
>>>>>>>>
>>>>>>>> Lei
>>>>>>>>
>>>>>>>>
>>>>>>>> Quoting Peter Van Epp <vanepp at sfu.ca>:
>>>>>>>>
>>>>>>>>> On Wed, Jan 30, 2008 at 01:45:17PM -0500, Lei Wei wrote:
>>>>>>>>>> Hello Peter and Carter,
>>>>>>>>>>
>>>>>>>>>> I've been trying to repeat the crash problem again recently. I ran
>>>>>>>>>> argus with the debug option enabled from Sunday night until
>>>>>>>>>> now. It
>>>>>>>>>> never crashed. The only debug information it outputed was
>>>>>>>>>> several lines
>>>>>>>>>> when argus started to run and nothing more. The CPU usage
>>>>>>>>>> right now is
>>>>>>>>>> till about 99% but the output data file grows well and now
>>>>>>>>>> is about
>>>>>>>>>> 144G already. I'm not sure why I can't repeat the crash
>>>>>>>>>> again and there
>>>>>>>>>> shouldn't be too much difference in the amount of traffic
>>>>>>>>>> between this
>>>>>>>>>> and last week. By the way, the link I'm monitoring is the
>>>>>>>>>> campus border
>>>>>>>>>> link to the ISP and the traffic load is more than 1G per
>>>>>>>>>> second. I
>>>>>>>>>> wonder if there's any difference running in debug mode that could
>>>>>>>>>> possibly play a role in it... Anyway, I'll keep monitoring
>>>>>>>>>> it and see
>>>>>>>>>> what happens.
>>>>>>>>>>
>>>>>>>>>> Lei
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hmmm, thats too bad. Debug mode does slow argus down a bit, its
>>>>>>>>> possible that its slowing it down enough that the problem
>>>>>>>>> isn't happening.
>>>>>>>>> What does memory usage look like? I'd be suspicious that
>>>>>>>>> what is happening
>>>>>>>>> is that argus is running out of memory and stopping
>>>>>>>>> (although that should
>>>>>>>>> show up in syslog). That said I just noticed I'm not getting
>>>>>>>>> any syslog
>>>>>>>>> messages from argus (and I think I should be). I just
>>>>>>>>> restarted one of my
>>>>>>>>> test ones with -D2:
>>>>>>>>>
>>>>>>>>> # argus -J -P 560 -i eth2 -i eth3 -m -U700 -D2 -- host
>>>>>>>>> 142.58.101.100
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555408 ArgusNewModeler()
>>>>>>>>> returning 0x10181010
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555871 ArgusNewSource()
>>>>>>>>> returning 0x10201f70
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555903 ArgusNewOutput()
>>>>>>>>> returning retn 0x102016b0
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555985 setArgusPortNum(560)
>>>>>>>>> returning
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.556023 setArgusInterfaceStatus(1)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.585055 ArgusInitSource()
>>>>>>>>> pcap_open_live(eth3) returned 0x10252d60
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.585119
>>>>>>>>> ArgusOpenInterface(0x10201f70, 'eth3') returning
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615008 ArgusInitSource()
>>>>>>>>> pcap_open_live(eth2) returned 0x10253370
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615071
>>>>>>>>> ArgusOpenInterface(0x10201f70, 'eth2') returning
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615119
>>>>>>>>> ArgusCopyArgv(0xff8811f8) returning 0x10253980
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615298 ArgusInitSource() returning
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615357
>>>>>>>>> ArgusEstablishListen(560, (null), 0xff880a18)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615526
>>>>>>>>> ArgusEstablishListen(560, 0xff880a18) binding: any:560
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615607
>>>>>>>>> ArgusEstablishListen(560, 0xff880a18) returning 5
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615633 ArgusInitOutput() done
>>>>>>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.615670 started
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615711 ArgusInitModeler() done
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615974 setArgusInterfaceStatus(1)
>>>>>>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.616010
>>>>>>>>> ArgusGetInterfaceStatus: interface eth3 is up
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.616046 setArgusInterfaceStatus(1)
>>>>>>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.616102
>>>>>>>>> ArgusGetInterfaceStatus: interface eth2 is up
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.616133 setArgusInterfaceStatus(1)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.616159 setArgusInterfaceStatus(1)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824738 ArgusScheduleShutDown(2)
>>>>>>>>>
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824838 ArgusShutDown(Normal
>>>>>>>>> Shutdown)
>>>>>>>>>
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824856
>>>>>>>>> ArgusCloseSource(0x10201f70) starting
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824874
>>>>>>>>> ArgusCloseSource(0x10201f70) deleting source
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825600
>>>>>>>>> ArgusCloseModeler(0x10181010) pushing close record 0x1028f2b0
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825625
>>>>>>>>> ArgusCloseModeler(0x10181010)
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825642 ArgusCloseOutput()
>>>>>>>>> scheduling closure after writing records
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825698 ArgusOutputProcess()
>>>>>>>>> received stop record 0 records on the list
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825716
>>>>>>>>> ArgusCloseOutput(0x102016b0) done
>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825759 ArgusShutDown()
>>>>>>>>>
>>>>>>>>> # tail /var/log/messages
>>>>>>>>> ...
>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: succesfully allocated
>>>>>>>>> 0 KB [tot_mem=12664896][order=12]
>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: allocated 10851 slots
>>>>>>>>> [slot_len=1546][tot_mem=16777216]
>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: succesfully allocated
>>>>>>>>> 0 KB [tot_mem=12664896][order=12]
>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: allocated 10851 slots
>>>>>>>>> [slot_len=1546][tot_mem=16777216]
>>>>>>>>>
>>>>>>>>> and all that shows in syslog is the ring buffer starting as the
>>>>>>>>> interfaces come up. I think argus used to syslog the
>>>>>>>>> interfaces opening to
>>>>>>>>> syslog which may mean there is a bug in syslogging.
>>>>>>>>>
>>>>>>>>> Peter Van Epp / Operations and Technical Support
>>>>>>>>> Simon Fraser University, Burnaby, B.C. Canada
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>
More information about the argus
mailing list