argus crash problem
Carter Bullard
carter at qosient.com
Fri Feb 1 15:37:49 EST 2008
Hey Lei,
So I made a change to the GRE header parsing routine that may
help a little bit. It may die a bit faster, so no guarantees, but if
you
could download argus-3.0.0.tar.gz again and give it a try that would
be helpful.
Carter
On Feb 1, 2008, at 2:57 PM, Lei Wei wrote:
> Hi Carter,
>
> I guess you might want me to print out the instruction pointer? If
> so, the result is:
>
> (gdb) x/12x $eip
>
> 0x804e0a1 <ArgusProcessGreHdr+313>: 0x8900b60f
> 0x08e2c1c2 0x40f0458b 0x00b60f66
> 0x804e0b1 <ArgusProcessGreHdr+329>: 0x8966d009
> 0x458bec45 0x02c083f0 0x4588008a
> 0x804e0c1 <ArgusProcessGreHdr+345>: 0xf0458beb
> 0x8a03c083 0xea458800 0x83f0458d
>
> Lei
>
>
>
> Quoting Carter Bullard <carter at qosient.com>:
>
>> Well, all the routines are saying that the packet data the dag
>> card is providing is "out of bounds".
>> Do this:
>>
>> (gdb) x/12x ip
>>
>> And see what that does.
>>
>> Carter
>>
>>
>> On Feb 1, 2008, at 12:01 PM, Lei Wei wrote:
>>
>>> Hi Carter,
>>>
>>> Here's the result:
>>>
>>> (gdb) where
>>> #0 0x0804e0a1 in ArgusProcessGreHdr (model=0x816f000,
>>> ip=0x2ccef828, length=41) at ArgusModeler.c:694
>>> #1 0x0804dee4 in ArgusProcessPacketHdrs (model=0x816f000,
>>> p=0x2ccef828 <Address 0x2ccef828 out of bounds>, length=61,
>>> type=2048)
>>> at ArgusModeler.c:606
>>> #2 0x0804f041 in ArgusProcessPacket (src=0x81b2000,
>>> p=0x2ccef81a <Address 0x2ccef81a out of bounds>, length=75,
>>> tvp=0x82073d0, type=2048) at ArgusModeler.c:1220
>>> #3 0x08056622 in ArgusEtherPacket (user=0x81b2000 "",
>>> h=0x82073d0, p=0x2ccef81a <Address 0x2ccef81a out of bounds>)
>>> at ArgusSource.c:700
>>> #4 0x08058be5 in ArgusGetPackets (src=0x81b2000) at
>>> ArgusSource.c: 1834
>>> #5 0x0804b7b9 in main (argc=5, argv=0xbfbfec68) at argus.c:567
>>>
>>> (gdb) print bp
>>> $1 = 0x2f1f5012 <Address 0x2f1f5012 out of bounds>
>>> (gdb) print grelen
>>> $2 = 35290742
>>> (gdb) print flags
>>> $3 = 25872
>>> (gdb) print model->ArgusSnapLength
>>> $4 = 36
>>> (gdb) print model->ArgusThisLength
>>> $5 = 41
>>>
>>> Please let me know anything else I could help with.
>>>
>>> Thanks.
>>> Lei
>>>
>>>
>>>
>>>
>>> Quoting Carter Bullard <carter at qosient.com>:
>>>
>>>> Hey Lei,
>>>> So, type this in gdb()>
>>>>
>>>> (gdb) where
>>>> (gdb) print bp
>>>> (gdb) print grelen
>>>> (gdb) print flags
>>>> (gdb) print model->ArgusSnapLength
>>>> (gdb) print model->ArgusThisLength
>>>>
>>>> That will help a lot.
>>>> Carter
>>>>
>>>>
>>>> On Jan 31, 2008, at 11:48 PM, Lei Wei wrote:
>>>>
>>>>> Hi Carter,
>>>>>
>>>>> Here's after I run gdb on the core dump file:
>>>>>
>>>>> GNU gdb 6.1.1 [FreeBSD]
>>>>> Copyright 2004 Free Software Foundation, Inc.
>>>>> GDB is free software, covered by the GNU General Public
>>>>> License, and you are
>>>>> welcome to change it and/or distribute copies of it under
>>>>> certain conditions.
>>>>> Type "show copying" to see the conditions.
>>>>> There is absolutely no warranty for GDB. Type "show warranty"
>>>>> for details.
>>>>> This GDB was configured as "i386-marcel-freebsd"...
>>>>> Core was generated by `argus'.
>>>>> Program terminated with signal 11, Segmentation fault.
>>>>> Reading symbols from /usr/lib/libwrap.so.3...done.
>>>>> Loaded symbols for /usr/lib/libwrap.so.3
>>>>> Reading symbols from /lib/libm.so.3...done.
>>>>> Loaded symbols for /lib/libm.so.3
>>>>> Reading symbols from /lib/libc.so.5...done.
>>>>> Loaded symbols for /lib/libc.so.5
>>>>> Reading symbols from /libexec/ld-elf.so.1...done.
>>>>> Loaded symbols for /libexec/ld-elf.so.1
>>>>> #0 0x0804e0a1 in ArgusProcessGreHdr (model=0x816f000,
>>>>> ip=0x2ccef828, length=41) at ArgusModeler.c:694
>>>>> 694 af = EXTRACT_16BITS(bp);
>>>>> (gdb)
>>>>>
>>>>>
>>>>> Thanks.
>>>>> Lei
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Quoting Carter Bullard <carter at qosient.com>:
>>>>>
>>>>>> If you compiled argus with the correct options, you can
>>>>>> run gdb() against the binary and the core file and it will
>>>>>> tell you what the problem was.
>>>>>>
>>>>>> What happens when you run:
>>>>>> gdb argus argus.core
>>>>>>
>>>>>> ?
>>>>>>
>>>>>> Carter
>>>>>>
>>>>>>
>>>>>> On Jan 31, 2008, at 7:47 PM, Lei Wei wrote:
>>>>>>
>>>>>>> Hey Carter,
>>>>>>>
>>>>>>> I ran argus again without the -D option and it finally core
>>>>>>> dumped. The argus.core is about 350mb. Any comments on what
>>>>>>> we could do about this?
>>>>>>>
>>>>>>> Thanks.
>>>>>>> Lei
>>>>>>>
>>>>>>>
>>>>>>> Quoting Carter Bullard <carter at qosient.com>:
>>>>>>>
>>>>>>>> Hey Lei,
>>>>>>>> You should print both spkts and dpkts, so you can see if
>>>>>>>> you're getting full-duplex
>>>>>>>> flow monitoring. Running with debug information is probably
>>>>>>>> not going to
>>>>>>>> shed any light on the problem you ran into. Run without
>>>>>>>> the option to see
>>>>>>>> how argus performs on your system. Depending on the type of
>>>>>>>> dag card, you
>>>>>>>> could saturate the BUS around 2Gbps. If you have the the
>>>>>>>> PCI express cards,
>>>>>>>> you still may not get 10Gbps, depending on the motherboard
>>>>>>>> you are using.
>>>>>>>>
>>>>>>>> Just run without the "-D" option for a while to see if you
>>>>>>>> get the bug again.
>>>>>>>> Also try modifying the flow model using the /etc/argus.conf
>>>>>>>> file to see if
>>>>>>>> you can get the pcu utilization down.
>>>>>>>>
>>>>>>>> Carter
>>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 30, 2008, at 5:25 PM, Lei Wei wrote:
>>>>>>>>
>>>>>>>>> You are right. I took a look at the flow. I could see some
>>>>>>>>> full tcp connections,i.e.from REQ,CON to FIN. But it
>>>>>>>>> seems that many connections have unknown directions like
>>>>>>>>> below. Is that normal?
>>>>>>>>>
>>>>>>>>> 15:58:19.299491 e tcp
>>>>>>>>> 208.107.16.125.2740 <? >
>>>>>>>>> 152.23.105.115.6881 2 121 CON
>>>>>>>>> 15:58:19.299491 e udp 128.109.230.21.10002
>>>>>>>>> - > 233.4.200.18.10002 2 136 INT
>>>>>>>>> 15:58:19.299511 e udp 155.101.127.30.10002
>>>>>>>>> - > 233.4.200.18.10002 3 204 INT
>>>>>>>>> 15:58:19.299539 e s tcp 207.251.97.202.54174
>>>>>>>>> <? > 152.2.2.149.smtp 26 22585 CON
>>>>>>>>> 15:58:19.299551 e udp 90.196.179.159.5331
>>>>>>>>> <- > 152.23.124.201.9965 7 4216 CON
>>>>>>>>> 15:58:19.299616 e udp 152.2.122.3.1179
>>>>>>>>> - > 118.170.8.216.64371 9 563 INT
>>>>>>>>> 15:58:19.299618 e tcp 81.52.130.186.http
>>>>>>>>> <? > 152.23.110.254.2068 10 9275 CON
>>>>>>>>> 15:58:19.299634 e tcp 152.23.211.119.52335
>>>>>>>>> <? > 59.134.50.151.3830 3 1412 CON
>>>>>>>>> 15:58:19.299663 e s tcp 76.182.114.156.2093
>>>>>>>>> <? > 152.2.210.22.rdp 6 558 CON
>>>>>>>>> 15:58:19.299676 e tcp 66.249.93.109.imaps
>>>>>>>>> <? > 152.23.88.57.1055 3 263 CON
>>>>>>>>> 15:58:19.299677 e s tcp 74.125.1.81.http
>>>>>>>>> <? > 152.2.176.13.3282 63 64848 CON
>>>>>>>>> 15:58:19.299690 e tcp 152.2.62.113.here-l
>>>>>>>>> <? > 204.10.66.76.http 12 7402 CON
>>>>>>>>> 15:58:19.299690 e tcp
>>>>>>>>> 90.198.223.64.2684 ? >
>>>>>>>>> 152.23.79.176.16328 2 3028 CON
>>>>>>>>> 15:58:19.299739 e tcp 63.146.183.217.http
>>>>>>>>> <? > 204.85.193.203.16244 43 37784 CON
>>>>>>>>> 15:58:19.299778 e tcp
>>>>>>>>> 64.233.179.91.http ? >
>>>>>>>>> 152.23.93.163.4708 2 2968 CON
>>>>>>>>> 15:58:19.299804 e tcp 152.2.1.95.smtp
>>>>>>>>> <? > 62.43.82.68.50100 6 2096 CON
>>>>>>>>> 15:58:19.299823 e tcp
>>>>>>>>> 81.52.130.152.http ? >
>>>>>>>>> 152.23.69.105.1232 1 60 CON
>>>>>>>>> 15:58:19.299862 e tcp 195.241.129.86.14816
>>>>>>>>> <? > 152.2.32.11.53759 67 25519 CON
>>>>>>>>> 15:58:19.299891 e tcp 81.154.227.107.51157
>>>>>>>>> <? > 152.23.126.198.12215 3 1834 CON
>>>>>>>>> 15:58:19.299892 e d tcp 208.253.91.250.40767
>>>>>>>>> <? > 152.2.63.68.rtsp 17 7464 CON
>>>>>>>>> 15:58:19.299893 e tcp 84.67.187.146.52319
>>>>>>>>> <? > 152.23.63.139.2077 5 3206 CON
>>>>>>>>> 15:58:19.299893 e tcp 124.121.160.139.16800
>>>>>>>>> <? > 152.23.204.249.4173 2 1560 CON
>>>>>>>>> 15:58:19.299928 e udp 152.2.255.242.10000
>>>>>>>>> <- > 76.182.85.17.10000 114 94428 CON
>>>>>>>>> 15:58:19.299938 e tcp 152.23.69.88.3038
>>>>>>>>> <? > 206.33.52.123.http 68 68056 CON
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Quoting Carter Bullard <carter at qosient.com>:
>>>>>>>>>
>>>>>>>>>> One important question is whether the flows look reasonable.
>>>>>>>>>> Full tcp connections, bidirectional etc...
>>>>>>>>>> Carter
>>>>>>>>>>
>>>>>>>>>> On Jan 30, 2008, at 3:57 PM, Lei Wei wrote:
>>>>>>>>>>
>>>>>>>>>>> Argus didn't stop since the file is still growing and
>>>>>>>>>>> the memory usage showing in top is about 493mb. Since
>>>>>>>>>>> you mentioned that argus slows down when it's in
>>>>>>>>>>> debug mode, does it mean that it might drop some
>>>>>>>>>>> packets when running?
>>>>>>>>>>>
>>>>>>>>>>> It's a little bit wierd because when I asked it to
>>>>>>>>>>> output the debug info to the stdout, it outputs many
>>>>>>>>>>> just like you did. But when I tried to redirect it
>>>>>>>>>>> to write into a file using "argus -i dag0 -w data.out
>>>>>>>>>>> -D 2 > error 2>&1 ", it only outputs several lines
>>>>>>>>>>> into the file:
>>>>>>>>>>>
>>>>>>>>>>> argus[574]: 28 Jan 08 01:14:13.680343 ArgusNewModeler()
>>>>>>>>>>> returning 0x816f000
>>>>>>>>>>> argus[574]: 28 Jan 08 01:14:13.686915 ArgusNewSource()
>>>>>>>>>>> returning 0x81b2000
>>>>>>>>>>> argus[574]: 28 Jan 08 01:14:13.686963 ArgusNewOutput()
>>>>>>>>>>> returning retn 0x81b1100
>>>>>>>>>>> argus[574]: 28 Jan 08 01:14:13.708239
>>>>>>>>>>> setArgusID(0x816f000, 0x9802899f) done
>>>>>>>>>>> argus[574]: 28 Jan 08 01:14:13.708311
>>>>>>>>>>> setArgusPortNum(561) returning
>>>>>>>>>>> argus[574]: 28 Jan 08 01:14:13.708512
>>>>>>>>>>> setArgusInterfaceStatus(1)
>>>>>>>>>>>
>>>>>>>>>>> I guess maybe I should run it again and just let it
>>>>>>>>>>> output the debug info on the screen and see what
>>>>>>>>>>> it'll happen.
>>>>>>>>>>>
>>>>>>>>>>> Lei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Quoting Peter Van Epp <vanepp at sfu.ca>:
>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jan 30, 2008 at 01:45:17PM -0500, Lei Wei wrote:
>>>>>>>>>>>>> Hello Peter and Carter,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've been trying to repeat the crash problem again
>>>>>>>>>>>>> recently. I ran
>>>>>>>>>>>>> argus with the debug option enabled from Sunday night
>>>>>>>>>>>>> until now. It
>>>>>>>>>>>>> never crashed. The only debug information it outputed
>>>>>>>>>>>>> was several lines
>>>>>>>>>>>>> when argus started to run and nothing more. The CPU
>>>>>>>>>>>>> usage right now is
>>>>>>>>>>>>> till about 99% but the output data file grows well and
>>>>>>>>>>>>> now is about
>>>>>>>>>>>>> 144G already. I'm not sure why I can't repeat the
>>>>>>>>>>>>> crash again and there
>>>>>>>>>>>>> shouldn't be too much difference in the amount of
>>>>>>>>>>>>> traffic between this
>>>>>>>>>>>>> and last week. By the way, the link I'm monitoring is
>>>>>>>>>>>>> the campus border
>>>>>>>>>>>>> link to the ISP and the traffic load is more than 1G
>>>>>>>>>>>>> per second. I
>>>>>>>>>>>>> wonder if there's any difference running in debug mode
>>>>>>>>>>>>> that could
>>>>>>>>>>>>> possibly play a role in it... Anyway, I'll keep
>>>>>>>>>>>>> monitoring it and see
>>>>>>>>>>>>> what happens.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Lei
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hmmm, thats too bad. Debug mode does slow argus down a
>>>>>>>>>>>> bit, its
>>>>>>>>>>>> possible that its slowing it down enough that the
>>>>>>>>>>>> problem isn't happening.
>>>>>>>>>>>> What does memory usage look like? I'd be suspicious
>>>>>>>>>>>> that what is happening
>>>>>>>>>>>> is that argus is running out of memory and stopping
>>>>>>>>>>>> (although that should
>>>>>>>>>>>> show up in syslog). That said I just noticed I'm not
>>>>>>>>>>>> getting any syslog
>>>>>>>>>>>> messages from argus (and I think I should be). I just
>>>>>>>>>>>> restarted one of my
>>>>>>>>>>>> test ones with -D2:
>>>>>>>>>>>>
>>>>>>>>>>>> # argus -J -P 560 -i eth2 -i eth3 -m -U700 -D2 --
>>>>>>>>>>>> host 142.58.101.100
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555408
>>>>>>>>>>>> ArgusNewModeler() returning 0x10181010
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555871
>>>>>>>>>>>> ArgusNewSource() returning 0x10201f70
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555903
>>>>>>>>>>>> ArgusNewOutput() returning retn 0x102016b0
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.555985
>>>>>>>>>>>> setArgusPortNum(560) returning
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.556023
>>>>>>>>>>>> setArgusInterfaceStatus(1)
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.585055
>>>>>>>>>>>> ArgusInitSource() pcap_open_live(eth3) returned
>>>>>>>>>>>> 0x10252d60
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.585119
>>>>>>>>>>>> ArgusOpenInterface(0x10201f70, 'eth3') returning
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615008
>>>>>>>>>>>> ArgusInitSource() pcap_open_live(eth2) returned
>>>>>>>>>>>> 0x10253370
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615071
>>>>>>>>>>>> ArgusOpenInterface(0x10201f70, 'eth2') returning
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615119
>>>>>>>>>>>> ArgusCopyArgv(0xff8811f8) returning 0x10253980
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615298
>>>>>>>>>>>> ArgusInitSource() returning
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615357
>>>>>>>>>>>> ArgusEstablishListen(560, (null), 0xff880a18)
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615526
>>>>>>>>>>>> ArgusEstablishListen(560, 0xff880a18) binding: any:560
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615607
>>>>>>>>>>>> ArgusEstablishListen(560, 0xff880a18) returning 5
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615633 ArgusInitOutput()
>>>>>>>>>>>> done
>>>>>>>>>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.615670
>>>>>>>>>>>> started
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615711
>>>>>>>>>>>> ArgusInitModeler() done
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.615974
>>>>>>>>>>>> setArgusInterfaceStatus(1)
>>>>>>>>>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.616010
>>>>>>>>>>>> ArgusGetInterfaceStatus: interface eth3 is up
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.616046
>>>>>>>>>>>> setArgusInterfaceStatus(1)
>>>>>>>>>>>> ArgusWarning: argus[4637]: 30 Jan 08 12:32:05.616102
>>>>>>>>>>>> ArgusGetInterfaceStatus: interface eth2 is up
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.616133
>>>>>>>>>>>> setArgusInterfaceStatus(1)
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:05.616159
>>>>>>>>>>>> setArgusInterfaceStatus(1)
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824738
>>>>>>>>>>>> ArgusScheduleShutDown(2)
>>>>>>>>>>>>
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824838
>>>>>>>>>>>> ArgusShutDown(Normal Shutdown)
>>>>>>>>>>>>
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824856
>>>>>>>>>>>> ArgusCloseSource(0x10201f70) starting
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.824874
>>>>>>>>>>>> ArgusCloseSource(0x10201f70) deleting source
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825600
>>>>>>>>>>>> ArgusCloseModeler(0x10181010) pushing close record
>>>>>>>>>>>> 0x1028f2b0
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825625
>>>>>>>>>>>> ArgusCloseModeler(0x10181010)
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825642
>>>>>>>>>>>> ArgusCloseOutput() scheduling closure after writing
>>>>>>>>>>>> records
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825698
>>>>>>>>>>>> ArgusOutputProcess() received stop record 0 records
>>>>>>>>>>>> on the list
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825716
>>>>>>>>>>>> ArgusCloseOutput(0x102016b0) done
>>>>>>>>>>>> argus[4637]: 30 Jan 08 12:32:59.825759 ArgusShutDown()
>>>>>>>>>>>>
>>>>>>>>>>>> # tail /var/log/messages
>>>>>>>>>>>> ...
>>>>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: succesfully
>>>>>>>>>>>> allocated 0 KB [tot_mem=12664896][order=12]
>>>>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: allocated 10851
>>>>>>>>>>>> slots [slot_len=1546][tot_mem=16777216]
>>>>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: succesfully
>>>>>>>>>>>> allocated 0 KB [tot_mem=12664896][order=12]
>>>>>>>>>>>> Jan 30 12:32:05 sniffer1 kernel: RING: allocated 10851
>>>>>>>>>>>> slots [slot_len=1546][tot_mem=16777216]
>>>>>>>>>>>>
>>>>>>>>>>>> and all that shows in syslog is the ring buffer
>>>>>>>>>>>> starting as the
>>>>>>>>>>>> interfaces come up. I think argus used to syslog the
>>>>>>>>>>>> interfaces opening to
>>>>>>>>>>>> syslog which may mean there is a bug in syslogging.
>>>>>>>>>>>>
>>>>>>>>>>>> Peter Van Epp / Operations and Technical Support
>>>>>>>>>>>> Simon Fraser University, Burnaby, B.C. Canada
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>
More information about the argus
mailing list