Argus on Bivio 7500
Carter Bullard
carter at qosient.com
Mon Aug 24 09:26:55 EDT 2009
Hey Eric,
I like that notion that this problem is not argus related, but if
there is
anything you would like changed to support the effort, give a holler.
Do you have any notion of load on the Bivio when its losing?
Do you have 2/4 processors?
A lot of people are using snort and bro together, it would seem
likely that they would be having the same problems?
Carter
On Aug 21, 2009, at 1:08 PM, Eric Gustafson wrote:
> Hey everyone,
> We've been doing some further experiments with Argus on our Bivios
> in our environment, and have noticed some partial flows, and
> magically disappearing packets when compared to our currently-in-
> production Dell-based sensor.
> We figured out, however, that the problem is not with Argus or Snort
> (the other app running on our Bivios) specifically, but only happens
> when we combine the two.
> We were able to track this down by running full packet captures on a
> box inside our network, a box outside our network, and both sensors,
> while making some transmissions of known size and content.
> A huge percentage of the packets captured by tethereal on the Bivio
> would be missing when both Snort and Argus were running at once,
> with no noticeable pattern. Sounds to me like some sort of race
> condition.
>
> Is anyone else here running Argus with another app on your Bivio,
> and if so, how do you have it configured? Also, which NIMs do you
> have and how much traffic are you processing?
>
> FTR, we're using ours in a monitoring configuration, with a mirror
> port on our main router configured to send a copy of all traffic to
> our 10G NIM. We're seeing not too far over 1.5G of traffic on
> average, but this is growing slowly, and large spikes happen often.
> Our CIG has all traffic load-balanced across all four APC cores. (2-
> tuple vs 5-tuple load balancing seems to make no difference here)
>
> TIA,
> Eric
>
> On Tue, Aug 11, 2009 at 3:38 PM, Peter Van Epp <vanepp at sfu.ca> wrote:
> On Tue, Aug 11, 2009 at 10:36:03AM -0400, Carter Bullard wrote:
> > Hey Peter,
> > Its possible the routine ArgusGetInterfaceStatus(), called from
> > ArgusGetPackets(), around
> > line 2044, which does some interface housekeeping, is broken on the
> > Bivio and maybe closing
> > the pcap interfaces?
> >
> > Carter
> >
>
> Yes that looks like a good possibility. While looking back at
> the
> messages I came across this trace that Jason did. It does look like
> the
> interface may have gotten closed (silently!) at around line 2500. If
> the
> ioctl fails the interface will be quietly closed which would cause the
> problem we are seeing later in the trace (no open interface as found
> never
> gets incremented). I expect Jason needs to trace in to the call at
> line 2019
> and we would probably be wise to add an ARGUSDEBUG statement or two to
> ArgusGetInterfaceStatus such as:
>
> line 2476:
>
> if (device == NULL) {
>
> #ifdef ARGUSDEBUG
> ArgusDebug (4,"ArgusGetInterfaceStatus: device NULL returning\n";)
> #endif
> return;
> }
>
> line 2503:
>
> src->ArgusInterface[i].ArgusPD = NULL;
> #ifdef ARGUSDEBUG
> ArgusDebug (4,"ArgusGetInterfaceStatus: interface %d closed,
> returning\n",i;)
> #endif
>
> I suspect we will find the last one is what is tripping us up
> this
> time. That ioctl is reacting badly on the Bivio perhaps. It would
> also be
> worthwhile doing a tail on /var/log/messages (assuming this is
> Linux) as a
> number of errors syslog but don't ARGUSDEBUG log and one of them may
> provide
> the necessary clue.
>
>
> On Mon, Aug 10, 2009 at 11:17:36AM -0400, Jason Carr wrote:
> > I've attached the gdb output and the configure log. It looks like
> > pcap_get_selectable_fd is not available.
> >
> > The gdb log doesn't seem indicate that it even touched
> pcap_dispatch.
> >
> <snip>
>
> > Starting program: /usr/local/sbin/argus -U 128 -i default -P 563 -
> e 1 -B 0.0.0.0 -D 999
> > Reading symbols from shared object read from target memory...done.
> > Loaded system supplied DSO at 0x100000
> <snip>
>
> As before it seems to open default successfully:
>
> > argus[5858]: 10 Aug 09 11:13:11.782693 ArgusOpenInterface()
> pcap_open_live(default) returned 0x1012f590
> > argus[5858]: 10 Aug 09 11:13:11.782790
> Arguslookup_pcap_callback(1) returning 0x1000e940
> > argus[5858]: 10 Aug 09 11:13:11.782835
> ArgusOpenInterface(0x30070008, 'default') returning
> <snip>
>
> And we start to try and read packets from the interface but
> nowhere
> does found get incremented which will result in the break at line 2239
> to execute and ezit exactly as we are seeing. This in turn could
> well be
> because the pcap interface was quietly closed earlier.
>
> >
> > Breakpoint 1, ArgusGetPackets (src=0x30070008) at ArgusSource.c:2000
> > 2000 if (src != NULL) {
> > (gdb) next
> > 2002 ArgusDebug (4, "ArgusGetPackets (0x%x) starting\n",
> src);
> > (gdb)
> > 2022 for (i = 0; i < src->ArgusInterfaces; i++) {
> > (gdb)
> > 2002 ArgusDebug (4, "ArgusGetPackets (0x%x) starting\n",
> src);
> > (gdb)
> > argus[5858]: 10 Aug 09 11:13:15.638386 ArgusGetPackets
> (0x30070008) starting
> > 2006 fds[i] = -1;
> > (gdb)
> > 2011 signal (SIGPIPE, SIG_IGN);
> > (gdb)
> > 2006 fds[i] = -1;
> > (gdb)
> > 2011 signal (SIGPIPE, SIG_IGN);
> > (gdb)
> > 0x1004fe20 in call___do_global_ctors_aux ()
> > (gdb)
> > Single stepping until exit from function call___do_global_ctors_aux,
> >
> > which has no line number information.
> > 0x0fd6c8d0 in ssignal () from /lib/libc.so.6
> > (gdb)
> > Single stepping until exit from function ssignal,
> > which has no line number information.
> > ArgusGetPackets (src=0x30070008) at ArgusSource.c:2019
> > 2019 ArgusGetInterfaceStatus(src);
> > (gdb)
> > argus[5858]: 10 Aug 09 11:13:17.669976 ArgusPopFrontList
> (0x1012dd98) returning
> > argus[5858]: 10 Aug 09 11:13:17.670051 ArgusPushFrontList
> (0x1012dd68, 0x1012dd98, 1) returning 0xd032
> > 2020 gettimeofday (&src->ArgusStartTime, 0L);
> > (gdb)
> > 0x10050340 in call___do_global_ctors_aux ()
> > (gdb)
> > Single stepping until exit from function call___do_global_ctors_aux,
> > which has no line number information.
> > 0x0fdd0c20 in gettimeofday () from /lib/libc.so.6
> > (gdb)
> > Single stepping until exit from function gettimeofday,
> > which has no line number information.
> > ArgusGetPackets (src=0x30070008) at ArgusSource.c:2022
> > 2022 for (i = 0; i < src->ArgusInterfaces; i++) {
> > (gdb)
> > 2023 if (src->ArgusInterface[i].ArgusPd && (fd =
> pcap_fileno(src->ArgusInterface[i].ArgusPd) >= 0)) {
> > (gdb)
> > 307 src->ArgusInterfaceStatus = value;
> > (gdb)
> > 310 ArgusDebug (1, "setArgusInterfaceStatus(%d)\n", value);
> > (gdb)
> > 2022 for (i = 0; i < src->ArgusInterfaces; i++) {
> > (gdb)
> > 310 ArgusDebug (1, "setArgusInterfaceStatus(%d)\n", value);
> > (gdb)
> > argus[5858]: 10 Aug 09 11:13:40.771988 setArgusInterfaceStatus(0)
> > 2022 for (i = 0; i < src->ArgusInterfaces; i++) {
> > (gdb)
> > 2042 if (!(src->ArgusReadingOffLine)) {
> > (gdb)
> > 2067 for (i = 0; i < src->ArgusInterfaces; i++) {
> > (gdb)
> > 2252 for (i = 0; i < src->ArgusInterfaces; i++) {
> > (gdb)
> > 2067 for (i = 0; i < src->ArgusInterfaces; i++) {
> > (gdb)
> > 2069 src->ArgusThisIndex = i;
> > (gdb)
> > 2070 if ((fd = fds[i]) != -1) {
> > (gdb)
> > 2067 for (i = 0; i < src->ArgusInterfaces; i++) {
> > (gdb)
> > 2093 if (pkts == 0) {
> > (gdb)
> > 2094 struct timespec tsbuf = {0, 25000000},
> *ts = &tsbuf;
> > (gdb)
> > 2095 gettimeofday (&src->ArgusModel-
> >ArgusGlobalTime, NULL);
> > (gdb)
> > 2094 struct timespec tsbuf = {0, 25000000},
> *ts = &tsbuf;
> > (gdb)
> > 2095 gettimeofday (&src->ArgusModel-
> >ArgusGlobalTime, NULL);
> > (gdb)
> > 2094 struct timespec tsbuf = {0, 25000000},
> *ts = &tsbuf;
> > (gdb)
> > 2095 gettimeofday (&src->ArgusModel-
> >ArgusGlobalTime, NULL);
> > (gdb)
> > 0x10050340 in call___do_global_ctors_aux ()
> > (gdb)
> > Single stepping until exit from function call___do_global_ctors_aux,
> > which has no line number information.
> > 0x0fdd0c20 in gettimeofday () from /lib/libc.so.6
> > (gdb)
> > Single stepping until exit from function gettimeofday,
> > which has no line number information.
> > ArgusGetPackets (src=0x30070008) at ArgusSource.c:2096
> > 2096 nanosleep(ts, NULL);
> > (gdb)
> > 0x1004fd50 in call___do_global_ctors_aux ()
> > (gdb)
> > Single stepping until exit from function call___do_global_ctors_aux,
> > which has no line number information.
> > 0x30014890 in _dl_runtime_resolve () from /lib/ld.so.1
> > (gdb)
> > Single stepping until exit from function _dl_runtime_resolve,
> > which has no line number information.
> > 0x0fde0c7c in nanosleep () from /lib/libc.so.6
> > (gdb)
> > Single stepping until exit from function nanosleep,
> > which has no line number information.
> > 0x0fde0c88 in __nanosleep_nocancel () from /lib/libc.so.6
> > (gdb)
> > Single stepping until exit from function __nanosleep_nocancel,
> > which has no line number information.
> > ArgusGetPackets (src=0x30070008) at ArgusSource.c:2238
> > 2238 if (!found)
> > (gdb)
> > 2259 setArgusFarReportInterval (ArgusModel, "0");
> > (gdb)
> > 2260 ArgusQueueManager(ArgusModel);
> > (gdb)
> > argus[5858]: 10 Aug 09 11:13:53.460484
> ArgusProcessQueueTimeout(0x1012d008, 0x1012d3e8) done
> > argus[5858]: 10 Aug 09 11:13:53.460593 ArgusQueueManager() turns
> 1 statusQueue 0 qs 0 items 0 cache 0 resort 0
> reclaim 0 new 0 sends 0 bsends 0
> > 2262 ArgusOutputProcess(ArgusOutputTask);
> > (gdb)
> > argus[5858]: 10 Aug 09 11:13:59.161973
> ArgusOutputProcess(0x1012d9f0) starting
> > argus[5858]: 10 Aug 09 11:13:59.162110
> ArgusOutputStatusTime(0x1012d9f0) done
> > 2266 gettimeofday (&src->ArgusEndTime, 0L);
> > (gdb)
> > 0x10050340 in call___do_global_ctors_aux ()
> > (gdb)
> > Single stepping until exit from function call___do_global_ctors_aux,
> > which has no line number information.
> > 0x0fdd0c20 in gettimeofday () from /lib/libc.so.6
> > (gdb)
> > Single stepping until exit from function gettimeofday,
> > which has no line number information.
> > ArgusGetPackets (src=0x30070008) at ArgusSource.c:2269
> > 2269 ArgusDebug (4, "ArgusGetPackets () returning\n");
> > (gdb)
> > argus[5858]: 10 Aug 09 11:14:02.775186 ArgusGetPackets () returning
> > 2271 }
> > (gdb)
> > main (argc=13, argv=0x7fab6aa4) at argus.c:558
> > 558 ArgusDebug (1, "main() ArgusGetPackets returned:
> shuting down");
>
> Peter Van Epp
>
Carter Bullard
CEO/President
QoSient, LLC
150 E 57th Street Suite 12D
New York, New York 10022
+1 212 588-9133 Phone
+1 212 588-9134 Fax
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20090824/e918579d/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 3815 bytes
Desc: not available
URL: <https://pairlist1.pair.net/pipermail/argus/attachments/20090824/e918579d/attachment.bin>
More information about the argus
mailing list