[ARGUS] Argus flows to Kafka
Carter Bullard
carter at qosient.com
Wed Sep 11 11:53:21 EDT 2019
Hey Phil,
Yep, not sure why I coupled Kafka with ML so quickly, but that is the context I’ve seen Kafka in for a while, so I just tagged the buzzwords together. I went to the Kafka site and quickly realized that it’s just a message fabric, at least the way I think about things … There are a few ZeroMQ transports of argus data out there; GLORIAD did that for a while and they were quite happy, but these groups aren’t contributing the code back to the argus project :o( … I won’t be surprised if there is a Kafka producer and client for argus data somewhere in the universe ...
I am in favor of doing an argus source and client over Kafka, as well as ZeroMQ and whatever else the community thinks is reasonable. We’re going to have a major change in the web site by October (the design is already done; we just need to move it to Joomla on the new openargus.org <http://openargus.org/> site). I’ve scoped out 4 basic projects that we’re going to do in 2020, largely a result of the commercial Argus efforts, and Argus+Streaming is #3.
If you’re interested, let’s see if we can’t get something developed in a reasonable bit of time. Have radium.1 be a Kafka stream processor, since it’s our argus stream processor. So if you want to write to Kafka, use radium … if you want to read from Kafka, use radium, and then all the apps that want the data get it from radium.
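Until something native exists, a rough sketch of that producer side might look like the script below: ra(1) reads the binary stream from radium and a small wrapper publishes each record to a Kafka topic. The field list, topic name, broker address, and the third-party kafka-python package are all assumptions for illustration, not anything shipped with argus-clients.

```python
import csv
import io
import subprocess

# Fields to pull from the argus stream; names follow ra(1)'s -s option.
FIELDS = ["stime", "saddr", "daddr", "proto", "sbytes", "dbytes"]

def parse_ra_line(line, fields=FIELDS):
    """Parse one comma-separated ra(1) output line into a dict."""
    values = next(csv.reader(io.StringIO(line)))
    return dict(zip(fields, values))

def stream_to_kafka(argus_server="localhost:561", topic="argus-flows"):
    # Third-party package (assumption: kafka-python is one reasonable
    # client library; confluent-kafka would work just as well).
    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda d: json.dumps(d).encode("utf-8"),
    )
    # ra connects to radium's argus stream and prints CSV records.
    ra = subprocess.Popen(
        ["ra", "-S", argus_server, "-c", ",", "-s", ",".join(FIELDS)],
        stdout=subprocess.PIPE, text=True,
    )
    for line in ra.stdout:
        producer.send(topic, parse_ra_line(line.strip()))
```

The same shape works against a saved argus file rather than a live radium stream (swap `-S host:port` for `-r file`).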
Carter
> On Sep 11, 2019, at 11:37 AM, Phillip Deneault <deneaulp at bc.edu> wrote:
>
> Hi Carter,
>
> It's less about wanting something new, and more about optimizing how to get what we already have in there.
>
> Kafka is simply a message queue system. In principle, Argus or radium could directly send the logs to a Kafka 'topic' (maybe simply using the defined collector id or something) that would be buffered for as long as someone wanted their buffers to be. Then other consumers can come along and process the data. In my use case, it allows me to run local Kafka queues on my sensors, which produce events as fast as they can, while letting my consumers (who are far less tolerant to spikes in events) come along and pick up those logs as they can. Additionally, I'd like to avoid writing an XML log to disk, only to have to tail it back in (with everything that entails) and convert it into JSON for Elastic. If a tool can write directly to a Kafka buffer, in any sort of structured format, Kafka can do all the heavy lifting of managing the records from there and you can leave downstream processing to something else.
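For illustration, the consumer side described here (Kafka topic → JSON records → Elastic) might be sketched as follows. The kafka-python and official elasticsearch client packages, the topic and index names, and the broker/cluster addresses are all assumptions; any Kafka consumer plus a bulk indexer would do.

```python
import json

def to_bulk_actions(records, index="argus-flows"):
    """Wrap decoded flow records as Elasticsearch bulk-index actions."""
    for rec in records:
        yield {"_index": index, "_source": rec}

def consume_to_elastic(topic="argus-flows", index="argus-flows"):
    # Third-party packages (assumption: kafka-python and the official
    # elasticsearch client).
    from kafka import KafkaConsumer
    from elasticsearch import Elasticsearch, helpers

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
        auto_offset_reset="earliest",  # late consumers catch up at their own pace
    )
    es = Elasticsearch("http://localhost:9200")
    records = (msg.value for msg in consumer)  # blocks on the live topic
    for ok, item in helpers.streaming_bulk(es, to_bulk_actions(records, index)):
        if not ok:
            print("bulk index failure:", item)
```

Kafka holds the records for its configured retention period, so the consumer can fall behind during an event spike and drain the backlog later without the sensors ever blocking.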
>
> And I'm personally intending to use ELK, but there are tons of applications and processors out there that will happily consume Kafka queues. I know you have been hesitant in the past to support lots and lots of output formats and types, and I would be too, since it would take away from core flow development, but it would be great for those of us who like Argus and want to load data directly into other tools.
>
> Thanks,
> Phil
>
>
>
>
>
> On Tue, Sep 10, 2019 at 11:20 AM Carter Bullard <carter at qosient.com <mailto:carter at qosient.com>> wrote:
> Hey Phil,
>
> There are a few groups that have done specific ML strategies using streaming argus data, Oak Ridge National Labs has an operational system, Situ, and a number of commercial entities have used argus as a part of their ML offerings, but I’m not sure if they are using Kafka … TensorFlow has always been the most common buzz word with these groups … since TensorFlow and Kafka are a common pair of terms, I think some of these companies are probably doing Kafka streaming and Argus data, but not sure that anyone will tell you ...
>
> Is there something that Kafka would need from radium, or the argus-clients that isn’t already there .... Is there a specific thing that Kafka wants in its streaming pipeline ??
>
> Carter
>
> > On Sep 10, 2019, at 8:08 AM, Phillip Deneault <deneaulp at bc.edu <mailto:deneaulp at bc.edu>> wrote:
> >
> > Is there an ra* tool, or is anyone aware of a 3rd party tool, that can process argus output directly into Kafka? Final stop would be an ELK database, but using Kafka would be a better middle ground from a performance and maintenance point of view.
> >
> > Thanks,
> > Phil
> >
> >
> > _______________________________________________
> > argus mailing list
> > argus at qosient.com <mailto:argus at qosient.com>
> > https://pairlist1.pair.net/mailman/listinfo/argus <https://pairlist1.pair.net/mailman/listinfo/argus>
>
>
>