organizing large datasets that reference lookup info from outside sources

Carter Bullard carter at qosient.com
Fri Oct 20 13:05:11 EDT 2017


Hey Mike,
Much of what you are interested in is supported here and there by the open source tools, and it is also at the core of the commercial argus products that we have been developing. On the open source side, check out racluster.1 and rasqlinsert.1: they are the primary examples of how to convert streaming flow data into flow-based summaries that populate database schemas, which you can use to track your individual ethernet addresses and IP addresses, as well as the matrix data of who is talking to whom. The "Using Argus" section of the argus web site describes many of the parts needed to approach your problem set.
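
As a quick sketch (the archive paths and output file names here are just placeholders), collapsing a day of flow records into a per-IP summary and a who-talked-to-whom matrix looks something like:

   # one output record per source IP address seen during the day
   racluster -r /argus/archive/2017.10.20.* -m saddr -w /tmp/daily.saddr.out

   # one output record per talker pair (the matrix data)
   racluster -r /argus/archive/2017.10.20.* -m matrix -w /tmp/daily.matrix.out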

Examples of how to use rasqlinsert.1 are at http://qosient.com/argus/database.shtml. Think of rasqlinsert.1 as equivalent to racluster.1 and ratop.1, except that instead of writing to a file or the screen, rasqlinsert.1 writes its output to mysql database tables. Many sites run dozens of rasqlinsert.1 instances periodically to generate views into their data, which they keep around (say, for a year) and query when needed.
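
A minimal sketch of that pattern (the database name, user, and field list here are assumptions, not the exact commands from the examples page), building one table per day of the IP addresses seen:

   # daily IP address inventory; table name expands per day via -M time 1d
   rasqlinsert -M time 1d -M cache -m saddr \
      -s stime ltime saddr pkts bytes \
      -r /argus/archive/2017.10.20.* \
      -w mysql://user@localhost/flowdb/ipAddrs_%Y_%m_%d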

If you wanted to know the first occurrence of a specific external IP address in your enterprise, and you had a database table of all the IP addresses seen each day, you could query all of your daily tables to find that first occurrence. If you do this every day, you can answer whether you saw a specific remote network today or last week. If your tables are well structured, you can also know who was talking to that address.
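
As a sketch (assuming daily tables named as above), rasql.1 can walk a range of those daily tables, with an ordinary ra filter picking out the address in question:

   # earliest records involving 203.0.113.50 over the last three weeks
   rasql -t 2017/10/01-2017/10/20 -M time 1d \
      -r mysql://user@localhost/flowdb/ipAddrs_%Y_%m_%d \
      -s stime saddr - host 203.0.113.50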

In the examples pages, we give the specific rasqlinsert.1 commands to generate an IP inventory database that can answer at least one of your questions. Our experience is that you can build a handful of these fundamental information systems and answer most of the questions you list in your email. Handling the dynamic addressing is a tough one, although we have commercial systems that do a good job at it.
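
And to keep the "who was talking to that address" part answerable, the same daily-table pattern (again with assumed database and table names) applied to the matrix data:

   # daily who-talked-to-whom tables alongside the address inventory
   rasqlinsert -M time 1d -M cache -m matrix \
      -s stime ltime saddr daddr pkts bytes \
      -r /argus/archive/2017.10.20.* \
      -w mysql://user@localhost/flowdb/ipMatrix_%Y_%m_%d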

This is, of course, a very complex area … everyone has different needs, questions, and concerns. If you like, take a look at the products link on the qosient / argus web site (http://qosient.com/tech.htm); that gives a flavor of what we are doing in the commercial area, and if you have an interest, give us a holler.

Hope all is most excellent,
Carter


> On Oct 19, 2017, at 3:42 PM, mike tancsa <mike at sentex.ca> wrote:
> 
> Hi folks,
> 	I have a lot of network flows that I need to answer questions
> about, questions that are not easily handled with a simple argus
> command, and I am looking for suggestions on how best to organize
> the data.
> 
> 
> I have 2 large classes of endpoints -- servers and support staff.  The
> servers are further broken down by department.
> 
> To complicate matters, the servers are all on dynamic IPs; some come
> and go a few times a day and get different IP addresses each time,
> as do the support staff.
> 
> I get general questions like
> 
> Did anyone access servers x, y, and z in the last 3 days via netops,
> and when?
> 
> or
> 
> Who accessed all of the Accounting Department's servers in Toronto
> yesterday?
> 
> I can easily tell by port (6502) whether there was remote access to
> a server.  But the hard part is matching "server" to sets of IPs by
> time, as well as matching the user's IP address to the user's x509
> token name.
> 
> Here are some of the challenges I have:
> 
> Did anyone access servers x, y, and z in the last 3 days?
> I first need to figure out which IPs the servers were on, and the
> start and stop times for those IPs, so I can see that server 'x' was
> IP address 192.168.77.3 from Sep 1 3pm to 9pm, then IP address
> 10.2.3.4 from Sep 1 10pm to Sep 12 at 1pm, and so on.  I could have
> a few dozen IPs for a given server in the time period I am
> interested in.  This comes from a mysql database of start and stop
> records.
> 
> 
> I can do a lot of manual steps to answer these questions, but manual
> steps take time and are prone to error.  I haven't had to do this
> very often in the past, but I am seeing it more and more, so I am
> looking for strategies to better prep the data ahead of time, to
> make my life easier and answer these questions more quickly.
> 
> Any suggestions / pointers would be most welcome.  I am thinking I
> want to look into using ralabel?
> 
> Perhaps as part of a daily script:
> 
> * Identify the IP(s) of the server for that day and their times.
> * Query out all of their flows, label them with server:department,
> and drop them into new argus files based on server:department.
> 
> 
> Is there a better way to approach this ?
> 
> 	---Mike
> 
