Anonymization of argus flow data

Kaustubh Gadkari kaustubh.gadkari at gmail.com
Tue Sep 3 09:56:19 EDT 2013


On Tue, Sep 3, 2013 at 7:19 AM, Kaustubh Gadkari
<kaustubh.gadkari at gmail.com> wrote:
> On Tue, Sep 3, 2013 at 6:00 AM, Carter Bullard <carter at qosient.com> wrote:
>> Hmmmm,
>> There shouldn't be any performance issues with anonymizing a file, if you're
>> just anonymizing the IP addresses.  How many addresses are in the file?
>> What does your ranonymize.conf file look like?  How much memory is it using?
>>
>
> I am not quite sure how many IP addresses there are in the file. My
> ranonymize.conf looks like this:
>
> RANON_PRESERVE_ETHERNET_VENDOR=yes
> RANON_PRESERVE_BROADCAST_ADDRESS=yes
> RANON_NET_ANONYMIZATION=sequential
> RANON_HOST_ANONYMIZATION=sequential
> RANON_PRESERVE_NET_ADDRESS_HIERARCHY=class
>
> I took a look at how much memory ranonymize is using .. the usage is
> about 42% on a machine with 32GB RAM.
>
>> ranonymize() can be a little complex, O(n log n + C), but it should be
>> in the same time frame as racount().  How long does it take for racount()
>> to read the file?
>>
>
> I am running racount right now .. I will post results once it finishes.

racount takes about 18min to run on the file:

real    17m58.528s
user    17m12.413s
sys     2m0.332s

Kaustubh

>> Just a rule of thumb: if a ra* program doesn't complete in a few minutes,
>> you should stop it and try to figure out if there is a memory problem or not.
>>
>
> Thanks, I'll keep this in mind :)
>
> Thanks,
> Kaustubh
>
>> Carter
>>
>> On Sep 2, 2013, at 2:20 PM, Kaustubh Gadkari <kaustubh.gadkari at gmail.com>
>> wrote:
>>
>> Hi,
>>
>> I have a set of argus flow data captured at our data capture vantage point,
>> and I want to fully anonymize the IP addresses (both source and destination),
>> i.e., I want to replace both addresses using a prefix-preserving technique.
>> I have tried using ranonymize, but it is taking an extremely long time to
>> anonymize the file (I started the process a couple of months ago, on a
>> ~125GB file, and the output file size today is only ~30GB).
>>
>> Can anyone suggest the right way to go about anonymizing the data set I
>> have? Is ranonymize the right tool for the job?
>>
>> Thanks,
>> Kaustubh
>>
>> --
>> Kaustubh Gadkari
>>
>>
>
>
>
> --
> Kaustubh Gadkari



-- 
Kaustubh Gadkari
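
For anyone reading the thread later, here is a minimal sketch of what
"prefix preserving" means in the original question: two addresses that
share their first k bits before anonymization still share exactly their
first k bits afterwards. This is NOT how ranonymize works internally, and
it is not Crypto-PAn itself; it uses the same bit-by-bit idea with
HMAC-SHA256 from the Python standard library standing in for the block
cipher. The names anonymize_ipv4, common_prefix_len and the example key
are made up for illustration.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Prefix-preserving mapping of one IPv4 address (illustrative only).

    Output bit i is input bit i XOR a keyed pseudo-random bit that depends
    only on the i bits before it, so equal prefixes map to equal prefixes.
    """
    value = int(ipaddress.IPv4Address(addr))
    result = 0
    for i in range(32):
        # The i most significant bits of the original address.
        prefix = value >> (32 - i)
        # Include the prefix length so "0" and "00" are distinct inputs.
        msg = bytes([i]) + prefix.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        bit = (value >> (31 - i)) & 1
        result = (result << 1) | (bit ^ flip)
    return str(ipaddress.IPv4Address(result))


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


if __name__ == "__main__":
    key = b"example-key"  # hypothetical secret; keep the real one private
    for a in ("192.168.1.10", "192.168.1.77", "10.0.0.1"):
        print(a, "->", anonymize_ipv4(a, key))
```

The mapping is deterministic for a given key, so the same address always
gets the same replacement across files, and it keeps no per-address table
in memory. Each address costs 32 HMAC calls, so a real tool processing a
file of this size would want to cache results or use a faster primitive.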


