Ranonymize a subset of IPs

Carter Bullard carter at qosient.com
Fri Jul 15 15:15:48 EDT 2011


Hey Huy,
Need to keep this thread on the mailing list.

If you could take a look at the current configuration of ranonymize, it lists all the features that we
currently support.   The issue is that we don't currently have the concept that half of a flow record's
information has been modified and the other half has not been changed.  This type of incomplete
anonymization makes the classic case of reverse engineering the anonymization maps pretty trivial.

So you anonymize your addresses, keeping the external addresses intact, for example, and
release the data.  Flow monitors in the external addresses' network will have their own collected flow
information, which can be easily correlated with your data (byte count, packet count, their IP address),
and now they know the translation of at least one of your addresses.
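
As a rough sketch of that correlation (the flow dictionaries, field names, and the
recover_mappings function are illustrative assumptions, not Argus record structures
or anything ranonymize does), an outside observer could join the released records
against their own on the untouched fields:

```python
from collections import defaultdict

def recover_mappings(released, observed):
    """Guess anonymized -> real source addresses by joining on untouched fields.

    Hypothetical flow format: dicts with keys src, dst, bytes, pkts.
    In `released`, src is anonymized and dst is left intact.
    In `observed` (the external monitor's own data), both are real.
    """
    # Index the observer's flows by the fields that survive anonymization.
    index = defaultdict(list)
    for flow in observed:
        index[(flow["dst"], flow["bytes"], flow["pkts"])].append(flow["src"])

    mapping = {}
    for flow in released:
        key = (flow["dst"], flow["bytes"], flow["pkts"])
        candidates = set(index.get(key, []))
        # Only an unambiguous match reveals a translation.
        if len(candidates) == 1:
            mapping[flow["src"]] = candidates.pop()
    return mapping
```

A single unambiguous match on (destination, byte count, packet count) is enough to
expose one entry of the anonymization map.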

But I'm not as worried about this as I am interested in the logic and how to approach it.
It is possible that we can translate a subset of the IP addresses, even providing hierarchy
preservation, etc., without any coupling to Layer 2 anonymization, port anonymization, etc.
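
To make the idea concrete, here is a minimal sketch of translating only the addresses
inside one network while preserving their hierarchy.  This is not how ranonymize works;
it is a stand-in that uses a keyed HMAC bit-flip in the style of prefix-preserving
schemes such as Crypto-PAn, and the function name, the key, and the example prefixes
are all assumptions for illustration:

```python
import hashlib
import hmac
import ipaddress

def anonymize_subset(addr, network, key):
    """Anonymize addr only if it falls inside network; leave others intact.

    The network bits are kept, and each host bit is flipped (or not) based on
    a keyed hash of the original bits that precede it.  Two addresses sharing
    a k-bit prefix therefore still share a k-bit prefix after translation.
    """
    ip = ipaddress.IPv4Address(addr)
    net = ipaddress.IPv4Network(network)
    if ip not in net:
        return str(ip)  # outside the target range: untouched

    value = int(ip)
    result = value
    for bit in range(net.prefixlen, 32):      # bit 0 is the most significant
        preceding = value >> (32 - bit)        # original bits before this one
        msg = bit.to_bytes(1, "big") + preceding.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        result ^= flip << (31 - bit)
    return str(ipaddress.IPv4Address(result))
```

For example, anonymize_subset("192.0.2.10", "192.0.2.0/24", b"secret") stays inside
192.0.2.0/24, while an address such as 198.51.100.7 is returned unchanged.  Keeping
the translated addresses inside the original network means they cannot collide with
the untouched ones, at the cost of revealing which network was anonymized.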

Carter


On Jul 15, 2011, at 2:32 PM, Huy N. Hang wrote:

> Hi Carter!
> 
> When I made the request, our group at the University of California, Riverside was working on a traffic collection project where we do just that. We, however, would like to release the traffic collection to the public for research purposes. We don't like the idea of anonymizing everything, so we would only like to do so for the information of the hosts we are collecting from, to protect their identities. That is why we only wish to anonymize a range of IPs (one that encompasses our school's prefix) and leave everything else intact. I want to do this because I'd like to see where a person's traffic is going, but I don't want to know who that person is (even the payload has been removed).
> 
> To answer your question, then: we'd be very happy if the new feature could let us explicitly choose which attributes of the hosts we would like to anonymize (taking in a configuration file to do this would be awesome) and leave the other ones untouched. This would give us enough freedom to release the most information without compromising our monitored hosts' privacy.
> 
> And yes, if we preserve a range of IPs, we would like to preserve everything else about those hosts as well.
> 
> Have I answered your questions?
> 
> Please tell me if you need me to clarify :)
> 
> Thanks!
> 
> 
> On 07/15/2011 10:57 AM, Carter Bullard wrote:
>> Hey Huy,
>> I'm starting to consider the implementation of this feature, and it's a little complicated, so I
>> need to talk about it a bit.
>> 
>> You have asked that we consider partial stream anonymization, where we would anonymize
>> some IP addresses and not others.  There are a few "gotchas" to be considered here, the
>> worst being where a flow needs one of its IP addresses to be anonymized, but the other
>> IP is not going to be anonymized.  This is a bit of a brain teaser, but I do like the idea.
>> 
>> There are a lot of fields to consider when you anonymize a record:  time, all network identifiers,
>> such as ethernet addresses, fragmentation identifiers, TTLs, DSByte encodings, and transport
>> identifiers, like port numbers, sequence numbers, etc.  Many of these attributes are attributes
>> of the host.  So if I preserve a particular IP address, should I preserve all the other host attributes
>> that apply to that IP address?  When you think about this, it gets interesting, so what were you
>> really wanting in your request?
>> 
>> Should I assume that the decision to anonymize anything in a flow record is based on
>> whether I anonymize one of the IP addresses in the flow?
>> 
>> Should I anonymize time regardless of whether the IP address is going to be anonymized or not?
>> 
>> What do you think?
>> 
>> Carter
>> 
>> On Jul 3, 2011, at 12:33 PM, Huy N. Hang wrote:
>> 
>>> Hey Carter,
>>> 
>>> That would be awesome! :D
>>> 
>>>> Hey Huy,
>>>> I'll have to add two directives, I think, to make this convenient:  1) a
>>>> RANON_PRESERVE_ADDRESS_RANGE directive and 2) a
>>>> RANON_SPECIFY_ADDRESS_RANGE directive to override that address range.  This would
>>>> allow you to anonymize select IP address
>>>> ranges. That may take a bit of time, but I'll check it out this week.
>>>> 
>>>> Carter
>>>> 
>>>> On Jul 2, 2011, at 8:40 PM, Huy N. Hang wrote:
>>>> 
>>>>> Hi Carter and other gentlefolks,
>>>>> 
>>>>> I've been tinkering with Ranonymize to explore its options. I've been
>>>>> getting it to work on most of what I want, so I'm glad, but I have a
>>>>> quick question:
>>>>> 
>>>>> Can I force ranonymize to anonymize only a subset of IPs? Namely, can I
>>>>> provide a list of IPs that I wish to anonymize and leave all other IPs
>>>>> intact?
>>>>> 
>>>>> Thanks!
>>>>> 
>>>>> 
>>>> 
>>> 
>>> ==================================================
>>> I swear to all that is holy that one day,
>>> I shall use Elvish and/or Klingon alphabets
>>> to name the variables in my research papers!
>>> Revenge can never be more elegant or sweet!
>>> ==================================================
>>> Huy N. Hang, Ph.D. student,
>>> Department of Computer Science and Engineering.
>>> U.C. Riverside
>>> ==================================================
>>> 
>>> 
> 
> 
