Re: Review of draft-hardie-privsec-metadata-insertion-05

2017-02-08 12:45:43

On 7 Feb 2017, at 23:24, Ted Hardie <ted(_dot_)ietf(_at_)gmail(_dot_)com> wrote:

On Tue, Feb 7, 2017 at 12:27 PM, Yoav Nir <ynir(_dot_)ietf(_at_)gmail(_dot_)com> wrote:
Reviewer: Yoav Nir
Review result: Has Nits

Hi

The document is well-written and understandable, but a few things
about it seem wrong:

Section 3 describes data minimization as "one of the core mitigations
for the loss of confidentiality". However, the only example given
where data minimization is used to mitigate confidentiality loss is
when browsers suppress cookies in private mode. The rest of the
examples given (HTTP proxies, recursive DNS, VPN) are cases where the
data minimization is incidental to some other function. Nobody
deployed the HTTP proxy or the DNS server in order to enhance
privacy.

So, I would challenge "nobody" on the HTTP proxy front, but that's not the 
important piece here.  The important piece is they generally do have a net 
privacy positive effect.  That privacy positive effect is eliminated when 
metadata that would be concealed by normal operation is added back in.  
That's the converse of your statement:  no one built an HTTP proxy or 
recursive DNS server in order to supply metadata.

Is there a specific place in the doc where more text would make this point 
clearer?

Perhaps an explanation of the phrase “other actors within a protocol” in 
paragraph 2 of section 3. Is it just intermediaries (which according to the 
definition in RFC 6973 are necessary for the protocol) or are there any other 
entities? Is it existing middleboxes that new protocols seek to modify or 
entirely new middleboxes?


The HTTP proxy example in particular is not convincing. HTTP is
designed to work without proxies. Any data minimization provided
incidentally by a proxy is nothing that can be counted on,

I'm not sure what you mean by "counted on" here.  Do you mean that you can't 
know the pool of other users and thus the number of times you'll be served 
from cache rather than getting new data?

Just that I would never write a privacy considerations section that says “The 
source address of the client is obfuscated because NATs and proxies are 
everywhere”. I can’t count on a proxy (or NAT) always being present, so I can’t 
count on them as the solution to the data minimization need. I *can* generally 
assume that clients will use a recursive DNS resolver.

so a
prohibition on restoring said data (especially in the case of a
server-side load balancer) is just not convincing. OTOH, in DNS,
recursive resolvers that hide the origin IP of the client are the norm:
authoritative servers hardly ever get to see the real addresses of
clients. In that case exposing the real IP address of the client shows
data that was not there before.

I believe the text should differentiate between cases where a network
element is not part of the normal function of the protocol and works
to undo the accidental data minimization that it causes,

So, I think "accidental" is a tricky word here. The use of a proxy or 
recursive resolver is generally speaking not "accidental".  Data minimization 
is a consequence of normal operation, and what we're talking about is 
changing operations in order to change that data minimization.  The core of 
the advice is that instead of changing the operation of the middlebox, change 
the operation of the device about which you desire data.

What would make that clearer?

Yes. Talking about changing the operation of the middlebox assumes that the 
middlebox already existed. So we already had a proxy, and now it’s adding new 
fields. Is the advice just about extending the functionality of existing 
intermediaries?

and cases
where the network element is expected in the protocol and thus the
minimization is expected as well. I think the prescription in the text
applies to the latter. I am not convinced about the former.

The VPN example is a strange one. If the subject is a corporate VPN,
then restoring the original IP addresses is the function of the VPN.
If, OTOH, VPN is that service that allows people to watch Hulu outside
of the US, then restoring the IP address would be counter-productive.
It is also strange to see VPN used as an example of "systems whose
primary function is not to provide confidentiality".

Well, it does two things:  shifts the network locality of the endpoint using 
the VPN and masks who is shifting.  I believe the first is the primary 
function and the second a common secondary property.  If you would like 
different language on this, though, please let me know.

To me VPN is mostly about providing confidentiality and data integrity while 
traversing the Internet.  How about

OLD
                            similarly a VPN system used to provide
   channel security may believe that origin IP should be restored.

NEW
                            similarly a VPN system restores all of
   the metadata associated with the IP packet at the tunnel egress.


Yoav
