RE: Last Call: <draft-hardie-privsec-metadata-insertion-05.txt> (Design

Hi Ted,

Please see inline.

Cheers,
Med

From: Ted Hardie [mailto:ted(_dot_)ietf(_at_)gmail(_dot_)com]
Sent: Saturday, 25 February 2017 01:26
To: BOUCADAIR Mohamed IMT/OLN
Cc: ietf(_at_)ietf(_dot_)org; 
draft-hardie-privsec-metadata-insertion(_at_)ietf(_dot_)org
Subject: Re: Last Call: <draft-hardie-privsec-metadata-insertion-05.txt> (Design 
considerations for Metadata Insertion) to Informational RFC

Hi Mohamed,
Some replies in-line.

On Fri, Feb 24, 2017 at 12:55 AM, 
<mohamed(_dot_)boucadair(_at_)orange(_dot_)com> wrote:
Hi Ted,

Please see inline.

Cheers,
Med

From: Ted Hardie [mailto:ted(_dot_)ietf(_at_)gmail(_dot_)com]
Sent: Friday, 24 February 2017 00:40
To: BOUCADAIR Mohamed IMT/OLN
Cc: ietf(_at_)ietf(_dot_)org; 
draft-hardie-privsec-metadata-insertion(_at_)ietf(_dot_)org
Subject: Re: Last Call: <draft-hardie-privsec-metadata-insertion-05.txt> (Design 
considerations for Metadata Insertion) to Informational RFC

Hi Mohamed,

Thanks for rechecking; some further comments in-line.

On Wed, Feb 22, 2017 at 11:21 PM, 
<mohamed(_dot_)boucadair(_at_)orange(_dot_)com> wrote:
Hi Ted,

Thank you for the reply and for implementing these changes.

I checked the diff, but I’m afraid the -06 version has the same issues as the 
ones I reported on January 31.


I did respond to the particular comments and text proposals, so I assume this 
is the more general issue.  If I understand correctly, you would prefer this 
document to be structured as a revision to the threat model document or 
connected to a larger consideration of the issues.  I understand that, and it 
was considered, but I believe that this format is still the most effective for 
the narrow issue it addresses.
[Med] That was one of my concerns, but not the only one. In particular, I’m 
trying to understand how this document will be used in the future to strengthen 
forthcoming specifications. Further, my experience in advancing some RFCs is 
that the RFC7258 wording can be enhanced to provide clearer guidance. 
Also, considerations such as the following are missing from the document:
It's difficult to say how something will be used in the future.
[Med] Advice that is not implementable creates more trouble, IMHO.
My intent (and the understanding of other reviewers) is to highlight that these 
mechanisms have a privacy-damaging result and that this should be considered.
[Med] I do think existing documents already do that job. I do think we need 
more.
In particular, I'm concerned that some application functions in the network 
(e.g. recursive resolvers or proxies) do not consider the positive privacy 
implications of their aggregation and so do not consider adding this data back 
as problematic.
[Med] I’m concerned with that, too (see e.g., 
http://www1.icsi.berkeley.edu/~narseo/papers/hotm42-vallinarodriguez.pdf). In 
the meantime, I’m also concerned with (1) some applications that leak privacy 
information without the consent of the user and (2) some application servers 
that may correlate various information shared by an application client to track 
users (e.g., https://panopticlick.eff.org/). BTW, I see that you are using 
“application function” which may not have the same meaning as the general 
“protocol” wording used in draft-hardie-*. Do you consider a DHCP relay as an 
“application function”?
   Highlighting this enables them to see this traffic in a different context.
[Med] Isn’t this already assumed by some protocol designers (e.g., RFC6973, 
SIP)? BTW, there are subtleties when proxies are in the same trust domain as 
the client or server.

* that data may not be always available to the endhost
Understood, but even in this case, it is better to make the permission to add 
the data explicit.
[Med] This may be easy to implement for some applications, but it cannot be 
generalized to ** all ** protocols. Putting aside the interaction with a user 
to get consent, and how that consent will need to be changed when another user 
uses the same device to connect to the Internet, consider a user who does not 
want an upstream DHCP relay to insert the line-id 
(https://tools.ietf.org/html/rfc6788) toward the server, and let’s suppose the 
relay received a signal (by some means, yet to be specified) that for this 
particular DHCP client, the line-id must not be inserted. In this case, 
connectivity won’t be provided to that user. This would mean extra calls to the 
hotline of that network provider. This is desirable for neither customers nor 
network providers.

Another example is address sharing 
(https://tools.ietf.org/html/rfc6269#section-13.1): a misbehaving user may 
impact all the users behind the same address. I bet such a user won’t give 
his/her permission to insert the information needed to demux his/her packets.

It is also often possible for the end system to gain this data at the cost of 
other traffic (e.g. the STUN server requests noted in the document).
[Med] I’m not sure whether this is ‘often’, but for sure this is possible in 
some cases. Extra requests may be needed, indeed. Better, the information may 
even be available locally; for example, there is no need to run 
STUN/PCP to provide an XFF or Forward-For header.

If this can be done in parallel with other actions, then the latency impact can 
be minimized.
[Med] These are assumptions and implications that are worth adding to the 
draft. BTW, this falls into this general discussion in 
https://tools.ietf.org/html/rfc6973:

   a.  Trade-offs.  Does the protocol make trade-offs between privacy
       and usability, privacy and efficiency, privacy and
       implementability, or privacy and other design goals?  Describe
       the trade-offs and the rationale for the design chosen.
* a misbehaving node may be tempted to spoof the data to be injected. A remote 
device that will use that data to enforce policies will be broken.
This point was discussed extensively in the GEOPRIV work and essentially a 
single carve-out was made:  for emergency services, where falsely asserted 
location data could be used to SWAT individuals or consume safety resources.    
I don't think that falls into this narrow advice, but I would be willing to add 
something like this to the security considerations:
"Note that some emergency service recipients, notably PSAPs (Public Safety 
Answering Points) may prefer data provided by a network to data provided by end 
system, because an end system could use false data to attack others or consume 
resources.   While this has the consequence that the data available to the PSAP 
is often more coarse than that available to the end system, the risk of false 
data being provided involves a risk to the lives of those targeted."
[Med] Thank you. Providing PSAP as an example is OK, but I’d like the issue to 
be called out as a generic one while PSAP is provided as an example. What about 
the following:

"Note that some servers (e.g., emergency service recipients, notably PSAPs 
(Public Safety Answering Points) [RFC6443]) may prefer data provided by a 
network to data provided by the end system, because an end system could use 
false data to attack others or consume resources.  While this has the 
consequence that the data available to the server is often more coarse than 
that available to the end system, the risk of false data being provided 
involves a risk to the lives of those targeted."

* it was reported in the past that some browsers leak the MSISDN and other 
sensitive data.
This is true, but it seems to me unrelated to the point of the document.
[Med] It is related because blindly trusting an application client (and server) 
has its own privacy risks. This is exacerbated by the rich data that is 
available to an application client and by the visibility into various layers 
that is available to an application server.
From that flow some of your other concerns about audience, at least as I 
understand.  As written, this is narrow advice for a broad audience: basically, 
anyone who would consider the form of metadata insertion it describes.  You 
would, if I understand you, prefer a narrower description of the audience in a 
larger context.

[Med] The key point here is the practicality of implementing the advice, 
not changing the scope. For example, the document says that it is better for a 
host to inject the data, but the document does not question whether that 
supplied data can be trusted, or how consent will be obtained from a 
user.

In general, the point of the document is that the host should be able to omit 
the data without mid-network devices adding it back.  That's the point of 
protecting the traffic in the first place, after all.  I am saying that if the 
protocols require the data, then getting it from the end host has better 
privacy properties than getting it from mid-network entities.
[Med] I’m not sure we can make such a general statement because (1) the data 
may not be available to clients (e.g., DHCP), (2) the data supplied by clients 
(when possible) may not be reliable, and (3) enforcing policies based on 
client-supplied data may have implications for other users (e.g., spoofing 
XFF). Obviously, getting some of the information from a client may have 
implications on QoE; the user needs to understand the root causes of a 
degradation of QoE. Of course, these implications may not be new for users who 
are familiar with disabling JavaScript and the like.

For example, the document states that the information in a Forward-For header 
can be supplied by the host itself and then communicated to a remote consumer. 
This is indeed possible, but because of abusive hosts, some servers implement 
whitelists of trusted proxies; see 
https://meta.wikimedia.org/w/extensions/TrustedXFF/trusted-hosts.txt.
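
To illustrate the whitelist approach, here is a minimal sketch in Python. It 
is not Wikimedia's actual implementation: the names TRUSTED_PROXIES, 
is_trusted and effective_client, and the 192.0.2.0/24 proxy network, are 
hypothetical and chosen only for illustration. The server accepts XFF entries 
only when they were appended by a proxy it trusts, and ignores anything a 
host may have written into the header itself.

```python
import ipaddress
from typing import Optional

# Hypothetical allow-list of proxy networks whose XFF entries are trusted.
TRUSTED_PROXIES = [ipaddress.ip_network("192.0.2.0/24")]


def is_trusted(addr: str) -> bool:
    """Return True if addr parses as an IP address inside a trusted proxy network."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False
    return any(ip in net for net in TRUSTED_PROXIES)


def effective_client(peer_addr: str, xff_header: Optional[str]) -> str:
    """Pick the address to apply policy to.

    The header is only considered when the directly connected peer is a
    trusted proxy. Entries are then read right to left (most recent hop
    first), stopping at the first address that is not a trusted proxy;
    anything further left could have been forged by that host.
    """
    if not xff_header or not is_trusted(peer_addr):
        return peer_addr
    hops = [h.strip() for h in xff_header.split(",") if h.strip()]
    client = peer_addr
    for hop in reversed(hops):
        client = hop
        if not is_trusted(hop):
            break
    return client
```

With this sketch, a header supplied by an untrusted host is ignored and the 
connecting address is used instead, which is exactly why host-supplied XFF 
data does not help such servers.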


The Wikimedia case is a very interesting one to raise, because it derives from 
a set of assumptions about the network that are somewhat flawed and then 
attempts to patch those flaws in ways that actually damage the mechanisms of 
the system they originally built.
Wikimedia wants to allow folks to edit without login credentials.  This allows 
for anonymous users to make corrections or additions; this is a goal.  The 
consequence of that goal being achieved is that trolls or malicious editors can 
have at anything they want.
Rather than institute credentials and ACLs, Wikimedia attempts to substitute 
blocking by IP for blocking by credential.  The property they are looking for 
in IPs is not really there, though:  they are not unique to individuals, 
especially over time.

This damages those who share IP addresses (due to NATs or proxies).  As far as 
I can tell, the NAT problem is simply treated as collateral damage.  For the 
proxies, they attempt to work around the damage using XFF.  That's spoofable, 
though, so they attempt to limit it to specific proxies whose XFF they 
trust--many of which require logins.  That shifts the information about who is 
editing Wikipedia out of their hands, but leaves it in the network and thus not 
truly anonymous.  I understand the engineering balance they are trying to 
strike, but I'm not sure I can recommend their solution.

[Med] I’m not recommending their solution either, but I’m trying to raise the 
point that an engineering balance is out there. ACKing that deployment reality 
is better than ignoring it.

I’m reiterating that most of my comments are still unaddressed in -06.


I realize that the document did not change to address the audience or document 
integration you preferred; I think there we simply disagree on how to make this 
advice effective.  I'm sorry that first message apparently did not describe the 
disagreement effectively.
If I have misunderstood your comments, please accept my apologies. I would be 
happy to receive further clarification, and suggested text to illustrate your 
preferences would be especially welcome.
[Med] Sure. Before that, can you please consider rechecking the detailed list 
of comments available at: 
https://www.ietf.org/mail-archive/web/ietf/current/msg101629.html. Thank you.

Thanks again for your engagement on this,
regards,
Ted


thanks,
Ted Hardie


Cheers,
Med

From: Ted Hardie [mailto:ted(_dot_)ietf(_at_)gmail(_dot_)com]
Sent: Wednesday, 22 February 2017 23:09
To: BOUCADAIR Mohamed IMT/OLN
Cc: ietf(_at_)ietf(_dot_)org; 
draft-hardie-privsec-metadata-insertion(_at_)ietf(_dot_)org
Subject: Re: Last Call: <draft-hardie-privsec-metadata-insertion-05.txt> (Design 
considerations for Metadata Insertion) to Informational RFC

Hi Mohamed,
Thanks for your review.  I've uploaded a draft -06 with updates from your and 
other reviews.  Some notes in-line.

On Tue, Jan 31, 2017 at 1:49 AM, 
<mohamed(_dot_)boucadair(_at_)orange(_dot_)com> wrote:
Dear Ted,

Please find below my general review of the document and also my detailed 
comments.

* Overall:
- I don't think the document is ready to be published as it is. It does not 
discuss the usability and implications of the advice. Further, it may be 
interpreted as saying that a client/end system/user can always populate by 
itself the data that is supplied by on-path nodes (in current deployments). 
That assumption is not true for some protocols.
- The purpose of publishing this advice is not clear. For example, how will 
this advice be implemented in practice? What is its scope?
- I would personally prefer an updated version of RFC7258 with stricter 
language on the privacy-related considerations. This is more actionable, with 
concrete effects on documents that will be required to include a discussion of 
privacy-related matters.

Detailed comments are provided below:

* The abstract says the following:

   The IAB has published [RFC7624] in response to several revelations of
   pervasive attack on Internet communications.  This document considers
   the implications of protocol designs which associate metadata with
   encrypted flows.  In particular, it asserts that designs which do so
   by explicit actions of the end system are preferable to designs in
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   which middleboxes insert them.

I suggest you make explicit what is meant by "the end system".

I have updated this to clarify that this is the host/end system not the user.

If you mean the owner/user, then the text should say so. If you mean a client 
software instance, then bugs/inappropriate default values may lead to (privacy 
leak) surprises too. It was reported in the past that some browsers inject the 
MSISDN too.

* Introduction: "To ensure that the Internet can be trusted by users"

Rather « To minimize the risk of Internet-originated attacks targeted at users 
».

I've adopted this language.

It's not reasonable to claim the Internet can be trusted by users; see how the 
usage of social networks has become severely twisted, for example.

I've also considered your point that an updated version of RFC7258 might be a 
better outlet for advice like this. We did consider several approaches, 
including incorporating the text in an update to  RFC 3552 or as part of a 
document describing the full set of companion mitigations to the threats in RFC 
7624 (draft-iab-privsec-confidentiality-mitigations would be one approach).  
Those are all valid approaches, but it seemed that short, easily read documents 
tackling a single point might be easier to produce and consume.
Thanks again for your review,
Ted Hardie