RE: Last Call: <draft-hardie-privsec-metadata-insertion-05.txt> (Design considerations for Metadata Insertion) to Informational RFC

Hi Ted,

Thank you for the answers.

Please see inline.

Cheers,
Med

From: Ted Hardie [mailto:ted(_dot_)ietf(_at_)gmail(_dot_)com]
Sent: Monday, March 6, 2017 18:14
To: BOUCADAIR Mohamed IMT/OLN
Cc: ietf(_at_)ietf(_dot_)org;
draft-hardie-privsec-metadata-insertion(_at_)ietf(_dot_)org
Subject: Re: Last Call: <draft-hardie-privsec-metadata-insertion-05.txt> (Design
considerations for Metadata Insertion) to Informational RFC

Hi Mohamed,
Replies in-line.

On Mon, Mar 6, 2017 at 1:48 AM,
<mohamed(_dot_)boucadair(_at_)orange(_dot_)com> wrote:

• A Forwarded-For header inserted by a proxy does not restore any data;
it only reveals data that is already present in the packet issued by the
client itself.
That's what restore means here.
[Med] Then this needs to be defined in the document. I naively assumed that
“restored” referred to any piece of information that the client does not want
to insert in a packet, but that an on-path device decides to inject anyway,
without the client’s consent.
What you are describing is more about “maintaining” or “preserving”
information, not restoring it.

The common uses of restore in English all focus on putting something back that
has been lost,
[Med] But that information is not lost for an on-path device that encapsulates
a packet in another one (so the inner header still carries the source IP
address), or for one that supplies the original source IP address as metadata
when source IP address/port rewriting is required. The notion of “putting back”
does not make sense to me because we are not dealing with the internal
processing of a packet within an on-path device; we focus only on the
external behavior. This is exactly the role of “Via” headers for SIP proxies:
when there is a mismatch, the “received” parameter is filled in with the
visible source address.

so I believe restore is better than "maintain" or "preserve", which imply
something is being carried forward as-is, rather than being put back after loss.
[Med] Please see above. Because we don’t have a standard behavior for an
on-path device (proxy, tunnel endpoint, etc.), it seems weird to me to say that
a proxy that preserves the source IP address is “putting back information that
is lost”.

If the information is present as metadata in the packet sent to the proxy but
would be absent as metadata under normal operation of the proxy, adding it back
in somewhere else restores the metadata.
[Med] The “normal operation of a proxy” is not a standard. A “normal operation
of a proxy” would be to maintain the information sent by the client when
relaying it to the server. I’m sure you know, for instance, that SIP B2BUAs can
do whatever they want!

You're right that the normal operation of a proxy is not a standard, and I
should have said "the normal operation of the protocols used by a proxy".
[Med] This is much better, but still not sufficient. On-path devices that
manipulate packets are not necessarily protocol-specific proxies: they may be
tunnel endpoints (e.g., LISP), CGNs (NAT64, NAT44, DS-Lite), MAP-E BRs, etc.

If the action of the proxy is to start a new TCP connection to an origin
server, for example, the normal operation of TCP is to use the initiator's IP
address.
[Med] This is protocol-specific. I can provide an example of a proxy behavior
that relays the source IP address/port as part of its normal operation:
http://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
- TCP/IPv4 :
"PROXY TCP4 255.255.255.255 255.255.255.255 65535 65535\r\n"
=> 5 + 1 + 4 + 1 + 15 + 1 + 15 + 1 + 5 + 1 + 5 + 2 = 56 chars

- TCP/IPv6 :
"PROXY TCP6 ffff:f...f:ffff ffff:f...f:ffff 65535 65535\r\n"
=> 5 + 1 + 4 + 1 + 39 + 1 + 39 + 1 + 5 + 1 + 5 + 2 = 104 chars
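
As a rough illustration of the length arithmetic quoted above, the following
Python sketch builds a version 1 PROXY protocol line and measures the two
worst cases. The function name proxy_v1_header is hypothetical and is not
taken from HAProxy; only the line layout and the 56/104 figures come from the
quoted text.

```python
import ipaddress


def proxy_v1_header(src: str, dst: str, src_port: int, dst_port: int) -> bytes:
    """Build a human-readable (version 1) PROXY protocol line for TCP.

    Hypothetical helper for illustration only; it follows the
    "PROXY TCPx <src> <dst> <sport> <dport>\\r\\n" layout quoted above.
    """
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if src_ip.version != dst_ip.version:
        raise ValueError("source and destination must share an address family")
    for port in (src_port, dst_port):
        if not 0 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")

    family = "TCP4" if src_ip.version == 4 else "TCP6"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


# Worst-case lengths: longest textual addresses and five-digit ports.
worst_v4 = proxy_v1_header("255.255.255.255", "255.255.255.255", 65535, 65535)
worst_v6 = proxy_v1_header(
    "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff",
    "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff",
    65535,
    65535,
)
print(len(worst_v4))  # 56
print(len(worst_v6))  # 104
```

The point relevant to this thread is that the client's source address and
port travel in the relayed byte stream itself, ahead of the application data,
so the receiving server sees them even though the TCP connection originates
from the proxy.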

That this loses the IP address of the querying host is implied by that normal
operation (in other words, it elides metadata about any client that caused this
new TCP connection to be created).
[Med] This makes sense if losing the original IP address is an intended
property of the proxy. But this cannot be a generalized proxy behavior (see
the example above).

So the origin IP address starts out in the IP header of the original packet but
gets pushed from that slot when the proxy constructs the onward IP packet to
the server. For it to reach the server, it has to be placed somewhere else in
the onward packet, restoring the lost metadata.
[Med] The client agreed to send packets with its source IP address (which means
consent). Why would the proxy need an extra channel to get consent for
relaying the source IP address to a server?
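
To make the mechanics under discussion concrete, here is a minimal Python
sketch of a proxy opening its own onward connection and optionally copying the
client's address into a header. It takes no side on whether that copy is
"restoring" or "preserving". The Packet class, the relay function, and the
example addresses are all hypothetical; the "Forwarded: for=" form follows
RFC 7239 and is shown for IPv4 only.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    """Toy model: IP-layer addresses plus application-layer headers."""
    src_ip: str
    dst_ip: str
    headers: dict = field(default_factory=dict)


def relay(inbound: Packet, proxy_ip: str, server_ip: str,
          insert_forwarded: bool) -> Packet:
    """Build the onward packet a connection-terminating proxy would send.

    The onward source address is always the proxy's own, because the proxy
    initiates a new connection. The client's address reaches the server only
    if the proxy copies it into a header.
    """
    headers = dict(inbound.headers)  # client-supplied headers pass through
    if insert_forwarded:
        headers["Forwarded"] = f"for={inbound.src_ip}"
    return Packet(src_ip=proxy_ip, dst_ip=server_ip, headers=headers)


client_packet = Packet(src_ip="192.0.2.43", dst_ip="203.0.113.10",
                       headers={"Host": "www.example.com"})

without_header = relay(client_packet, "203.0.113.10", "198.51.100.7", False)
with_header = relay(client_packet, "203.0.113.10", "198.51.100.7", True)

print(without_header.src_ip, without_header.headers)
print(with_header.src_ip, with_header.headers)
```

In both cases the server sees the proxy's address at the IP layer; the two
outputs differ only in whether the client's address also appears in the
application-layer headers.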

Because the client agreed to send packets to the proxy by putting it in the
destination
[Med] The client is not even aware that a proxy exists on the path! Packets are
sent to the ultimate server’s address, not to that of the proxy. Even in SOME
cases where packets are sent explicitly to the proxy (e.g., a SOCKS proxy),
state is already in place to graft the outgoing packets onto a binding context
involving the destination server.

, and did not agree to general disclosure; you can't infer onward consent.
[Med] Hmm… I’m afraid this conclusion is not technically supported, e.g.,
* A client that sends packets to a server located on the Internet is NOT
necessarily aware that a proxy is involved in the forwarding path. Packets are
sent using the server’s IP address.
* The client and proxy may be owned by the same administrative entity (the case
of enterprise networks). That entity is responsible for deciding which
information the proxy needs to leak.
* The proxy and the server may be owned by the same administrative entity
(a content provider). Supplying data by a proxy to the server, based on the
content of a packet received from a host, does not raise a privacy concern
here because the proxy and the server are owned by the same entity.

Had it been present in the packet as a header value in the HTTP exchange, it
would not have been stripped by normal operation. There, the proxy forwarding
it on would simply be preserving it.
[Med] This is another question: whether the same or distinct channel can be
used to communicate the SAME data that was present in the initial packet issued
by a host.

That depends on the nature of the channel. Obviously, if you set the origin
client's IP address as the source address, you're going to get a different
result from that spoofing than from putting it in a client subnet EDNS option
or forwarded-for header.
[Med] Agree.

• An address sharing device, under for example DS-Lite (RFC6333), that
inserts the source IPv6 prefix in the TCP HOST_ID option (RFC7974) is not
RESTORING any data. The content of that TCP option is already visible in the
packet sent by the host.
I agree with the IESG analysis of RFC7974. It does restore information: it
takes information which normal operation would have elided and puts it back.
[Med] The implication of what you are saying here is that proxies are good
because they hide the source IP addresses of hosts!

Aggregating proxies can have a positive privacy impact, yes. An observer
seeing traffic from an aggregating proxy to sensitive-topic.example.com knows
only that some user behind that proxy is looking for information on
sensitive-topic. To know which user, the observer must have either suborned
the proxy or have a way of observing traffic between hosts and the proxy. Both
are more expensive and at higher risk of discovery than a simple tap near
sensitive-topic.example.com.

[Med] The main point here is that, even in the presence of an aggregating
proxy, a server can demux users by correlating various information leaked at
the application layer (e.g., https://panopticlick.eff.org/). Tracking those
users when they change their source IP address is possible in this case, too.

If the data is taken from a portion of the packet that would not normally be
forwarded to an upstream host and added to a portion that is forwarded to an
upstream host, then the device adding the data back in should know it is a
restoration.
[Med] That definition is not trivial as mentioned above. I would use “preserve”
or “maintain” rather than “restore”.
Please see above. "Restore" is closer, in my opinion, than either preserve or
maintain.

If the endpoint sends the data, data will be consistently available in that
header. The data changes, of course.
[Med] I’m not sure I follow you here. What is meant by “consistent
availability” then? Do you mean the same channel/procedure to communicate the
information? Or “consistent data”?

I mean that if you define a protocol such that a well-formed message from the
client has the data the server needs, it will be consistently available. If
you rely on intermediate network devices to add the data, it may not be
available if there is no cooperating network device on path (e.g. if the DNS
resolver does not support the relevant EDNS0 option).

[Med] Thank you. Please clarify this in the draft. I had trouble parsing what
you meant by “consistent availability”. That said, there might also be
non-cooperating on-path devices that strip or alter the content of
client-supplied data (easy for HTTP, for example).

[Med] Resources are not necessarily restricted to CPU or disk; they may include
access to the service itself (e.g., downloading a file when a quota per source
address is enforced). It can be whatever the servers consider to be critical
for them; it is up to the service designer to characterize it. The NEW wording
proposed above is technically correct. Please reconsider adding it to the draft.

I did consider it, but I continue to believe that it moves the needle too far
into simple server preference. I retained the original PSAP language in -07 as
a result.
[Med] Emergency is only an example; other services may exist that impose the
same trust model.

I think there is a qualitative difference between situations in which the
resources at risk are human lives and those where they are host resources.
[Med] I agree with you as an individual. But, it is not up to us to mandate
this condition for executing services. It is up to the (protocol)
designers/service providers to decide what is critical/key for their service
operation.
That's why the carve out was limited in the GEOPRIV case.
[Med] GEOPRIV is not the only protocol/service that is concerned with human
lives; consider vehicular networking, which trusts the information shared
by the infrastructure. I prefer neutral wording that cites emergency as an
example.

I also added a note about your extensive review. While you and I clearly have
some differences of view, the document has gotten better from your engagement
with it, and I appreciate your efforts.
[Med] I reviewed -07. Although it is better than -05, I still don’t think it
is ready to be published as is. Thank you for your effort.
And thank you for yours,
regards,
Ted

