ietf-openproxy

Re: End to End thoughts [long]

2000-09-20 10:30:14
If you've already bought into NAT, then pointing to an interception
proxy as "evil" is simply the pot calling the kettle black.

I'd be quite happy to have application headers contain directives
to intermediaries describing the semantics (esp. "stateless; copyrighted; 
cacheable", "multipart; modifiable by insertion", "opaque") and authorizations 
("customizable by enduser", "insertions by brokers").  However, it's more
realistic right now to accomplish this by context, environment, and what
amount to service level agreements about content between
publishers and distributors.
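
To sketch the idea (the header names here are invented, nothing
standardized, just an illustration of the kind of directive I mean):

    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Semantics: cacheable, modifiable-by-insertion
    Content-Authorizations: customizable-by-enduser, insertions-by-brokers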

I agree that there are problems associated with modifiable content,
but there are also huge advantages.  Some things really should look
different depending on where they are in the network.  It all depends
on what you mean by content and what the intent of the publisher
is.  To take a minor example, in traditional publishing the consumer
doesn't get to specify the color or size of the paper and the text fonts;
with browsers and HTML, the user can override the publisher's 
specification.  In the physical world, such tricks may well be illegal.

We need to ensure that publishers and consumers can specify and
resolve their rights and preferences in protocols, so that intermediaries
can exercise their capabilities in accordance with those expectations.

Hilarie

Mark Nottingham <mnot@mnot.net> 09/18/00 09:14PM >>>

There's been a lot of talk in various places about end-to-end
problems, and their relation to intermediates (e.g., HTTP caching
proxies, surrogates, etc.).

Things have gotten somewhat religious. Rather than go down that road,
I'd like to examine the issues and separate them into distinct
problems.

* Network Layer

The end-to-end principle applies to *network* layer functions; it is
not meant to preclude an application-level gateway.

I _think_ that most everyone will agree that functions that break
end-to-end transparency at the network layer cause problems. In other
words,

                   Interception Proxies are Evil.

While it's nice to think that we can convince the world of this, it's
more realistic to find out why people use them, and give them a
better alternative.

To my knowledge, most people use them because they need an easy way
to ensure that Web traffic (or at least port 80) goes through a
proxy. While browsers have proxy auto-configuration, it hasn't changed
in quite a few years, there's no standard to implement, and it relies
on JavaScript.
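
For reference, a proxy auto-configuration (PAC) file is just a
JavaScript function; a minimal one looks something like this
(proxy.example.com is a placeholder, not a real host):

    function FindProxyForURL(url, host) {
        // Send all Web traffic through the proxy, and fall back to a
        // direct connection if the proxy is unreachable.
        return "PROXY proxy.example.com:8080; DIRECT";
    }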

WPAD was a good start; unfortunately, it never really went far
beyond Microsoft. Additionally, a more sensible, standard format for
the configuration file would help. On top of all of this, we need
participation from the browser vendors, or we won't get anywhere.

* Application Layer

All of this being said, HTTP intermediates *do* raise issues at the
application layer, which aren't end-to-end problems, but do get
confused with them, as the causes are similar.

Proxies and surrogates can do a number of things to requests and
responses as they flow through. I've been roughly classifying them
as:

- access control (e.g., blocking, filtering)
- request modification (URI rewriting) 
- response modification (transcoding)
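
To take the second of these as a concrete example (the host and paths
are invented): a rewriting proxy might silently turn

    GET http://www.example.com/page.html HTTP/1.1

into

    GET http://www.example.com/mobile/page.html HTTP/1.1

before the request ever reaches the origin server.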

These operations are often performed in the interest of one of several
different parties:
  
- content provider
- access provider
- user

There have been numerous papers and discussions about the interesting
things that can be done by intermediates. However, I don't think
there's enough attention being paid to the problems that enabling
these functions on intermediates can cause.

* HTTP

HTTP makes a primary assumption of semantic transparency - that
an entity body will not be changed 'in flight' - that, discounting
transfer-encodings and so forth, what the server sends in the
response is what the client gets.

Jeff Mogul brought up issues with entity identity, as expressed in
the ETag, in Lisbon. There was also general concern (it may have been
Jeff in particular, my memory isn't that good) that an operation
performed in one portion of the network may not be performed in
another, causing unpredictable behaviours.
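
As a sketch of the ETag problem (all values here are invented):
suppose a transcoding intermediary shrinks an image but passes the
validator through unchanged.

    What the origin server sent:

        HTTP/1.1 200 OK
        ETag: "abc123"
        Content-Type: image/png
        Content-Length: 48211

    What the client received:

        HTTP/1.1 200 OK
        ETag: "abc123"
        Content-Type: image/jpeg
        Content-Length: 9102

The same validator now names two different entities depending on
which path the request took, so conditional requests and cache
revalidation can quietly go wrong.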

There are lots of other little examples here that should be covered,
but I'll leave it for now.

* URI

One of the premises of URIs is that they refer to a specific,
identifiable resource, and that the authority for that resource has
control over it.

Many applications have dependencies on this. PICS, Robots Exclusion,
P3P, and other kinds of metadata systems break, sometimes
disastrously, when intermediates change the semantics and/or payload
of a message. The privacy implications are especially of concern.
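
To make that concrete with Robots Exclusion (paths invented for
illustration): the rules a publisher posts at /robots.txt describe
the URI space as the publisher sees it,

    User-agent: *
    Disallow: /private/

so an intermediary that rewrites request URIs into or out of that
space can defeat the rule without either the publisher or the robot
doing anything wrong.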

Overall, content providers have a reasonable expectation that the
object which they deliver will be the one that the user gets, unless
the user doesn't want it.


These effects are largely social and legal; however, they're enabled
by the technology that's being pushed out there. We need to reconcile
the capabilities of the products and frameworks that we're building
with the expectations that have been set by our base protocols.

It's interesting that Web caching seems to be losing ground to CDNs
so rapidly; to me, this illustrates the point perfectly. Web caching
was always performed in the interest of the access provider, not the
content provider (or arguably even the user). As a result, content
providers don't trust Web caches. What's it going to be like out
there when access providers can slip a transcoding, rewriting module
into a proxy on the fly?

One person I've talked with about this pointed out the legal issues
we had a while back with caches and copyright. There's much more
direct potential for problems here.


-- 
Mark Nottingham
http://www.mnot.net/
