ietf-openproxy

Re: End to End thoughts [long]

2000-09-20 15:35:17
I've been trying to get a handle on how to express the problems 
I have with the framing of the end2end principle in this context.
In the discussion below, Hilarie and Mark have obliquely pointed out
where the problem lies:  the distinction between service and protocol.

The end2end principle derives from one of the most fundamental aspects
of datagram networking, and it boils down to the notion that the end
points of a network connection should be the only ones who need to
care about the connection itself.  The intermediate devices should be
able to examine the datagram and know what to do (forward it
somewhere, almost always), but they should not have to care about the
connection between the two end points.

The problem arises with the conflation of that network connection with
a particular service provided on top of the network.  It's an easy
problem to have, because services have been historically provided at
specific network nodes.  Many URL schemes (and DNS names in general)
resolve to specific nodes and specific ports, and the reason that so
many people insert their service between the DNS request and the
resolution is that that step is the closest one we have historically
had to a service layer addressing scheme (yes, yes, SRVLOC
exists--don't jump ahead).  Once that resolution has taken place,
though, most things boil down to a service layered on an end-to-end
network connection.
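
(To make that resolution step concrete: the trivial Python sketch below is
mine, purely for illustration, with example.com standing in for any
published name.  Everything a generic client learns from a URL is a scheme,
a host, and a port; the host then resolves to whatever addresses the
publisher's DNS chooses to hand back.)

  # All a generic client knows about "where the service is" comes out of
  # this resolution step.  example.com is a stand-in for any published name.
  import socket
  from urllib.parse import urlsplit

  url = "http://example.com/some/resource"
  parts = urlsplit(url)
  host = parts.hostname              # example.com
  port = parts.port or 80            # default port for the http scheme

  for family, _, _, _, sockaddr in socket.getaddrinfo(
          host, port, proto=socket.IPPROTO_TCP):
      print(sockaddr)                # the node and port the name resolves to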

HTTP is a classic example of this: a unicast request/response protocol
designed to make a local copy of a network-addressable resource.
Making a local copy of a network-addressable resource is a pretty
easy-to-describe service.  The scaling problems with using that same
unicast request/response protocol for every potential service on the
network, though, have meant that more and more has been added to the
protocol to support new service requirements or better support the
scale of the existing service.  As those got added in, that service
ceased to be quite so easily layered on an end-to-end network
connection.

One early result of this was that the network addressable resource
named in the HTTP URL might be provided by a node other than the node
to which the address resolved; this was only the case, though, when a
requestor had designated another node (a caching proxy) as a potential
intermediate.  Now service providers using HTTP also designate other
nodes as potential agents (via surrogates and CDNs).  With the service
now being potentially provided through a chain that looks like:

client--->caching proxy--->surrogate--->origin

origin--->surrogate--->caching proxy--->client

it's easy to see that the end2end connection between client and resource
is not a necessary part of the service provision, even if the service
provided is the same old making a local copy of a network addressable
resource.
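
(A small, purely illustrative Python sketch of how that chain is supposed to
show up at the protocol level: each HTTP/1.1 intermediary appends itself to
the Via header as it forwards the message.  The host names are invented.)

  # Toy model of the origin--->surrogate--->caching proxy--->client path,
  # with each intermediary adding a Via entry (RFC 2616, section 14.45).
  # Host names are invented for illustration.

  def forward(headers, protocol_version, received_by):
      """Simulate an intermediary forwarding a message: add a Via entry."""
      entry = "%s %s" % (protocol_version, received_by)
      if "Via" in headers:
          headers["Via"] = headers["Via"] + ", " + entry
      else:
          headers["Via"] = entry

  response = {"Server": "origin.example.com"}
  forward(response, "1.1", "surrogate.cdn.example.net")
  forward(response, "1.1", "proxy.isp.example.org")

  print(response["Via"])
  # 1.1 surrogate.cdn.example.net, 1.1 proxy.isp.example.org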

There were, however, things you got for free when the client and the
origin were the only parties to the service: freshness guarantees,
some weak authentication, etc.  Now all of those have to be made
explicit.  As Joe Touch has said, you can't substitute the aggregate
of the hop-by-hop equivalents for an end2end connection.  TLS between
a client and a proxy, between the proxy and the surrogate, and between
the surrogate and the origin does not add up to TLS between the client
and the origin.
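
(To put that in miniature, here is a sketch of mine that uses Python's hmac
module as a stand-in for real transport security; the shape of the argument
is the same.  Each hop re-creates its own protection, so a hop can rewrite
the payload and still hand the next hop something that verifies; only a
check keyed end-to-end between client and origin notices the change.)

  # Toy stand-in for the TLS point above (stdlib hmac, not real TLS).
  # The keys and payload are invented for illustration.
  import hashlib
  import hmac

  def mac(key, payload):
      return hmac.new(key, payload, hashlib.sha256).hexdigest()

  e2e_key = b"known only to client and origin"
  hop_key = b"proxy <-> surrogate link key"

  payload = b"Cache-Control: private"
  e2e_tag = mac(e2e_key, payload)          # origin protects it end-to-end

  payload = b"Cache-Control: public"       # a hop rewrites it in transit
  hop_tag = mac(hop_key, payload)          # and re-protects its own hop

  print(hop_tag == mac(hop_key, payload))  # True: the hop-by-hop check passes
  print(e2e_tag == mac(e2e_key, payload))  # False: the end-to-end check fails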

As Vladis, Keith, and others have pointed out, one of the results of
this is that modification can take place without the principal parties
to the service being aware of that modification.  That's bad, and we
all know it.  What we don't have a clear handle on is how to fix
the problem without ripping out what we have now.  I hope we can
agree that we must substitute a service-based view of the applications
for a connection-based view if we are to meet the current need.

On a related note, I believe that one barrier to this at the moment is
the continued use by the W3C of URLs as stand-alone identifiers.  As
noted above, in some applications the resource named by a URL may be
supplied by some other agent in the service; knowing that, the W3C has
concluded that it is appropriate to use URLs as identifiers without
reference to whether the resource is or is not available at that URL.
Continuing to conflate a location and a resource once we have reached
this point creates many more problems, in my view, than it can possibly
solve.  Working to create service addressing mechanisms that do
not derive from network location seems to me a critical step to moving
beyond application designs which presume the wrong things about
service delivery.

                        regards,
                                Ted Hardie

PS: If anyone reads the above as being a purist preaching, please
understand that I am as big a sinner here as there is among us, and I
am not trying to disavow the cruft for which I am personally
responsible.  


On Wed, Sep 20, 2000 at 11:03:51AM -0600, Hilarie Orman wrote:
> [...]
> I'd be quite happy to have application headers contain directives
> to intermediaries describing the semantics (esp. "stateless; copyrighted;
> cacheable", "multipart; modifiable by insertion", "opaque") and
> authorizations ("customizable by enduser", "insertions by brokers").
> However, it's more realistic right now to accomplish this by context,
> environment, and what amount to service level agreements about content
> between publishers and distributors.

I very much agree. Right now, I think we're at a point where we can try to
get a handle on what's acceptable without some form of negotiation.
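
(Purely to give Hilarie's suggestion above a concrete shape: the header
names and values in this sketch are invented, and no such headers are
defined in any specification today.)

  # Entirely hypothetical extension headers, sketched from the directives
  # and authorizations quoted above; the names and values are invented.
  directives = {
      "X-Content-Semantics":     "multipart; modifiable by insertion",
      "X-Content-Authorization": "insertions by brokers; customizable by enduser",
  }

  def may_insert(headers, actor):
      """Would an intermediary acting for 'actor' be allowed to insert content?"""
      auth = headers.get("X-Content-Authorization", "")
      return ("insertions by %ss" % actor) in auth

  print(may_insert(directives, "broker"))    # True
  print(may_insert(directives, "enduser"))   # False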


> I agree that there are problems associated with modifiable content,
> but there are also huge advantages.  Some things really should look
> different depending on where they are in the network.  It all depends
> on what you mean by content and what the intent of the publisher
> is.  To take a minor example, in traditional publishing the consumer
> doesn't get to specify the color or size of the paper and the text fonts;
> with browsers and HTML, the user can override the publisher's
> specification.  In the physical world, such tricks may well be illegal.

I won't dispute the benefits, but this does break some assumptions in
the HTTP and URI specifications. We need to deal with that.


> We need to assure that publishers and consumers can specify and
> resolve their rights and preferences in protocols so that intermediaries
> can exercise their capabilities in accordance with their expectations.


Here's a more concrete example of the kind of problem I'm talking about:

The W3C's P3P working group is putting together a platform which allows
specification of an XML privacy policy based on a URI. The major mechanism
for this is a metadata file in a well-known location at the origin server
which dictates how policies are applied to resources.
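
(To sketch that mechanism from the client side: the element names and the
well-known /w3c/p3p.xml location below follow the current drafts, but treat
the details as illustrative rather than normative.)

  # The origin publishes a policy reference file at a well-known location,
  # and a client maps a request path onto a policy by matching its INCLUDE
  # rules.  Nothing in this lookup can see what happened in the network.
  import xml.etree.ElementTree as ET
  from fnmatch import fnmatch

  POLICY_REFERENCE_FILE = """
  <POLICY-REFERENCES>
    <POLICY-REF about="/privacy/policy.xml">
      <INCLUDE>/catalog/*</INCLUDE>
      <INCLUDE>/index.html</INCLUDE>
    </POLICY-REF>
  </POLICY-REFERENCES>
  """

  def policy_for(path):
      refs = ET.fromstring(POLICY_REFERENCE_FILE)
      for ref in refs.findall("POLICY-REF"):
          for inc in ref.findall("INCLUDE"):
              if fnmatch(path, inc.text):
                  return ref.get("about")
      return None                           # no policy claims this resource

  print(policy_for("/catalog/item42"))      # /privacy/policy.xml
  print(policy_for("/inserted/ad.gif"))     # None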

If a proxy inserts ads, rewrites URIs, or does something else to change the 
privacy attributes of the resource, the privacy policy is invalid. If the
content provider doesn't know about this, they're making privacy guarantees
for resources that are beyond their control.

P3P is unaware of the potential for modification of an object in the
network, because a URI points to an identifiable resource, and semantic
transparency is assumed in the HTTP.

While the HTTP doesn't specifically forbid lack of transparency, it does put
limitations on how it may happen:

  Requirements for performance, availability, and disconnected operation
  require us to be able to relax the goal of semantic transparency. The
  HTTP/1.1 protocol allows origin servers, caches, and clients to explicitly
  reduce transparency when necessary. However, because non-transparent
  operation may confuse non-expert users, and might be incompatible with
  certain server applications (such as those for ordering merchandise), the
  protocol requires that transparency be relaxed

   * only by an explicit protocol-level request when relaxed by client or
     origin server

   * only with an explicit warning to the end user when relaxed by cache or
     client

There are many assumptions in efforts like P3P that will break when you
break semantic transparency.
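
(For what it's worth, the "explicit warning" the quoted text requires does
exist at the protocol level: a cache or proxy that transforms an entity is
supposed to add a Warning header with code 214, "Transformation applied"
(RFC 2616, section 14.46).  A minimal sketch, with an invented proxy name:)

  # Minimal sketch of the explicit warning HTTP/1.1 asks of a transforming
  # intermediary: add "214 Transformation applied" (RFC 2616, 14.46).
  # The proxy host name is invented for illustration.

  def transform_and_mark(headers, body, transform):
      new_body = transform(body)
      if new_body != body:
          warning = '214 proxy.example.net "Transformation applied"'
          if "Warning" in headers:
              headers["Warning"] = headers["Warning"] + ", " + warning
          else:
              headers["Warning"] = warning
      return headers, new_body

  headers, body = {}, b"<html>original entity</html>"
  headers, body = transform_and_mark(
      headers, body, lambda b: b.replace(b"original", b"ad-inserted"))
  print(headers["Warning"])  # 214 proxy.example.net "Transformation applied"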

I haven't brought this up in the WG yet, because I wanted to get comments
from this group. 

-- 
Mark Nottingham
http://www.mnot.net/


