Fwd: The ability to automatically upgrade a reference to HTTPS from HTTP

From TimBL on the public TAG list. Of possible relevance to several WGs
here.

---------- Forwarded message ----------
From: "Tim Berners-Lee" <timbl(_at_)w3(_dot_)org>
Date: Aug 22, 2014 11:50 AM
Subject: The ability to automatically upgrade a reference to HTTPS from HTTP
To: "Public TAG List" <www-tag(_at_)w3(_dot_)org>
Cc: "SW-forum Web" <semantic-web(_at_)w3(_dot_)org>

There is a massive and reasonable push to get everything from HTTP space
into HTTPS.
While this is laudable, the effect on the web as a hypertext system could be
very severe, in that links into http: space will basically break all over
the place.
Basically every link in the HTTP web we are used to breaks.

Here is a proposal, that we need this convention:

         If two URIs differ only in the 's' of 'https:', then they may
         never be used for different things.

That sounds like a double-negative way of putting it, but it avoids saying
things we don't want to mean.
I don't mean you must always serve up https or always serve up http.
Basically we are saying the 's' isn't a part of the identity of the
resource, it is just a tip.

So if I have successfully retrieved https:x  (for some value of x) and I
have a link to http:x then I can satisfy following the link, by presenting
what I got from https:x.
I know that whatever I get if I do do the GET on the http:x, it can't be
different from what I have.

The opposite however is NOT true, as a page which links to https:x requires
the transaction to be made securely.  Even if I have already looked up
http:x, I can't assume that I can use it for https:x.  But that is for
reasons of security alone -- it would still be against the principle if the
server did deliberately serve something different.
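
A minimal sketch of that one-way rule, in Python. The function name
cache_candidates and the example.org URLs are made up for illustration, and
explicit port numbers are deliberately left alone; it only shows which
already-retrieved URIs a client could use to satisfy a link.

```python
from urllib.parse import urlsplit, urlunsplit


def cache_candidates(link: str) -> list:
    """Return the URIs whose cached response may satisfy `link`.

    A link to http:x may be satisfied by what was retrieved from https:x,
    but a link to https:x may only be satisfied by https:x itself.
    """
    parts = urlsplit(link)
    if parts.scheme == "http":
        # Same URI with only the 's' added (hypothetical helper logic).
        secure = urlunsplit(parts._replace(scheme="https"))
        return [link, secure]
    # https: links (and any other scheme) are never downgraded.
    return [link]


if __name__ == "__main__":
    print(cache_candidates("http://example.org/x"))
    print(cache_candidates("https://example.org/x"))
```

The asymmetry lives in the single `if`: the secure copy is offered for an
http: link, and nothing is ever offered in the other direction.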

This means that if you have built two completely separate web sites in
HTTPS and HTTP space, and you may have used the same path (modulo the 's')
for different things, then you are in trouble. But who would do that?   I
assume the large search engines know who.

I suppose an exception for human readable pages may be that the http:
version has a warning on it that the user should be accessing the https: one.

With linked data pages, where a huge amount of the Linked Open Data cloud
is in http: space last time I looked, systems using URIs for identifiers
need to be able to canonicalize them so that anything said about http:x
applies equally to https:x.
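
As a rough sketch of such canonicalization, in Python: the names
canonical_key and same_resource are hypothetical, and folding to "http"
rather than "https" is an arbitrary choice made only so that both forms
map to one key. Everything other than the scheme, including any fragment,
is left untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_key(uri: str) -> str:
    """Fold http: and https: forms of a URI onto one comparison key."""
    parts = urlsplit(uri)
    if parts.scheme in ("http", "https"):
        # Arbitrary choice of canonical form; only the 's' is dropped.
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


def same_resource(a: str, b: str) -> bool:
    """True if the two URIs differ at most in the 's' of 'https:'."""
    return canonical_key(a) == canonical_key(b)


if __name__ == "__main__":
    print(same_resource("http://example.org/id/x#me",
                        "https://example.org/id/x#me"))
```

A store of statements keyed by canonical_key would then treat anything said
about one form as said about the other.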

What this means is that a client given an http: URL in a reference is
always free to try out the HTTPS, just adding an S, and use the result if
the request is successful.
Sometimes, if browser security prevents an https-origin web page from loading
any http resources, as Firefox proudly does [1], and you are writing a
general purpose web app which has to read arbitrary web resources with XHR,
ironically, you have to serve it over HTTP!     In the meantime, many
client libraries will I assume need to just try HTTPS as that is all they
are allowed.
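
The "try HTTPS first" behaviour for a client library might look something
like the following Python sketch. The names fetch_with_upgrade and
default_fetch are invented here, the 10 second timeout is an arbitrary
assumption, and falling back to plain http: is only possible for clients
that are allowed to make such requests at all.

```python
from typing import Callable, Tuple
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def default_fetch(url: str) -> bytes:
    """Plain GET using the standard library; timeout is an assumption."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def fetch_with_upgrade(
    url: str, fetch: Callable[[str], bytes] = default_fetch
) -> Tuple[str, bytes]:
    """Given an http: URL, try the https: form first and use it if it works.

    Returns the URL that was actually retrieved together with its body.
    https: URLs are fetched as given and never downgraded.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        secure = urlunsplit(parts._replace(scheme="https"))
        try:
            return secure, fetch(secure)
        except OSError:
            # Connection, TLS and HTTP errors from urlopen are all OSError
            # subclasses; fall through to the original http: URL.
            pass
    return url, fetch(url)
```

Passing the fetch function in as a parameter keeps the upgrade logic
separate from the transport, so a library restricted to HTTPS could supply
a fetcher that simply fails for http: URLs.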

Or do we have to only build serious internet applications as browser
extensions or native apps?

For this and many related reasons, we need to first get a very high level
principle that if a client switches from http to https of its own accord,
then it can't be given misleading data as a result.

I suspect this has been discussed in many fora -- apologies if the issue is
already noted and resolved, and do point to where it has.

TimBL

[1]
https://blog.mozilla.org/tanvi/2013/04/10/mixed-content-blocking-enabled-in-firefox-23/

In order for this switch to be made, transitions
