Re: the gap regarding Archived-At


Hello Keith,

I'm working on another version of my draft right now.
I'll address your points below.

At 05:38 04/10/29, Keith Moore wrote:
>
>okay, I find myself coming to some somewhat uncomfortable conclusions:
>
>1. There really are significant differences between
>    a) how email clients would use archived-at,
>    b) how humans would use it to cut-and-paste into HTML, and
>    c) how web browsers would use it.
>
>    These differences are significant enough that I'm wondering if we
>    really do want separate archived-at fields for web use and email use,
>    or tags on the archived-at field that indicates whether the message
>    is in original format and/or if other messages in the same collection
>    can also be accessed.

I think a single header can go a long way. If it turns out
that it's not enough, we (you?) can always create another one.


>2. For archived-at to be useful by email clients generally requires
>    one or two of the following, in addition to the obvious client
>    support for the protocol and server support for the message format:
>
>    - mail archives that support IMAP access (or possibly NNTP)

The Archived-At header could easily point to such an archive. Or,
because IMAP is more about collections, the pointer would actually
belong in the List-Archive header, which we already have.

>    - a specification for making collections of mail messages available
>      via HTTP (maybe WebDav) and/or FTP

Things could be as simple as defining an HTML 'rel' value to
point from an archived message in HTML format to the whole archive,
or using List-Archive and doing a robot-like search on URIs that
have the same path as the URI in List-Archive. You are of course
free to write such a specification.
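
To make the "same path" idea a bit more concrete, here is a minimal
sketch in Python. It only shows the test a robot might apply to decide
whether a URI belongs to the archive named in List-Archive. The function
name and the example.org URIs are made up for illustration, and treating
the List-Archive path as a directory prefix is my assumption, not
something any draft specifies.

```python
from urllib.parse import urlsplit


def in_same_archive(message_uri: str, list_archive_uri: str) -> bool:
    """Return True if message_uri lies under the List-Archive URI's path.

    Hypothetical helper: the prefix rule is an assumption for illustration.
    """
    msg = urlsplit(message_uri)
    arc = urlsplit(list_archive_uri)
    # Scheme and host must match (compared case-insensitively).
    if (msg.scheme.lower(), msg.netloc.lower()) != (arc.scheme.lower(), arc.netloc.lower()):
        return False
    # Treat the archive path as a directory prefix, so that
    # "/Archives/list" does not match "/Archives/listing/...".
    prefix = arc.path if arc.path.endswith("/") else arc.path + "/"
    return msg.path.startswith(prefix)
```

A robot would start from the List-Archive URI, follow links, and keep
only those for which in_same_archive() returns True.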

>    - mail archives that follow the aforementioned specification

I don't expect W3C to provide IMAP access to archives soon, but
there is quite a chance that we will provide negotiated access
to message/rfc822 in the nearer future (can't promise anything
at the moment, but things are looking better than they have in
the past).

>    and as much as I'd like to believe that email client vendors would
>    enthusiastically add support for these, market conditions don't seem
>    to favor supporting new functionality standards in email clients
>    right now.

I agree it's a tough climate now, but Web browsers were in a similar
situation a couple of years ago and things are improving now, so
all hope is not lost.

>3. The obvious compromise that makes sense in the short term (let
>    archives be in other formats besides message/rfc822, and don't
>    require message/rfc822 support) is harmful in the long term.

I don't see that. At W3C, we currently don't have message/rfc822
support, but we may be able to add it rather soon, after having
served our archives as HTML only for years.


>Best compromise I see at this point:  Define some sort of
>keywords for archived-at.  e.g.
>
>Archived-at: "<" URI ">" *(";" keyword [ "=" value ] )
>
>where keywords might include
>
>"native"  message available in native message/rfc822 format
>            (either because that's the only format available
>                 at this URI or via content-negotiation)
>
>"collection"      other messages in the same collection are also
>            accessible, where the collection is defined by
>            the IMAP folder, NNTP newsgroup, FTP directory,
>            WebDav collection, etc. indicated by the URI.
>
>            the value associated with this keyword would
>            indicate the name of the collection associated
>            with the URI (since a message might appear in
>            more than one collection)

This is just totally against Web architecture. Ever seen a link
that tells you beforehand where exactly it's going? You might
be able to guess some things, e.g. from file extensions, but
you really only know when you get it (or when you do an HTTP
HEAD if you want to be cautious). Still, the Web works amazingly
well.


>This would (a) let tools record locations of ordinary HTTP/HTML archives
>such as are (too) often the only format available today,

Yes, that's the primary purpose of Archived-At as used today.

>(b) encourage
>archives to provide messages in their original format without requiring
>them to do so,

The current draft already does so: the original format and
the fact that it doesn't lose information are explicitly
mentioned.

>(c) give implementors a hint that better functionality
>can be had,

See above.

>(d) give email readers a clue as to whether they could
>actually make use of the URI internally or whether they needed to pass
>it to a separate browser (this is a common problem in HTML also) and

HTTP HEAD can easily take care of that. Adding hints about what
may be on the other end to URIs hasn't been done up to now, and
for a very good reason: it would make it impossible to update the
archived representation. As an example, imagine W3C added an
"HTMLonly" value to each of the messages going through one of our
mailing lists. This would reduce the incentive for us to also offer
message/rfc822, because even if we did (and we keep the original
data for all of our archives, that's not the problem), the email
agents out in the field would see "HTMLonly" and not try to get
the available information.
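
As a rough illustration of the HEAD approach, here is a small Python
sketch of what an email agent might do. It builds a HEAD request that
asks for message/rfc822 with HTML as a fallback, and then decides from
the returned Content-Type whether it can handle the message itself. The
function names, the q-value, and the decision rule are assumptions made
for this example only; nothing here is taken from the draft.

```python
from urllib.request import Request


def build_probe(uri: str) -> Request:
    """Build a HEAD request that prefers message/rfc822 over HTML.

    Hypothetical helper; the Accept value is just one plausible choice.
    """
    return Request(
        uri,
        headers={"Accept": "message/rfc822, text/html;q=0.5"},
        method="HEAD",
    )


def handle_internally(content_type: str) -> bool:
    """Return True if the Content-Type says the original message is served."""
    # Drop parameters such as "; charset=..." and compare case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "message/rfc822"
```

An agent would send build_probe(uri) with urllib.request.urlopen(), pass
the response's Content-Type header to handle_internally(), and hand the
URI to a browser when the answer is False. Because the check happens at
retrieval time, it keeps working if an archive adds message/rfc822 later.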

>(e)
>leave room for expanded functionality without needing to define a new
>header field.  And if you don't use any keywords, they don't take up
>space.

I don't mind another header if it really brings new functionality.

>It might be better to defer the definition of any keywords to a
>separate document, because I can imagine needing to define specifics
>of how collections look within the context of various protocols.
>I can also imagine needing to define things like collections that
>consist of a directory of several mbox files, each containing
>multiple messages.

Yes. Why don't you go ahead with a draft on collections? But
I think the pointer to a collection should not come from
Archived-At, but either from List-Archive or from some information
in the retrieved message or its (e.g. HTTP) header.

Regards, Martin.