Re: the gap regarding Archived-At


On Thu October 28 2004 16:38, Keith Moore wrote:


okay, I find myself coming to some somewhat uncomfortable conclusions:

1. There really are significant differences between 
   a) how email clients would use archived-at,
   b) how humans would use it to cut-and-paste into HTML, and
   c) how web browsers would use it.


Incidentally, Keith's message (which contains no indication that
it was copied to Martin) is archived at
http://www.imc.org/ietf-822/mail-archive/msg05043.html

The discussion to date hasn't really addressed a or c; it
has focused on b.  As Arnt has noted, use by humans does
not require a structured field at all. For human use as
discussed so far, I can see no disadvantage to the following
alternative means of communicating information about
archived copies (using the URI for the copy of Keith's
message as an example):

1.  Comments: archived at
       http://www.imc.org/ietf-822/mail-archive/msg05043.html
   The Comments field has been around since at least RFC 724
   and therefore is more likely to be displayed by existing MUAs
   than Archived-At.  A message may contain an arbitrary
   number of Comments fields.  However, it does not solve the
   problem of the incompatibility with RFC 2046 message/partial.

2. Message-ID:
      <20041028163850.32868cae.moore@cs.utk.edu>
      (archived at http://www.imc.org/ietf-822/mail-archive/msg05043.html)
  Parenthesized comments are provided for in Message-ID fields.
  This method puts the archive information in the same place as
  the message identifier, and is fully compatible with message/partial
  fragmentation and reassembly.  However, it may be slightly more
  difficult to add (e.g. at a list expander) to a message which already
  has a Message-ID field (compared to adding a new Comments or
  Archived-At field).

Both methods use existing mechanisms explicitly designed for human
use. Because neither method involves content that is subject to
automatic retrieval, both avoid the security trap implicit in
(possibly automated use of) Archived-At.
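
To illustrate the difference in effort at a list expander, here is a
rough sketch in Python using the standard email package. The sample
message, its addresses and its Message-ID are placeholders of mine, and
the two helper functions are hypothetical names, not part of any
existing list software.

```python
from email import message_from_string

ARCHIVE_URI = "http://www.imc.org/ietf-822/mail-archive/msg05043.html"

# Hypothetical incoming message; addresses and Message-ID are placeholders.
RAW = (
    "From: author@example.org\n"
    "To: list@example.org\n"
    "Subject: test\n"
    "Message-ID: <1234@example.org>\n"
    "\n"
    "body\n"
)


def add_comments_field(msg, uri):
    # Method 1: append a new field; existing fields are left untouched.
    msg["Comments"] = "archived at " + uri


def add_msgid_comment(msg, uri):
    # Method 2: the existing Message-ID field has to be found and rewritten.
    old = msg["Message-ID"]
    if old is None:
        raise ValueError("no Message-ID field to annotate")
    msg.replace_header("Message-ID", "%s (archived at %s)" % (old, uri))


m1 = message_from_string(RAW)
add_comments_field(m1, ARCHIVE_URI)

m2 = message_from_string(RAW)
add_msgid_comment(m2, ARCHIVE_URI)

print(m1["Comments"])
print(m2["Message-ID"])
```

The first helper only appends; the second must read, modify and replace
an existing field, which is the extra work mentioned under method 2.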

   These differences are significant enough that I'm wondering if we
   really do want separate archived-at fields for web use and email use,
   or tags on the archived-at field that indicates whether the message
   is in original format and/or if other messages in the same collection
   can also be accessed.


That information may be implicit in the specific URI scheme indicated.
An imap scheme certainly indicates that the format is message/rfc822
and implies that other messages may be accessible.

2. For archived-at to be useful by email clients generally requires
   one or two of the following, in addition to the obvious client
   support for the protocol and server support for the message format:

   - mail archives that support IMAP access (or possibly NNTP)


Accessing such an archive requires access credentials, which
would have to be communicated (probably possible within
the URI).

   - a specification for making collections of mail messages available
     via HTTP (maybe WebDav) and/or FTP


multipart/digest is an existing media type which *is* a collection
of messages, and which can be transferred via HTTP and/or FTP.
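
As a sketch of what that looks like in practice, the following Python
(standard email package) wraps a few messages in a multipart/digest and
unpacks them again. The function names and the two sample messages are
mine, for illustration only; how an archive would actually publish such
a digest is not addressed here.

```python
from email import message_from_string
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart


def make_digest(raw_messages):
    # Each message becomes one message/rfc822 part of the digest.
    digest = MIMEMultipart("digest")
    for raw in raw_messages:
        digest.attach(MIMEMessage(message_from_string(raw)))
    return digest


def split_digest(digest):
    # A message/rfc822 part holds exactly one embedded message.
    return [part.get_payload(0) for part in digest.get_payload()]


# Hypothetical sample messages.
RAW_MESSAGES = [
    "Subject: one\n\nfirst\n",
    "Subject: two\n\nsecond\n",
]

# Serialize and re-parse, as a client fetching the digest would.
fetched = message_from_string(make_digest(RAW_MESSAGES).as_string())
for message in split_digest(fetched):
    print(message["Subject"])
```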

[...]

   and as much as I'd like to believe that email client vendors would
   enthusiastically add support for these, market conditions don't seem
   to favor supporting new functionality standards in email clients
   right now.


There may be a more fundamental issue: IMAP (or NNTP) access
requires either handling a URI or configuration of the necessary
parameters which would be conveyed in a URI (host name, port,
user credentials).  Certainly MUAs that provide IMAP client
functionality could be used without any additional design, but
the user would need to get the parameters from somewhere and
configure his MUA(s) accordingly.  MUAs don't typically handle
generic URIs directly (mailto URIs are a notable exception), but
instead may hand them off to a browser (which in turn might
invoke an MUA for an imap scheme URI).
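
A minimal sketch of the URI-handling half of that, in Python: it pulls
the host name, port and user name out of an imap URI using the generic
urllib.parse splitter. The function name and the example URI are
hypothetical, and full IMAP URL syntax (authentication mechanism, UID
parameters and so on) is deliberately not handled.

```python
from urllib.parse import urlsplit

IMAP_DEFAULT_PORT = 143  # registered IMAP port, used when the URI omits one


def imap_settings(uri):
    """Return the account settings an MUA would need, taken from an imap URI."""
    parts = urlsplit(uri)
    if parts.scheme.lower() != "imap":
        raise ValueError("not an imap URI: %r" % uri)
    return {
        "host": parts.hostname,
        "port": parts.port or IMAP_DEFAULT_PORT,
        "user": parts.username,  # None when the URI carries no user name
        "mailbox": parts.path.lstrip("/"),
    }


# Hypothetical archive location, for illustration only.
print(imap_settings("imap://reader@archive.example.org/ietf-822"))
```

Without a URI, each of those values is something the user would have to
learn elsewhere and enter by hand.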

Best compromise I see at this point:  Define some sort of 
keywords for archived-at.  e.g.

Archived-at: "<" URI ">" *(";" keyword [ "=" value ] )

where keywords might include

"native"      message available in native message/rfc822 format

[...]

"collection"  other messages in the same collection are also

[...]
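
For a sense of the parsing effort involved, here is a rough Python
sketch for the proposed syntax. It assumes the field body has already
been unfolded and stripped of comments, and it does not handle quoted
values containing ";". The function name is hypothetical; nothing here
is taken from an existing MUA.

```python
import re

# "<" URI ">" followed by whatever remains of the field body.
_FIELD_RE = re.compile(r"^\s*<([^<>]*)>\s*(.*)$", re.DOTALL)


def parse_archived_at(body):
    """Split a field body into (uri, {keyword: value or None})."""
    m = _FIELD_RE.match(body)
    if m is None:
        raise ValueError("expected '<' URI '>'")
    uri = "".join(m.group(1).split())  # drop whitespace left by folding
    rest = m.group(2)
    if rest and not rest.startswith(";"):
        raise ValueError("expected ';' before keywords")
    keywords = {}
    for part in rest.split(";"):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # A keyword without "=value" is recorded with value None.
        keywords[name.strip().lower()] = value.strip() if sep else None
    return uri, keywords


print(parse_archived_at(
    "<http://www.imc.org/ietf-822/mail-archive/msg05043.html>"
    "; native; collection"
))
```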

I think the issue of a list of URIs might be revisited (and not
necessarily in a structured field), so that multiple access
methods to a given message could be specified (that does
not preclude multiple fields to indicate multiple places where
copies may be archived, except of course if comments in the
Message-ID field are used). So (a la RFC 2369 URI lists) a given
field could list URIs to access a particular message via http
(format indeterminate), imap, ftp (format indeterminate),
pop, nntp, etc.
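
A sketch of how a client might pick an access method from such a list,
in Python. It assumes the RFC 2369 convention of comma-separated URIs in
angle brackets; the function name and the imap URI are hypothetical.

```python
import re
from urllib.parse import urlsplit

_ANGLE_URI_RE = re.compile(r"<([^<>]*)>")


def uris_by_scheme(field_body):
    """Group the bracketed URIs in a field body by URI scheme."""
    result = {}
    for raw in _ANGLE_URI_RE.findall(field_body):
        uri = "".join(raw.split())  # drop folding whitespace inside brackets
        scheme = urlsplit(uri).scheme.lower()
        result.setdefault(scheme, []).append(uri)
    return result


# The http URI is the one used above; the imap URI is a made-up example.
FIELD = (
    "<http://www.imc.org/ietf-822/mail-archive/msg05043.html>,\n"
    " <imap://archive.example.org/ietf-822/;UID=5043>"
)

print(uris_by_scheme(FIELD))
```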

I'd rather avoid anything resembling RFC 2231 parameters if
at all possible; you probably don't really want to have to deal
with
   Archived-At: <http://www.imc.org/ietf-822/mail-archive/msg05043.html>
     ; native*2*=%F2%F0%F0%F4%40 ;
     native*0*=EBCDIC-INT''%E3%A4%85k%40 
     ;native*1*= %F0%F4%40%C6%85%82%40 ;
     native*3*=%F2%F0%7A%F5%F7%7A%F5%F8%40
    ;native*4*=N%F0%F0%F0%F0
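
To make the burden concrete, here is a rough Python sketch of what
reassembling just the parameter above involves: collecting the numbered
segments, putting them in order, splitting the charset and language off
segment 0, and percent-decoding. The function name is hypothetical.
Python has no codec named EBCDIC-INT, so the final decode substitutes
cp037 purely as an assumption; the two charsets may not agree for every
character.

```python
import re
from urllib.parse import unquote_to_bytes

# Parameter text copied from the example above (everything after the URI).
RAW_PARAMS = (
    "; native*2*=%F2%F0%F0%F4%40 ;"
    " native*0*=EBCDIC-INT''%E3%A4%85k%40"
    " ;native*1*= %F0%F4%40%C6%85%82%40 ;"
    " native*3*=%F2%F0%7A%F5%F7%7A%F5%F8%40"
    " ;native*4*=N%F0%F0%F0%F0"
)

# name "*" section-number "*"  (extended, continued parameter)
_SEGMENT_RE = re.compile(r"^([^*=\s]+)\*(\d+)\*$")


def reassemble(raw):
    """Return {name: (charset, value_bytes)} for continued parameters."""
    segments = {}
    for part in raw.split(";"):
        name, sep, value = part.partition("=")
        m = _SEGMENT_RE.match(name.strip())
        if not sep or m is None:
            continue
        numbered = segments.setdefault(m.group(1).lower(), {})
        numbered[int(m.group(2))] = value.strip()
    result = {}
    for name, numbered in segments.items():
        # Only segment 0 carries the charset'language' prefix.
        charset, _language, first = numbered[0].split("'", 2)
        ordered = [first] + [numbered[i] for i in sorted(numbered) if i != 0]
        result[name] = (charset, b"".join(unquote_to_bytes(s) for s in ordered))
    return result


charset, data = reassemble(RAW_PARAMS)["native"]
print(charset, data)
# Assumption: cp037 stands in for EBCDIC-INT, which Python does not provide.
print(data.decode("cp037"))
```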

[...]

 And if you don't use any keywords, they don't take up
space.


The code needed to parse them takes space in the MUA.