ietf-822

Philosophy

1991-04-27 17:21:36

I think rfc-xxxx, or an associated architecture RFC, should say something
about the philosophy of Content-type and Content-encoding -- particularly
as we are inviting other people to define their own.

I'm sure someone who has done Information Theory 101 will do a better
job than me. I hope when they do that they don't make it too technical.
Anyway I'll start the ball rolling.

The main objective of e-mail is human communication, and my initial
remarks apply to that aspect of rfc-xxxx. File transfer is discussed
later.

In very general terms it is clear that the Content-type is what the user
is interested in. The Content-encoding is information which, at least in
an ideal world, the user doesn't want to know about and shouldn't have
to know about. One of the aspects of the discussion on the list is whether
there are things we want to put in which don't fall in either camp.

I used to think that ISO2022 was just an encoding of Japanese so we should
have
        Content-type: Japanese
        Content-encoding: ISO2022, BASE64

It seems logical: the bytes are encodings of glyphs which are encodings
of a human language. But then you ask the key philosophical question:
how many steps away from the control of the mail user agent should we
go in specifying the Content-type? I think the answer is clear [and is the
one arrived at in the RFC-XXXX draft].

    The Content-type should represent exactly one step away from the
    mail user agent's control.

So the step of interpreting the glyphs as Japanese is outside the control
of the UA, so it would be wrong to have a Content-type of Japanese.

To put it another way:

   The Content-type describes the state of the message when it is just
   outside the UA, and when it is just inside, and the relation between
   the two.

So we see that

        Content-type: ISO2022

is exactly the right level. It describes the glyphs on the outside (whose
further interpretation is outside the control of the UA) and the corresponding
sequence of bytes "just" on the inside of the UA, after all encodings have
been removed.
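
As a rough illustration of that boundary, here is a minimal Python sketch.
It assumes that encodings are listed in the order they were applied (so
they are removed last-first), and that "ISO2022" here means the ISO-2022-JP
variant; the function name and decoder table are hypothetical, not anything
from the draft.

```python
import base64

# Hypothetical table of encoding removers; only base64 is sketched here.
DECODERS = {
    "base64": base64.b64decode,
}

def remove_encodings(body: bytes, content_encoding: str) -> bytes:
    """Undo each listed encoding, assuming the last one listed was applied last."""
    names = [n.strip().lower() for n in content_encoding.split(",") if n.strip()]
    for name in reversed(names):
        body = DECODERS[name](body)
    return body

# Sender side: Japanese text -> ISO-2022-JP bytes -> base64 for transport.
text = "\u65e5\u672c\u8a9e"
wire = base64.b64encode(text.encode("iso2022_jp"))

# Receiver side: after all encodings are removed, the bytes "just inside"
# the UA are exactly what Content-type: ISO2022 describes.
inside = remove_encodings(wire, "BASE64")
print(inside.decode("iso2022_jp"))
```

Whether the glyphs that come out are Japanese is not something this code
knows or needs to know, which is the point of stopping at one step.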

Things in the Content-encoding header are things which would, in a perfect
world, never be seen by the user. I realise that current plans are to allow
for the imperfections of the world, and I'm sure that conservative policy
is correct. But we're talking philosophy here, and I think that intention
describes exactly what we want to classify as Content-encodings. Once again
the draft RFC-XXXX is exactly right in these philosophical terms.

Attached Files
--------------

I think we can apply the above philosophical position to work out how to handle
attached files.

Consider the sort of example that is being bandied about:

        Content-type: tar
        Content-encoding: compress, base64

Putting compress in the Content-encoding means we don't want the user to have
to know about it normally. So it means that the message will be automatically
uncompressed. Are you sure that's what you want? I think the recipient is
quite likely to want to keep it in a compressed file for a week till he has
time to look at it: then he'll want to have a look inside first with "tar t",
then maybe create a directory for it and run "zcat | tar x" and I doubt if
he ever wants to have the tar file in uncompressed form on his system. 
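
The "look inside first" step can be sketched in Python without ever writing
an uncompressed tar file. One loud assumption: the standard library does not
read compress (.Z) files, so gzip stands in for compress below; the archive
contents and helper names are made up for the example.

```python
import io
import tarfile

def make_archive() -> bytes:
    """Build a small gzip-compressed tar in memory (stand-in for a .tar.Z)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        data = b"--- a/foo.c\n+++ b/foo.c\n"
        info = tarfile.TarInfo(name="patches/foo.diff")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def list_members(compressed: bytes) -> list:
    """Rough equivalent of 'zcat | tar t': stream-decompress and list names."""
    with tarfile.open(fileobj=io.BytesIO(compressed), mode="r|gz") as tar:
        return [member.name for member in tar]

archive = make_archive()
names = list_members(archive)
print(names)
```

The compressed bytes are what the recipient keeps; decompression happens
only in passing, while the listing is produced.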

I can _imagine_ a mail interface for tar Content-types which would be quite
nice and let you do all those things above. Is that what we are talking about?
I don't think so [and if we were it would be a candidate for registering
with IANA, not putting it in the base RFC]. If the "Content-type: tar" is just
information for the user then it is quite wrong, because it is only "tar" at
two steps away from the UA. The correct form is:

        Content-type: stream-file; original-name.tar.Z
        Content-encoding: base64
        Subject: Here's a compressed tar of those patches -- enjoy.

[I say stream-file (=unix-file) because, for example, the way in which
a stream of bytes would be used to represent a vms-file is not obvious.
Whichever way it is done vms-file would be a different Content-type to 
stream-file.  I say stream-file instead of unix-file because its simple 
(and sensible) system is shared by a number of other OSs.]
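
A minimal sketch of what a UA might do with such a part, assuming the
parameter after the semicolon is simply the original file name as written
in the example above. The function name is hypothetical. Only the base64
layer is removed; the compressed bytes are handed over untouched.

```python
import base64

def unpack_stream_file(content_type: str, body: str) -> tuple:
    """Return (original_name, file_bytes) for a stream-file part."""
    kind, _, name = content_type.partition(";")
    if kind.strip().lower() != "stream-file":
        raise ValueError("not a stream-file part")
    # Remove the transport encoding only; do not uncompress or untar.
    return name.strip(), base64.b64decode(body)

# Placeholder bytes standing in for a real compressed tar file.
payload = b"\x1f\x9d\x90placeholder"
body = base64.b64encode(payload).decode("ascii")

name, data = unpack_stream_file("stream-file; original-name.tar.Z", body)
print(name, len(data))
```

What the recipient then does with original-name.tar.Z is his own business,
one step further out than the UA.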

---------------------------------------------------------------------

Well, if you disagree with the conclusion I suggest you have a stab at the
philosophy you want before bursting into print. I've been wrestling with
this for days.

Bob Smart
