Re: Content Types

Excerpts from ext.ietf-822: 27-Aug-91 Re: Content Types Ned
Freed(_at_)hmcvax(_dot_)claremo (4009)

[quote 1 -- Ned implies that the type field describes the document class: ]

if I get something of type image/foo, where I have never heard of foo
before, it is nice to know that it is an image, and that trying to
display it on my VT100 is probably not going to work


[quote 2 -- Ned implies that the type field describes the standard
encoding class for the document format: ]

PostScript ... even has a binary form (level II) that does not encode
well as text.


I'm sorry to see that the type/sub-type distinction is still there,
because it muddies the waters considerably.  In particular, the
discussion indicates that people are confused about what the first field
of the `Content-Type' header is supposed to be used for.  Does it
indicate the particular document class of the message body (image,
formatted text, video) or does it indicate the encoding of the format of
that document (G3FAX, LaTeX, PAL)?  If it indicates the former, that
says that some composite format-types, such as Andrew, Interleaf, CDA,
etc., will show up in the sub-type field of several different types;
e.g., Interleaf would be perfectly eligible for either `text-plus' or
`image', depending on what it described.
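
To make that first reading concrete, here is a minimal Python sketch,
assuming the first field names the document class and the second names
the format.  The helper name split_content_type and the sample header
values are hypothetical, invented for illustration; they are not taken
from any draft.

```python
def split_content_type(value):
    """Split a 'type/subtype' value into (document_class, format)."""
    doc_class, sep, fmt = value.strip().partition("/")
    if not sep or not doc_class or not fmt:
        raise ValueError("expected type/subtype, got %r" % value)
    return doc_class.lower(), fmt.lower()


# Hypothetical header values: one composite format under two classes.
samples = ["text-plus/interleaf", "image/interleaf", "image/g3fax"]

# Collect every document class each format has been seen under.
classes_by_format = {}
for value in samples:
    doc_class, fmt = split_content_type(value)
    classes_by_format.setdefault(fmt, set()).add(doc_class)
```

Under this reading a composite format such as Interleaf maps to more
than one class, while a single-purpose format such as G3FAX maps to
exactly one.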

Some other formats -- such as TeX, Scribe, troff -- would show up only
under the `text-plus' type, assuming of course that troff+pic is
different from troff.

PostScript is perhaps more straightforward.  It is *simply* an image
format, even when it is sending images of textual documents.  (Yes, yes,
it compresses images of text documents in interesting ways that in some
encodings even allow the text to be retrieved from the image, and that
confuses some people :-).  But in important ways it is *not*
`text-plus-markup' as the true `text-plus' languages are.  However,
because it is also a programming language, it might equally appear
under the `application' type.

It would be better to regard the `type' subfield of the `Content-Type'
header as a field independent of the format of the message body.  It
would also be wise to regard the transport encoding as a field
independent of the format.  In particular, we must *not* assume that a
document format of `Andrew' implies a document class of `text-plus' and
we must *not* assume that a document format of `PostScript' implies a
transport format of 7-bit ASCII.  I'd rather see two fields,
`Document-class' and `Content-Type', but combining them into
`Content-Type' is OK as long as they are separate in our minds.
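
A rough Python sketch of what "separate in our minds" could look like,
with class, format, and transport encoding held as three independent
fields and combined only when the header is written.  The record name
BodyDescription, its field names, and the encoding labels `7bit' and
`binary' are all assumptions made for illustration, not part of any
proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BodyDescription:
    document_class: str      # what the body is: image, text-plus, video
    document_format: str     # how it is formatted: andrew, postscript
    transport_encoding: str  # how it travels (labels are assumptions)

    def content_type(self):
        """Combine class and format into a single header value."""
        return "%s/%s" % (self.document_class, self.document_format)


# The same format under two classes: Andrew does not imply text-plus.
andrew_memo = BodyDescription("text-plus", "andrew", "7bit")
andrew_drawing = BodyDescription("image", "andrew", "7bit")

# The same format with a non-ASCII transport: PostScript does not
# imply 7-bit ASCII.
binary_postscript = BodyDescription("image", "postscript", "binary")
```

Nothing in the record derives one field from another, so neither of
the two assumptions warned against above can creep in.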

[quote 3 -- Ned on composite formats: ]

We could invent a new top-level type that is used to represent things that
are composites, like PostScript, CDA, ODA, and Interleaf. However, I prefer
not to do this, but simply group each of these entities under the type that
it is most likely to be associated with.


One would certainly hope and expect that the use of composites for
documents will increase dramatically in the reasonably near future. 
Most of the non-UNIX world has been doing it for a few years now, and as
standards such as SGML, and products such as Interleaf and FrameMaker
proliferate in the UNIX world (and character-oriented TTYs die out),
UNIX users will join them.  It would be very short-sighted of us not to
have a document class for these things.

Bill