Re: Character-set header (was Re: Minutes of the Atlanta 822ext meeting)

1991-08-29 11:50:53
Excerpts from mail: 29-Aug-91 Re: Character-set header (w.. Keith
Moore(_at_)cs(_dot_)utk(_dot_)edu (3395)

So the fact that a character-set header exists for a body part doesn't mean
anything unless the specification for its particular content-type defines
the meaning of a character-set header.  If a body part header is included
that isn't defined in the specification for that content-type, it should be

Yes, I'm not arguing that you can't *invent* an algorithm for
interpreting a Character-set header, but merely that in any such
algorithm, there will be many cases in which the answer is "ignore the
character-set header."  This smacks of bad design.  

As Keith points out, the more headers you use to describe the contents,
the more complex it becomes to implement an external mechanism (the
mailcap approach) for interpreting the contents.  I suspect that other
implementation paths will similarly suffer from the increased
complexity.  The simplest approach, by far, is to have a SINGLE header
that describes the contents of the message.
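
To make the single-header point concrete, here is a rough sketch of a
mailcap-style lookup. The table contents, the viewer commands, and the
name pick_viewer are all hypothetical, made up for illustration; the
only idea taken from the discussion is that one header value is enough
to choose the interpreter.

```python
# Hypothetical mailcap-like table: one content-type value maps to one
# viewer command. The entries are illustrative, not from any real mailcap.
VIEWERS = {
    "text/plain": "cat %s",
    "image/gif": "xv %s",
}


def pick_viewer(content_type, table=VIEWERS):
    """Return the viewer command for a content type, or None if unknown."""
    # The whole decision hangs on a single string, so the lookup is a
    # plain dictionary access after trimming whitespace and folding case.
    key = content_type.strip().lower()
    return table.get(key)
```

With a second descriptive header, the key would have to become a pair
and the table would need rules for the combinations where one half is
to be ignored, which is the complexity being objected to here.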

This might suggest that I agree 100% with Mark Crispin, who wants to
fold even the Content-TransferEncoding into that single header.  Well,
it would be setting a bad precedent to agree with Mark 100%, so I won't
:-)  In fact, I think that Content-TransferEncoding should be separate
precisely because it does NOT describe the content, but rather describes
how the content was transformed in order to be sent through the mail. 
This is a very different thing.  It is totally irrelevant, for example,
in the hypothetical all-binary-mail world some people foresee in the
future.  It is also totally separate in the mailcap approach, where it
makes sense to "undo" any encodings *before* you pass the message body
off to the external interpreter.  Similarly, I could imagine other
Message Stores or User Agents that undo the encodings upon receipt of a
message, and never worry about it again after that.  In other words, I
see the encoding issue as fundamentally very separable from the
Content-type, and character sets are just not that separable.
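
A minimal sketch of the "undo the encoding first" ordering, under some
assumptions: the encoding names "base64" and "quoted-printable" and the
function names undo_encoding and deliver are mine, chosen for
illustration, and the interpreters table stands in for whatever the
mailcap mechanism would actually run.

```python
import base64
import quopri


def undo_encoding(body, transfer_encoding):
    """Reverse the transport encoding and return the original bytes."""
    # Assumed encoding names; anything unrecognized is passed through.
    enc = transfer_encoding.strip().lower()
    if enc == "base64":
        return base64.b64decode(body)
    if enc == "quoted-printable":
        return quopri.decodestring(body)
    return body


def deliver(body, transfer_encoding, content_type, interpreters):
    """Decode first, then hand the raw content to the interpreter."""
    raw = undo_encoding(body, transfer_encoding)
    # The interpreter is chosen by content type alone and never learns
    # how the body was encoded for transport.
    handler = interpreters[content_type]
    return handler(raw)
```

Note that the interpreter functions take only the decoded bytes. That is
the separability being claimed: the encoding step can be done once, on
receipt, and forgotten.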