Re: Character-set header (was Re: Minutes of the Atlanta 822ext meeting)

1991-08-29 08:42:54
Like Einar and Mark, I'm very sorry to have missed the Atlanta meeting,
where it seems (to me) that negative progress was made.  I'm delighted
that we appear to be converging on a prohibition against nested
encodings.  Now I'd like to add my voice to those who are unhappy with
the separate Character-set header.

The biggest problem I have with this header is that it will, much of the
time, be meaningless.  I have a problem with headers that are defined
because their meaning is semantically crucial to the message, but which
have no rational application.  Let's consider the 9 message types
defined by RFC-XXXX:

text -- character set info is meaningful, either as a subtype OR using

message -- character set info is potentially meaningful, but
problematic.  For a while, an RFC-XXXX draft had character sets as
subtypes of message, too, but this turned out to make the interpretation
of the message type (particularly the headers) VERY problematic, and
was dropped.  The problems all reappear with the introduction of

text-plus -- character set info is potentially meaningful, but
problematic because many (most?) rich text formats already provide their
own mechanism for encoding multiple character sets anyway.  If you have
a text-plus format that has its own mechanisms for specifying character
sets, is the Character-set header simply ignored?  Ugh.  

binary, application, image, audio, video, multipart -- character set
info is meaningless.

Now, given these facts, what does it mean to a UA if it sees a
character-set header?  It means that you have to look at the
Content-type header.  If that happens to be "text", the meaning is
obvious, but otherwise it is at best confusing and at worst totally
undefined.  But if the Content-type has a critical impact on the
semantics of the character set specification, why shouldn't it be
specified as part of the content-type?  And if only one (or even 2 or 3)
content-type can sensibly have character set information, why not make
that information part of the content-type for that (those) specified

I have yet to hear any clear reason why the "text/char-set" model is
inadequate, and I find the addition of a Character-set header
potentially very confusing.  I would strongly advocate that we return to
using text subtypes to specify character sets.  This is close to being a
showstopper for me, though I'm trying to keep an open mind.
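
To make the contrast concrete, here is a minimal sketch in Python of what
a UA might have to do under each model.  The function names, the example
character set value, and the exact spelling of a "text/char-set" value
are assumptions made for illustration only; nothing here reflects actual
RFC-XXXX syntax.

```python
# Hypothetical sketch: names and syntax are illustrative, not RFC-XXXX.

# Types for which character set info is described above as meaningless.
NO_CHARSET_TYPES = {"binary", "application", "image",
                    "audio", "video", "multipart"}


def charset_from_subtype(content_type):
    """text/char-set model: the character set is the subtype of 'text'."""
    main, _, sub = content_type.strip().lower().partition("/")
    if main == "text" and sub:
        return sub
    return None  # no other type carries a character set in this model


def charset_from_header(content_type, charset_header):
    """Separate-header model: the header cannot be interpreted until
    the Content-type has been consulted."""
    if charset_header is None:
        return None
    main = content_type.strip().lower().partition("/")[0]
    if main == "text":
        return charset_header.strip().lower()
    if main in NO_CHARSET_TYPES:
        return None  # header is present but has no meaning
    # message, text-plus: meaning is unclear, so the sketch refuses to guess
    raise ValueError(f"Character-set header is ambiguous for type {main!r}")
```

In the subtype model there is one place to look and nothing to
cross-check.  In the separate-header model, every branch depends on the
Content-type anyway, and two of the nine types have no defined answer.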

Cheers.  -- Nathaniel