> I agree; I certainly can live with a distinguished syntax for
> character set information as part of the content-type. But I think it is
> very important to realize that this is just syntactic sugar -- what we're
> talking about here is basically a blessing of the concepts Neil and I have
> been wanting all along.

It is a trifle more than that, at least from the perspective of
someone who has had no problem with the concepts. If one creates a
separate header, then rules are needed of the flavor of "ignore the
character set header if the type is 'frob'". What we all know is that,
sooner or later, someone is going to get this wrong and "help the user"
by assuming that having a character set header implies that content-type
'frob' was an error and treating the file as text/character-set=whatever
(temporary notation, Ned, not a syntax), especially if the two headers
appear in some odd order. And, since it is easy to imagine some gateway
adding a new character set header line if none was there already ("just
to clarify the default"), these cases will present themselves.
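
To make the separate-header rule concrete, here is a rough sketch of what
every reader would have to implement. The header name "Character-Set", the
function name, and the list of textual types are all assumptions made up for
illustration; nothing here is proposed syntax.

```python
def effective_charset(headers, text_types=("text", "text-plus")):
    """Return the charset to apply, or None if the charset header must be ignored.

    `headers` is a dict keyed by lower-cased header name, so the order in
    which the lines arrived cannot influence the result.
    """
    ctype = headers.get("content-type", "")
    charset = headers.get("character-set")  # hypothetical header name
    top_level = ctype.split("/", 1)[0].strip().lower()
    if top_level not in text_types:
        # The rule: ignore the charset header. Do NOT "help the user" by
        # deciding that the content-type was a mistake and the body is text.
        return None
    return charset
```

The failure mode described above is the implementation that skips the
`top_level` check, or inverts it.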
If one associates the character set with the type/subtype information
in the same header, I think there are slightly higher odds of getting
people to understand that "broken" really means "broken". And the
likelihood of gratuitous addition of, e.g., "//us-ascii" to something
that already says "Content-type: image/g3fax" seems to me to be pretty
low.
So, it is at least high-quality fattening syntactic sugar, not the
calorie-free variety.

>    Content-Type: text//us-ascii
>    Content-Type: text-plus/TeX//us-ascii

Certainly looks distinguished enough to meet my needs, as long as
there is a clearly-written syntax/parsing rule in RFC-XXXX that says
"there is no such thing as a null field".
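
One way such a rule might read in code, as a sketch only: treat "//" as a
single delimiter that is looked for first, and reject any empty component.
The function name and the returned tuple are assumptions for illustration,
and the notation is the provisional one from this thread, not settled
syntax.

```python
def parse_content_type(value):
    """Parse 'type[/subtype][//charset]' and refuse null fields.

    Returns (type, subtype or None, charset or None).
    """
    # Look for the distinguished "//" first, so "text//us-ascii" is never
    # read as type "text" followed by an empty subtype.
    main, sep, charset = value.strip().partition("//")
    if sep and not charset:
        raise ValueError("null character set after '//'")
    if "/" in charset:
        raise ValueError("unexpected '/' in character set")

    type_, slash, subtype = main.partition("/")
    if not type_:
        raise ValueError("null type")
    if slash and not subtype:
        raise ValueError("null subtype")
    if "/" in subtype:
        raise ValueError("unexpected '/' in subtype")
    return type_, subtype or None, charset or None
```

For the two examples above this yields ("text", None, "us-ascii") and
("text-plus", "TeX", "us-ascii"), while "text/", "text//" and
"text///us-ascii" are all rejected rather than repaired.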

> We could also insist that the character set appear last, although this is
> by no means mandatory.

Probably a good guideline, but, intuitively, something that could get
us into trouble if we encouraged people to use it as a parsing rule.
--john