One of the things I'd like to do is get rid of
"Content-type: text" which, as Stef has pointed out, is kind of
ambiguous. Neither Stef nor I, however, are sure what the right
replacement would be. Here are some possibilities:
Although "text" may sound ambiguous, the contents should be human readable.
I would like to suggest a slightly different approach. Warning: it may
be controversial. The idea is to create a new header (e.g. "Codeset:")
to identify the codeset used in the contents.
Why? I have an nroff document that uses tbl and mm macros, but it contains
ISO-8859-1 characters. According to RFC 1148, the content-type becomes:

    Content-Type: nroff; null; tbl, ms

There is no place to identify the character set/codeset. If a new header
is created, I can specify it as:

    Content-Type: nroff; null; tbl, ms
    Codeset: ISO-8859-1 (or whatever convention we define)
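
To make the idea concrete, here is a rough sketch of how a reader might
use such a header. It assumes the hypothetical "Codeset:" header suggested
above (it is not defined anywhere), and the fallback to US-ASCII when the
header is absent is my own assumption. The function names are made up for
illustration.

```python
# Sketch only: "Codeset" is the hypothetical header proposed above,
# and the US-ASCII default is an assumption, not part of any standard.

def parse_headers(lines):
    """Parse simple 'Name: value' header lines into a dict (no folding)."""
    headers = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def decode_body(headers, raw_body):
    """Decode the body bytes using the codeset named in the header."""
    codeset = headers.get("codeset", "US-ASCII")
    return raw_body.decode(codeset)


headers = parse_headers([
    "Content-Type: nroff; null; tbl, ms",
    "Codeset: ISO-8859-1",
])
# The byte 0xE9 is an 8-bit ISO-8859-1 character inside the nroff source.
text = decode_body(headers, b".TS\ncaf\xe9\n.TE\n")
```

The point is that Content-Type stays untouched and the codeset is looked
up independently of it.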
Note that this is merely a suggestion. It is controversial because some
mailers don't care whether an nroff document contains non-7-bit-ASCII
characters, as long as Content-Encoding does the right thing. If you feel
that it is too controversial, I will not mention it again.
Some people may feel that Content-Type should convey the semantic
meaning, but not the actual implementation. I am not saying that this is
the right way, but it brings up an interesting question. Should we
recommend a format for type names, and should it be a hierarchical format
(such as "company"-"type") or just a flat type space? For example, in a
flat type space, if company A has implemented a voice data type and
registered it as "voice", then when company B implements its own voice
data type, it must register the type as anything other than "voice".
In a hierarchical format, the names would be "A-voice" and "B-voice". In a
more drastic approach, there would be only *one* type "voice" registered,
and a different field would identify the implementation. Any comments on
this?
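
The three naming options above can be sketched roughly as follows. The
class and function names here are hypothetical and only illustrate the
trade-off; nothing below reflects an actual registration procedure.

```python
# Sketch only: the registry and helper names below are hypothetical.

class FlatRegistry:
    """Flat type space: each name can be registered exactly once."""

    def __init__(self):
        self.owners = {}

    def register(self, name, company):
        if name in self.owners:
            raise ValueError(
                f"{name!r} already registered by {self.owners[name]}"
            )
        self.owners[name] = company
        return name


def hierarchical_name(company, type_name):
    """Hierarchical format: prefix the type with the company name."""
    return f"{company}-{type_name}"


def single_type(type_name, implementation):
    """Drastic approach: one registered type plus a separate field."""
    return {"type": type_name, "implementation": implementation}


flat = FlatRegistry()
flat.register("voice", "A")
try:
    flat.register("voice", "B")   # company B cannot reuse "voice"
    collision = False
except ValueError:
    collision = True
```

In the flat case company B is forced to pick another name, while the
other two options avoid the collision in different places: in the name
itself, or in a separate field.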
-Vincent