Re: Character set registration

1995-12-18 17:01:55
> One distinction between text/* and application/* which is highly
> important in choosing which one to use, is that an unknown subtype
> of text/* can be displayed as if it were text/plain.

The issue isn't whether an unknown subtype of text/* can be displayed
as if it were text/plain; it is whether an unknown charset can always
be treated as if it agreed with US-ASCII.

Personally, I prefer the model that suggests that text/* media types
are those that are logically considered as a sequence of character
objects and represented by a sequence of octets using a 'charset'
encoding, and that the restriction on 'charset' encodings is similar
to the restriction on transfer-encodings: don't send unknown
transfer-encodings to unsuspecting recipients.

Treating the 'charset' as a (nested) transfer-encoding has a lot of
advantages. Even for text/plain Unicode, you might choose to use
base64 transfer-encoding with charset=unicode-1-1, quoted-printable
transfer-encoding with charset=unicode-1-1-utf8, or no encoding with
charset=unicode-1-1-utf7. In each case the result would be the same
sequence of _characters_, but with increasing legibility (for ASCII
text, at least) when dealing with user agents that don't actually
understand Unicode.
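
As a rough illustration of the three combinations above, here is a
minimal Python sketch. It rests on assumptions: Python has no codecs
named after these charset labels, so UTF-16BE, UTF-8 and UTF-7 stand
in for unicode-1-1, unicode-1-1-utf8 and unicode-1-1-utf7, and the
sample text and all variable names are hypothetical.

```python
import base64
import quopri

# Hypothetical sample: mostly ASCII text plus one non-ASCII character.
text = "Hello, café"

# Assumption: UTF-16BE, UTF-8 and UTF-7 stand in for the charset labels
# unicode-1-1, unicode-1-1-utf8 and unicode-1-1-utf7 respectively.
wire_base64 = base64.b64encode(text.encode("utf-16-be"))  # charset, then base64
wire_qp = quopri.encodestring(text.encode("utf-8"))       # charset, then quoted-printable
wire_utf7 = text.encode("utf-7")                          # charset only, no transfer-encoding


def decode_all():
    """Undo the transfer-encoding (if any), then the charset, for each form."""
    return [
        base64.b64decode(wire_base64).decode("utf-16-be"),
        quopri.decodestring(wire_qp).decode("utf-8"),
        wire_utf7.decode("utf-7"),
    ]


# What a user agent that understands none of this would put on screen:
# every wire form is pure 7-bit ASCII, but only the last two keep the
# ASCII part of the text readable.
for label, wire in [
    ("unicode-1-1 + base64", wire_base64),
    ("unicode-1-1-utf8 + quoted-printable", wire_qp),
    ("unicode-1-1-utf7 + no encoding", wire_utf7),
]:
    print(f"{label}: {wire.decode('ascii')}")
```

All three wire forms decode to the same characters; only the raw
octets, and hence what an unaware user agent displays, differ.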