Re: Proposals for 10646/Unicode in MIME

Masataka Ohta writes:

> I'm afraid that one of his proposal, ISO-10646-UTF7, is effectively of new
> CTE rather than a new charset.

It does look that way, although I believe it would fit into other text
content types, text/enriched in particular, much better than a CTE-based
solution would.
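
To make the "looks like a CTE" point concrete, here is a rough sketch.  It
assumes that Python's built-in "utf-7" codec (which follows the later
RFC 2152 definition) is close enough to the ISO-10646-UTF7 proposal to
illustrate the idea; the sample text is my own.

```python
# UTF-7 output is pure 7-bit ASCII, which is why it resembles a
# content-transfer-encoding as much as a charset.
text = "Unicode: 日本語"

utf7_bytes = text.encode("utf-7")

# Every byte fits in 7 bits, so the body can travel as "7bit" with no
# additional transfer encoding such as base64 or quoted-printable.
is_seven_bit = all(b < 0x80 for b in utf7_bytes)

# Decoding recovers the original text.
round_trip = utf7_bytes.decode("utf-7")

print(utf7_bytes)
print(is_seven_bit, round_trip == text)
```

The non-ASCII run is carried in a "+...-" shifted block of modified base64,
while the ASCII part stays readable as is.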

> Of course, because of Han unification, UNICODE is not a charset of MIME.

Of course, because of Masataka's continued objection to UNICODE, no one
has been game to register it as a MIME charset. :-)

I note here that Masataka's proposal for ISO-2022-JP-2 demonstrates what
we've been arguing all along: it is not enough to just have a character
encoding.  There also needs to be some form of markup to distinguish
different usages of the same character encoding.  ISO-2022-JP-2 uses
escape sequences to do markup, whereas a UNICODE version of text/enriched
would use <...> tags.  The main difference I can see is that ISO-2022-JP-2
requires the use of markup, even when the whole message is in the same
language, but UNICODE can get away without markup for 99% of messages,
letting local conventions set the default language.
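
A small sketch of the difference, assuming Python's "iso2022_jp_2" codec
as a stand-in for the ISO-2022-JP-2 proposal and UTF-8 as a stand-in for
whatever UNICODE encoding ends up being used (neither choice comes from
the proposals themselves):

```python
ESC = b"\x1b"
message = "日本語"  # a message written entirely in one language

# ISO-2022-JP-2 has to switch character sets with escape sequences,
# even though the whole message is Japanese.
iso2022 = message.encode("iso2022_jp_2")

# A UNICODE encoding carries the same text with no escape sequences.
as_utf8 = message.encode("utf-8")

print(iso2022)              # begins with ESC $ B, ends with ESC ( B
print(ESC in iso2022)       # True
print(ESC in as_utf8)       # False
```

The escape sequences are the in-band "markup" here: ESC $ B designates
JIS X 0208 and ESC ( B switches back to ASCII at the end.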

I still fail to see why Masataka objects to UNICODE, since his own proposal
has to jump through the same markup hoops.  The only advantage of
ISO-2022-JP-2 that I can see is that it will work on existing terminals
without special software in some communities.  That is a specious argument
at best, since the rest of the world needs special software to view
ISO-2022-JP-2 anyway.
UNICODE has the advantage that if a message gets corrupted and the markup
is lost, there is still a reasonable character that can be displayed, which
is close enough not to cause the sky to fall in on the reader.  Such corruption
could easily happen when a message is quoted.  What happens with ISO-2022-JP-2?
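
As a rough illustration of that question, the sketch below strips the
escape sequences from an ISO-2022-JP-2 message, which is my assumption
about what such corruption would look like.  It again uses Python's
"iso2022_jp_2" codec as a stand-in for the proposal; the regular
expression and variable names are hypothetical.

```python
import re

original = "日本語"
encoded = original.encode("iso2022_jp_2")

# An ISO 2022 escape sequence is ESC, optional intermediate bytes
# (0x20-0x2F), then one final byte (0x30-0x7E).
ESCAPE_SEQUENCE = re.compile(rb"\x1b[\x20-\x2f]*[\x30-\x7e]")

# Simulate a gateway or quoting step that drops the escapes.
damaged = ESCAPE_SEQUENCE.sub(b"", encoded)

# With no escapes left, the bytes are read in the default ASCII state.
shown = damaged.decode("ascii")

print(repr(shown))
```

What is left decodes as unrelated ASCII characters rather than anything
close to the original text.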

People have tried time and again to add markup to UNICODE to satisfy Masataka
(e.g. language tags), but it just doesn't seem to satisfy him. *sigh*

Cheers,

Rhys.