Masataka Ohta writes:
I'm afraid that one of his proposals, ISO-10646-UTF7, is effectively a new
CTE rather than a new charset.
It does look that way, although I believe it would fit into other text
content types, in particular text/enriched, much better than a CTE-based
solution would.
Of course, because of Han unification, UNICODE is not a charset of MIME.
Of course, because of Masataka's continued objection to UNICODE, no one
has been game to register it as a MIME charset. :-)
I note here that Masataka's proposal for ISO-2022-JP-2 demonstrates what
we've been arguing all along: it is not enough to just have a character
encoding. There also needs to be some form of markup to distinguish
different usages of the same character encoding. ISO-2022-JP-2 uses
escape sequences to do markup, whereas a UNICODE version of text/enriched
would use <...> tags. The main difference I can see is that ISO-2022-JP-2
requires the use of markup, even when the whole message is in the same
language, but UNICODE can get away without markup for 99% of messages,
letting local conventions set the default language.
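To make the escape-sequence point concrete, here is a rough Python sketch.
It assumes Python's built-in iso2022_jp_2 codec follows the RFC 1554
designations (ESC $ B for JIS X 0208, ESC $ A for GB 2312); the helper name
seven_bit is made up for this illustration. It builds two byte streams for
the same unified Han character, where only the escape sequence says which
national set was meant.

```python
# Designation escape sequences as listed in RFC 1554 (assumed here).
ESC_JIS_X_0208 = b"\x1b$B"   # switch to JIS X 0208 (Japanese)
ESC_GB_2312 = b"\x1b$A"      # switch to GB 2312 (Chinese)
ESC_ASCII = b"\x1b(B"        # switch back to ASCII


def seven_bit(euc_bytes: bytes) -> bytes:
    """Hypothetical helper: clear the high bit of EUC bytes to get 7-bit codes."""
    return bytes(b & 0x7F for b in euc_bytes)


han = "\u4e2d"  # a Han character present in both national sets

as_japanese = ESC_JIS_X_0208 + seven_bit(han.encode("euc_jp")) + ESC_ASCII
as_chinese = ESC_GB_2312 + seven_bit(han.encode("gb2312")) + ESC_ASCII

# Different bytes on the wire, same character once decoded.
print(as_japanese, as_chinese)
print(as_japanese.decode("iso2022_jp_2") == as_chinese.decode("iso2022_jp_2"))

# Even a message written entirely in Japanese carries escape sequences.
japanese_only = "日本語".encode("iso2022_jp_2")
print(b"\x1b" in japanese_only)
```

In other words, the escape sequence is doing the job that a language tag
would do in a UNICODE version of text/enriched.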
I still fail to see why Masataka objects to UNICODE since his own proposal has
to jump through the same markup hoops. The only advantage of ISO-2022-JP-2
that I can see is that it will work on existing terminals without special
software in some communities. A specious argument at best, since the rest
of the world does need special software to view ISO-2022-JP-2 anyway.
UNICODE has the advantage that if a message gets corrupted and the markup
is lost, there is still a reasonable character that can be displayed, which
is close enough not to cause the sky to fall in on the reader. Such corruption
could easily happen when a message is quoted. What happens with ISO-2022-JP-2?
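One plausible answer, as a sketch only: the Python below simulates a tool
that drops escape sequences, again assuming Python's iso2022_jp_2 codec.
The function name strip_escapes and the tag name x-lang-ja are invented for
the example and are not part of any proposal.

```python
import re

# ESC, optional intermediate bytes (0x20-0x2F), then one final byte.
ESCAPE_SEQUENCE = re.compile(rb"\x1b[\x20-\x2f]*[\x30-\x7e]")


def strip_escapes(data: bytes) -> bytes:
    """Simulate a gateway or quoting tool that loses ISO 2022 escapes."""
    return ESCAPE_SEQUENCE.sub(b"", data)


original = "日本語"
damaged = strip_escapes(original.encode("iso2022_jp_2"))

# With the escapes gone, the two-byte codes are read as plain ASCII.
garbled = damaged.decode("iso2022_jp_2")
print(garbled)

# Hypothetical tag name; losing the tags leaves the characters intact.
tagged = "<x-lang-ja>日本語</x-lang-ja>"
untagged = re.sub(r"<[^>]*>", "", tagged)
print(untagged == original)
```

Under these assumptions the ISO-2022-JP-2 text turns into unrelated ASCII
characters, while the tagged text only loses its language hint.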
People have tried time and again to add markup to UNICODE to address
Masataka's objections (e.g. language tags), but nothing seems to satisfy
him. *sigh*
Cheers,
Rhys.