[re: UTF-2 transition]
But I can't defend that position-- not only are
there the oft-repeated issues of US-bias, Euro-bias, or
Roman-character-bias, but, as the network becomes heavily used in Asia,
it seems to doom us to a second transition, presumably to unencoded
10646.
John, when you say "unencoded 10646", do you mean the 16-bit form or
the 32-bit form? (I have no idea whether the other 16 bits will ever
be used, but I'm just wondering which one you were referring to.)
Also, I don't understand why we would be doomed to a second transition
to unencoded 10646. Are you assuming that there will be heavy traffic
in non-ASCII text between, say, Japan and the US? If there isn't much
of this kind of traffic, surely people could put up with the
occasional Japanese character, encoded in lengthy UTF-2, flying across
(or under) the Pacific.
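To put rough numbers on "lengthy", here is a minimal sketch that counts the bytes needed for a single kanji under each scheme discussed above. It rests on assumptions: Python's "utf-8" codec stands in for UTF-2 (for 16-bit characters the two should give the same byte sequences), "utf-16-be" and "utf-32-be" stand in for the 16-bit and 32-bit forms of unencoded 10646, and the name encoded_sizes is made up for this illustration.

```python
# Byte counts for one Japanese character under several encodings.
# Assumptions: "utf-8" stands in for UTF-2, "utf-16-be" for the 16-bit
# form of 10646, and "utf-32-be" for the 32-bit form.
SAMPLE = "\u65e5"  # a single kanji, written as an escape to keep this file ASCII


def encoded_sizes(text):
    """Return the encoded length in bytes of `text` for each scheme."""
    return {
        "utf2": len(text.encode("utf-8")),
        "ucs2": len(text.encode("utf-16-be")),
        "ucs4": len(text.encode("utf-32-be")),
        # ISO-2022-JP adds escape sequences to switch into and out of kanji,
        # so the cost per character drops as the Japanese run gets longer.
        "iso2022jp": len(text.encode("iso2022_jp")),
    }


if __name__ == "__main__":
    for name, size in encoded_sizes(SAMPLE).items():
        print(name, size)
```

For an isolated character the UTF-2 form costs one byte more than the 16-bit form and one byte less than the 32-bit form, which is the overhead in question for the occasional Japanese character.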
Also, it's not clear to me whether the Japanese will switch from the
current 2022-based scheme to something else for their Japanese
messages. Similarly, it's not clear whether the Americans will switch
from ASCII to something else for English messages. Do you think the
encoding at the SMTP level will ever change for the usual English
text (that is, ignoring for the moment words like na"ive)?
Cheers,
Erik