Bob Smart (Tue, 29 Oct 91 18:16:51 +1100):
> Now a problem with ISO-2022 (which 10646/ATM/AUC seems determined
> to share) is that the default meaning of octets before any escape
> sequence is undefined. ...

UTF/10646 does not use "shifting" between character sets. There is
one unique 1-5 octet sequence for each codepoint.
In particular, even though 8859/1 maps to the first 256 code points,
the representation of codes A0-FF is *not* a single-octet A0 through
FF, but the 2-octet A0 A0 through A0 FF.
(This also nicely solves the problem of a line containing only a
registered trademark sign, AE, being mistaken by a 7-bit mailer for a
lone period, 2E. AE is sent in UTF as A0 AE.)
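
For anyone who wants to check the octets, here is a rough sketch in Python
of the mapping described above, limited to the first 256 code points. The
function names are made up for illustration, and it assumes that codes 00-9F
go out as a single unchanged octet; the 3- to 5-octet forms are not covered.

```python
def utf_encode_first_256(cp: int) -> bytes:
    """Hypothetical helper: encode a code point in 00-FF as described above."""
    if not 0 <= cp <= 0xFF:
        raise ValueError("this sketch covers only the first 256 code points")
    if cp < 0xA0:
        return bytes([cp])        # assumed: 00-9F is sent as one octet
    return bytes([0xA0, cp])      # A0-FF is sent as A0 followed by the code


def strip_high_bit(data: bytes) -> bytes:
    """Simulate a 7-bit mailer that clears the top bit of every octet."""
    return bytes(b & 0x7F for b in data)


raw_8859_1 = bytes([0xAE])                  # registered sign as raw 8859/1
utf_form = utf_encode_first_256(0xAE)       # same character in UTF

print(raw_8859_1.hex(), "->", strip_high_bit(raw_8859_1).hex())  # ae -> 2e
print(utf_form.hex(), "->", strip_high_bit(utf_form).hex())      # a0ae -> 202e
```

After stripping, the raw 8859/1 octet becomes a bare 2E, while the UTF form
becomes 20 2E (space, period), so the line is no longer a period on its own.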
-drb