Dave Crocker wrote:
[...] (No, folks, please don't correct me if it
is any other number over 16.)
It's now 0x10FFFF.
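As a quick check of what that limit means in bits, here is a minimal Python sketch. The variable names are mine and purely illustrative:

```python
# Highest code point in the Unicode code space.
MAX_CODE_POINT = 0x10FFFF

# Number of bits needed to represent that value.
bits_needed = MAX_CODE_POINT.bit_length()

# Total number of code points, i.e. 17 planes of 65536 each.
total_code_points = MAX_CODE_POINT + 1

print(bits_needed)        # 21
print(total_code_points)  # 1114112
```

So the "number over 16" is 21 bits.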
Cramming those bits into a 7-bit environment is just one more cramming
effort. It stands equal to the others as an alternative that has benefits
and detriments.
No, because the only standardised 7-bit encoding for Unicode is UTF-7,
which has so many drawbacks that it is strongly deprecated.
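To make the comparison concrete, here is a small sketch in Python, assuming the standard library's `utf-7` and `utf-8` codecs. The sample string and the helper `is_seven_bit` are illustrative only, not taken from any specification:

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte fits in 7 bits (value below 0x80)."""
    return all(b < 0x80 for b in data)


# Illustrative sample containing two non-ASCII characters.
sample = "naïve café"

utf7_bytes = sample.encode("utf-7")
utf8_bytes = sample.encode("utf-8")

# UTF-7 output stays within 7 bits; UTF-8 uses the high bit for non-ASCII.
print(utf7_bytes, is_seven_bit(utf7_bytes))
print(utf8_bytes, is_seven_bit(utf8_bytes))

# Both encodings decode back to the original text.
print(utf7_bytes.decode("utf-7") == sample)
print(utf8_bytes.decode("utf-8") == sample)
```

The UTF-7 form survives a 7-bit transport unchanged, while the UTF-8 form needs an 8-bit clean path or an additional transfer encoding.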
If you have a text editor, you can view the 7-bit encoding. The fact that
it is "ugly", therefore is actually a feature, not a bug.
Don't worry, the "pretty" aspect of UTF-8 is not about that; it is simply
that UTF-8 is already implemented in almost every client able to support
more than one encoding (far more clients than would support UTF-16).
If it were not, a new seven-bit encoding would be better.