Bruce Lilly wrote:
> Andrew Gierth wrote:
>> since there have been no substantive changes made to UTF-8 _ever_
>> since its adoption as any sort of standard, why would you expect
>> future changes to introduce incompatibilities?
> Bad premise; every time Unicode changes substantively, the UTF-8
> specification necessarily also changes, since it is a transformation
> between Unicode and an octet stream.
Why would you expect Unicode to change substantively?
The number of characters used for human communication doesn't seem to be
rising much, and there's plenty of space left in the current
specification. IIRC Unicode still assigns fewer than 200,000 of the
1,114,112 possible code points (U+0000 through U+10FFFF).
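Those figures are easy to sanity-check with Python's `unicodedata` module. A quick sketch (the exact count depends on the Unicode version your interpreter ships with; this counts everything except unassigned, surrogate, and private-use code points):

```python
import unicodedata

TOTAL = 0x110000  # code points U+0000..U+10FFFF = 1,114,112

# Count code points actually assigned to characters, excluding
# unassigned (Cn), surrogates (Cs), and private use (Co).
assigned = sum(
    1
    for cp in range(TOTAL)
    if unicodedata.category(chr(cp)) not in ("Cn", "Cs", "Co")
)

print(f"{assigned:,} of {TOTAL:,} ({assigned / TOTAL:.1%}) assigned")
```

On current Unicode versions this still comes out well under 200,000, i.e. under the 20% mark.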
I suppose you could argue that Unicode adds alphabets. But do you think
Unicode still hasn't reached the 20% mark?
> When the Unicode consortium decides to include chicken scratching as
> "characters" and extends the maximum width to beyond 32 bits, even
> the 5- and 6-byte sequences (which exist in some "utf-8"
> specifications on the octet-stream side, but not in others) will have
> to change.
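For reference, the 5- and 6-byte forms come from the original UTF-8 definition (RFC 2279), which could encode up to 31 bits; RFC 3629 later dropped them, capping sequences at 4 bytes. A sketch of the range each sequence length covers:

```python
# Largest code point an n-byte sequence can carry under the original
# UTF-8 scheme (RFC 2279): the lead byte of an n-byte sequence keeps
# 7 - n payload bits (for n >= 2), and each continuation byte adds 6.
def max_codepoint(n_bytes: int) -> int:
    if n_bytes == 1:
        return 0x7F  # ASCII range
    payload_bits = (7 - n_bytes) + 6 * (n_bytes - 1)
    return (1 << payload_bits) - 1

for n in range(1, 7):
    print(f"{n} byte(s): up to U+{max_codepoint(n):X}")
```

Even the 6-byte form tops out at 0x7FFFFFFF, i.e. 31 bits, so going "beyond 32 bits" would indeed require redesigning the encoding, not just re-enabling the long sequences.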
"Will have to change"... There's a little more than 5,000 languages on
the globe. If every one of them were to invent its own kanji-like
writing system with about 100,000 characters, that still wouldn't fill
32 bits.
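The back-of-the-envelope arithmetic above, spelled out:

```python
languages = 5_000            # rough count of living languages
chars_per_script = 100_000   # a CJK-scale repertoire for each one
needed = languages * chars_per_script

print(f"{needed:,} code points needed")  # 500,000,000
print(needed < 2**31)  # True: fits even in a signed 31-bit space
```

Half a billion code points would of course dwarf today's 1,114,112-point code space, but it sits comfortably inside 32 bits, and even inside the 31 bits the original 6-byte UTF-8 form could reach.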
--Arnt