ietf-822

Re: UTF-8 versions (was: Re: RFC 2047 and gatewaying)

2003-01-12 09:00:35

Bruce Lilly wrote:
Andrew Gierth wrote:
since there have been no substantive changes made to UTF-8 _ever_ since its adoption as any sort of standard, why would you expect future changes to introduce incompatibilities?

Bad premise; every time Unicode changes substantively, the utf-8 specification necessarily also changes since it is a Unicode to/from octet stream transformation.

Why would you expect Unicode to change substantively?

The number of characters used for human communication doesn't seem to be rising much, and there's plenty of space left in the current specification. IIRC Unicode still uses fewer than 200,000 of the million-odd possible code points.

I suppose you could argue that Unicode adds alphabets. But do you think Unicode still hasn't reached the 20% mark?
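
To put a number on that, here is a rough sketch of the arithmetic. It assumes the code space runs from U+0000 to U+10FFFF, and it takes the 200,000 figure above as a from-memory upper bound rather than a verified count of assigned characters.

```python
# Size of the Unicode code space: 17 planes of 65,536 code points
# (U+0000 through U+10FFFF).
CODE_SPACE = 0x10FFFF + 1

# Upper bound quoted from memory in the message above; an assumption,
# not a verified count of assigned characters.
ASSUMED_USED = 200_000

fraction_used = ASSUMED_USED / CODE_SPACE
print(f"{CODE_SPACE:,} code points, at most {fraction_used:.1%} in use")
```

Under those assumptions the used share stays below 20%.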

When the Unicode consortium decides to include chicken scratching as "characters" and extends the maximum width to beyond 32 bits, even the 5- and 6-byte sequences (which exist in some "utf-8" specifications on the octet-stream side, but not in others) will have to change.

"Will have to change"... There's a little more than 5,000 languages on the globe. If every one of them were to invent its own kanji-like writing system with about 100,000 characters, that still wouldn't fill 32 bits.

--Arnt
