ietf-822

Re: UTF-8 versions (was: Re: RFC 2047 and gatewaying)

2003-01-12 09:52:38

Arnt Gulbrandsen wrote:

Why would you expect Unicode to change substantively?

The 3.0->3.1 experience. A.k.a. "once burned, twice shy".

The number of characters used for human communication doesn't seem to be rising much, and there's plenty of space left in the current specification. IIRC Unicode still uses less than 200,000 of the million-odd possible code points.

Famous last words.  From my handy dead-tree copy of Unicode 2.0, page
2-4, under the "Full Encoding" heading:

"There are over 18,000 unassigned code positions that are available for
future allocation. This number far exceeds anticipated character encoding
requirements for all world characters and symbols."

Cough, cough.  It is nearly a universal truth that things tend to expand
to fill the available space (and/or time).  Why do you (apparently) think
that Unicode is exempt?

I suppose you could argue that Unicode adds alphabets. But do you think Unicode still hasn't reached the 20% mark?
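
For reference, the arithmetic behind the "million-odd" and "20% mark" figures can be checked with a short sketch. It assumes the current code space of 17 planes of 65,536 code points each (U+0000 through U+10FFFF) and takes the 200,000 figure from the message above as a rough upper bound, not an exact count of assigned characters. The function names are illustrative only.

```python
# Size of the Unicode code space: 17 planes of 65,536 code points each,
# i.e. U+0000 through U+10FFFF.
PLANES = 17
POINTS_PER_PLANE = 0x10000
TOTAL_CODE_POINTS = PLANES * POINTS_PER_PLANE  # 1,114,112


def fraction_used(assigned: int) -> float:
    """Return the share of the code space taken by `assigned` code points."""
    return assigned / TOTAL_CODE_POINTS


def threshold(percent: float) -> int:
    """Return how many code points correspond to `percent` of the code space."""
    return int(TOTAL_CODE_POINTS * percent / 100)


if __name__ == "__main__":
    rough_assigned = 200_000  # upper bound quoted in the thread, not an exact count
    print(f"total code points:  {TOTAL_CODE_POINTS:,}")
    print(f"20% of code space:  {threshold(20):,}")
    print(f"share at 200,000:   {fraction_used(rough_assigned):.1%}")
```

Under these assumptions, anything below 200,000 assigned code points is indeed under the 20% mark, which sits at roughly 222,800 code points.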

They add more than "alphabets", and that's part of the problem. Again
quoting Unicode 2.0 (page 1-3 this time):

"Graphologies unrelated to text, such as musical and dance notations, are
outside the scope of the Unicode Standard."

Yeah, right. What did the Unicode Consortium add in 3.1? -- musical notation.
So, as I said, I will not (anymore) be surprised when the Unicode
Consortium adds (literally) chicken scratching, probably shortly after
they add (if they haven't already) dance notation (human and otherwise),
every possible organic chemistry carbon ring symbol, Feynman diagrams,
CAD/CAM symbols, traffic sign symbols, trademarks, logos, etc. ad nauseam.
N.B. no smiley. [that reminds me; they'll probably add all of the smiley
variants too (if they haven't already)]

I was once a Unicode fan -- the principles stated in the above quotations
from earlier Unicode Standards, and in "The Ten Unicode Design Principles"
that also appeared in those standards, were sound.  Unfortunately the
Unicode Consortium appears to have abandoned many of those principles.

