Ben Finney <ben plus ietf at benfinney dot id dot au> wrote:

> The issue remains that the informational RFC presents useful mnemonics
> for many characters, and there doesn't appear to be such a thing from
> Unicode or ISO. That's the point of an update to RFC 1345: it serves a
> purpose that I can't see served comparably well elsewhere.

You might not find much enthusiasm in the character-encoding community
for the mnemonics published in RFC 1345, and later as the so-called
"repertoiremap" in ISO/IEC TR 14652. These have been widely criticized
for their incompleteness, (real or perceived) arbitrariness, and lack of
extensibility to scripts not already covered.

Most people will agree that "a plus apostrophe" makes a handy mnemonic
for "a with acute," and "c plus comma" works well for "c with cedilla,"
but the system tends to break down rather quickly after that, with Greek
letters identified by an asterisk, Cyrillic by an equal sign, Hebrew by
a capital letter and plus sign, Arabic by a small letter and plus sign,
etc. There are numerous exceptions to these guidelines, especially when
the letters in question don't map cleanly to Basic Latin, and a large
number of non-ideographic characters have no mnemonic at all, even some
that were defined in ISO 10646 at the time RFC 1345 was published.
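
To make the prefix conventions above concrete, here is a minimal sketch in Python. It holds only six mnemonics, one per convention mentioned, and is not a copy of the RFC 1345 table. The names `MNEMONICS` and `expand` are hypothetical, introduced only for this illustration.

```python
import unicodedata

# Illustrative subset only: one mnemonic per convention described above.
# The table name and its contents are assumptions made for this sketch.
MNEMONICS = {
    "a'": 0x00E1,  # letter + apostrophe: a with acute
    "c,": 0x00E7,  # letter + comma: c with cedilla
    "a*": 0x03B1,  # asterisk marks Greek
    "a=": 0x0430,  # equal sign marks Cyrillic
    "A+": 0x05D0,  # capital letter + plus sign marks Hebrew
    "a+": 0x0627,  # small letter + plus sign marks Arabic
}


def expand(mnemonic):
    """Return the character for a mnemonic, or None if none is defined."""
    code_point = MNEMONICS.get(mnemonic)
    return None if code_point is None else chr(code_point)


if __name__ == "__main__":
    for mnemonic in MNEMONICS:
        ch = expand(mnemonic)
        print(f"{mnemonic}  U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

A lookup for a character outside the table simply returns None, which is the situation described above for the many characters that have no mnemonic at all.
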
That is why you are unlikely to find an update to RFC 1345 that brings
the mnemonics up to date with 10646/Unicode: the task is almost
impossible, given the limitations of the system.

The motivation for inventing these mnemonics seems to have been to
specify characters "in a coded character set independent way," which was
perhaps a sensible goal in 1992 when the Universal Character Set was
quite a bit less universal. Today, however, virtually all non-10646
character sets are mapped to 10646 code points, not to alphabetic
mnemonics. Almost any character that can be found in a national or
industry charset can be found in 10646. The need for a notation
independent of 10646 has passed.
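
As a small illustration of charsets being mapped to 10646 code points, the following Python sketch decodes one byte from each of three legacy charsets using the codecs that ship with Python. The helper name `code_points` and the choice of sample bytes are assumptions made for this example.

```python
# Each legacy charset assigns its own meaning to a byte value; decoding
# maps that byte to the corresponding ISO 10646 / Unicode code point.
SAMPLES = [
    ("iso-8859-1", b"\xe1"),  # Latin small a with acute
    ("iso-8859-7", b"\xe1"),  # Greek small alpha
    ("koi8-r", b"\xc1"),      # Cyrillic small a
]


def code_points(charset, data):
    """Decode bytes from a legacy charset and list the resulting code points."""
    return [f"U+{ord(ch):04X}" for ch in data.decode(charset)]


if __name__ == "__main__":
    for charset, data in SAMPLES:
        print(charset, data, code_points(charset, data))
```

The code point itself serves as the charset-independent name for the character, which is the role the mnemonics were originally meant to fill.
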
Most modern operating systems allow the user to change the keyboard
layout (or define one's own) to gain access to frequently used
characters, and many applications and OSes define a special keystroke
(such as Ctrl+Q) that allows entry of any arbitrary character by
Unicode/10646 code point. You might consider one or both of these
approaches as an alternative to using RFC 1345 mnemonics for data entry.
Or, you can go ahead and use the mnemonics as they are, but resign
yourself to the fact that they will probably never be updated.

Speaking only for myself, as always.
--
Doug Ewell · Fullerton, California, USA · RFC 4645 · UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages