Re: RFC 1345 mnemonics table not consistent with Unicode 3.2.0

Ben Finney <ben plus ietf at benfinney dot id dot au> wrote:

The issue remains that the informational RFC presents useful mnemonics for many characters, and there doesn't appear to be such a thing from Unicode or ISO. That's the point of an update to RFC 1345: it serves a purpose that I can't see served comparably well elsewhere.

You might not find much enthusiasm in the character-encoding community for the mnemonics published in RFC 1345, and later as the so-called "repertoiremap" in ISO/IEC TR 14652. These have been widely criticized for their incompleteness, (real or perceived) arbitrariness, and lack of extensibility to scripts not already covered.

Most people will agree that "a plus apostrophe" makes a handy mnemonic for "a with acute," and "c plus comma" works well for "c with cedilla," but the system tends to break down rather quickly after that, with Greek letters identified by an asterisk, Cyrillic by an equal sign, Hebrew by a capital letter and plus sign, Arabic by a small letter and plus sign, etc. There are numerous exceptions to these guidelines, especially when the letters in question don't map cleanly to Basic Latin, and a large number of non-ideographic characters have no mnemonic at all, even some that were defined in ISO 10646 at the time RFC 1345 was published.
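To make the conventions above concrete, here is a minimal sketch of a mnemonic lookup. The four entries are a small illustrative subset chosen to match the examples in the text, not the full RFC 1345 table, and the names MNEMONICS and lookup are hypothetical.

```python
# Illustrative subset only; RFC 1345 defines many more mnemonics.
MNEMONICS = {
    "a'": 0x00E1,  # LATIN SMALL LETTER A WITH ACUTE (letter + apostrophe)
    "c,": 0x00E7,  # LATIN SMALL LETTER C WITH CEDILLA (letter + comma)
    "a*": 0x03B1,  # GREEK SMALL LETTER ALPHA (letter + asterisk)
    "a=": 0x0430,  # CYRILLIC SMALL LETTER A (letter + equal sign)
}


def lookup(mnemonic):
    """Return the character for a mnemonic, or None if the table has no entry."""
    code_point = MNEMONICS.get(mnemonic)
    return chr(code_point) if code_point is not None else None


if __name__ == "__main__":
    for m in ("a'", "c,", "a*", "a=", "zz"):
        print(repr(m), "->", repr(lookup(m)))
```

The None result for an unknown mnemonic is the situation described above: a character with no entry in the table simply cannot be named this way.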

That is why you are unlikely to find an update to RFC 1345 that brings the mnemonics up to date with 10646/Unicode: the task is almost impossible, given the limitations of the system.

The motivation for inventing these mnemonics seems to have been to specify characters "in a coded character set independent way," which was perhaps a sensible goal in 1992 when the Universal Character Set was quite a bit less universal. Today, however, virtually all non-10646 character sets are mapped to 10646 code points, not to alphabetic mnemonics. Almost any character that can be found in a national or industry charset can be found in 10646. The need for a notation independent of 10646 has passed.
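As a rough illustration of charsets being mapped to 10646 code points, the sketch below decodes single bytes from three legacy charsets using Python's built-in codecs and prints the resulting code points. The helper name code_points and the sample bytes are my own choices for the example.

```python
def code_points(raw, charset):
    """Decode bytes from a legacy charset and list the resulting code points."""
    return [f"U+{ord(ch):04X}" for ch in raw.decode(charset)]


# The same byte value means different characters in different charsets,
# but each one resolves to a single 10646/Unicode code point.
samples = [
    (b"\xe1", "iso-8859-1"),  # a with acute
    (b"\xe1", "iso-8859-7"),  # Greek small alpha
    (b"\xc1", "koi8-r"),      # Cyrillic small a
]

if __name__ == "__main__":
    for raw, charset in samples:
        print(charset, raw, code_points(raw, charset))
```

The code point itself serves as the charset-independent name for the character, which is the role the mnemonics were originally meant to fill.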

Most modern operating systems allow the user to change the keyboard layout (or define one's own) to gain access to frequently used characters, and many applications and OS's define a special keystroke (such as Ctrl+Q) that allows entry of any arbitrary character by Unicode/10646 code point. You might consider one or both of these approaches as an alternative to using RFC 1345 mnemonics for data entry. Or, you can go ahead and use the mnemonics as they are, but resign yourself to the fact that they will probably never be updated.


Speaking only for myself, as always.

--
Doug Ewell · Fullerton, California, USA · RFC 4645 · UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages

