Re: RFC 1345 mnemonics table not consistent with Unicode 3.2.0

John C Klensin wrote:

--On Friday, 31 August, 2007 01:00 +0200 Harald Alvestrand
<harald(_at_)alvestrand(_dot_)no> wrote:

Harald, Ben has pointed out one important use for something like
1345, which involves references to characters in programming
languages and command interfaces.  The Unicode names are bad
news for that; I certainly don't want

        characterNamed(SLOBBOVIAN LOWER CASE COMBINATION
        LEFT-HANDED SPANNER)

in those contexts, and that is what Unicode would give me.  Our
current solution to that problem seems to be U+[N[N]]NNNN, which
is pretty unattractive (except when compared to all of the other
alternatives).  On the other hand, one could argue that 1345
inadvertently proves that no shorter set of mnemonics is going
to work across all of Unicode without becoming pretty arbitrary
and discriminatory against scripts not familiar to the creator
as well as difficult to extend.

Two different threads here: one about the idea of mnemonics, the other about this specific document's implementation of it...

Actually I used 1345 mnemonics in a fairly hefty piece of work back in 1995 (draft-alvestrand-lang-char, I think the latest published version was -03). Ten years later, I'm unable to figure out what characters I was trying to point to in some cases; somehow, characters snuck in where "it's obvious that the mnemonic for X has to be *X", but 1345 doesn't provide a definition for "*X". For cases where the correct mnemonic was "+X" and the draft specifies "*X", it's impossible to tell by anything short of character-by-character lookup that I goofed.

Based on that experience in working with 1345, I claim that the idea of a larger set of "mnemonics" than what one can memorize in an hour or two for handling data in a wider character set than the one you're writing in is a Bad Idea. Tried it, didn't work.

In programming language constructs intended to be read and maintained by people who aren't familiar with the script they're maintaining and aren't willing to bother looking up the code every time they use it, "characterNamed(SLOBBOVIAN LOWER CASE COMBINATION LEFT-HANDED SPANNER)" is exactly the right construct, in my opinion; if people can read the script, a UTF-8 environment is a far cleaner solution than any possible mnemonic set.
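
To make the comparison concrete, here is a minimal sketch in Python of the two styles under discussion: lookup by full Unicode name and the U+NNNN form. The helper names character_named and u_plus are made up for illustration (they are not from 1345 or any existing API); only the standard unicodedata module is assumed.

```python
import unicodedata


def character_named(name: str) -> str:
    """Hypothetical stand-in for the characterNamed(...) construct.

    Resolves a full Unicode character name to the character itself.
    Raises KeyError if the name is not a known Unicode name, so a
    misremembered name fails loudly instead of silently pointing at
    the wrong character.
    """
    return unicodedata.lookup(name)


def u_plus(ch: str) -> str:
    """Render a single character in the U+[N[N]]NNNN notation."""
    return f"U+{ord(ch):04X}"


# The verbose name is self-describing to a maintainer who cannot
# read the script; the code point form is compact but opaque.
o_stroke = character_named("LATIN SMALL LETTER O WITH STROKE")
print(u_plus(o_stroke), unicodedata.name(o_stroke))

# Python string literals also accept the name form directly.
same_char = "\N{LATIN SMALL LETTER O WITH STROKE}"
```

An unknown name raises an error at lookup time, which is the property that was missing when "*X" was written where "+X" was meant.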

The second part of my criticism involves the tables in 1345 that claim to show existing character sets and what characters they contain. These tables are defined inconsistently with their base specifications (ISO 646 IRV-NO is a 94-character ISO 2022-based character set, but presented in 1345 as if it were a 128-character one, without explaining what control character set it is matched with to create that set, for instance), and, as Ned says, contain errors.
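
The arithmetic behind the 94 versus 128 point can be sketched as follows, assuming the usual ISO 2022 7-bit layout (32 C0 control positions, SPACE, 94 graphic positions from 2/1 to 7/14, and DELETE). The constant names are illustrative only.

```python
# Assumed ISO 2022 7-bit code layout; names are illustrative.
C0_CONTROLS = range(0x00, 0x20)   # 32 positions filled by a separate control set
SPACE = 0x20                      # position 2/0, not part of a 94-set
GRAPHIC_94 = range(0x21, 0x7F)    # positions 2/1 through 7/14
DELETE = 0x7F                     # position 7/15, not part of a 94-set

graphic_count = len(GRAPHIC_94)
total_positions = len(C0_CONTROLS) + 1 + graphic_count + 1

# A 94-character set on its own says nothing about the other 34
# positions; a table listing all 128 has to name where they come from.
unspecified = total_positions - graphic_count
print(graphic_count, total_positions, unspecified)
```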


Both are good reasons to ignore 1345 as it currently stands, in my opinion.

If anyone wants to resolve the second by creating a revision, feel free. But I don't see how it can help with the first one.


                          Harald


_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf