
Re: Serious problem with nonspacing character mnemonics

1992-01-21 13:41:15
The question then is whether it applies to the character it precedes or the
character it follows. According to Unicode Version 1.0 volume 1, "all such
(non-spacing) follow the base character which they modify" (this is on
page 17 in the second paragraph). However, these non-spacing characters
appear in T.61-8bit, and according to T.61 accented letters are "represented
by a sequence of two bit combinations. The first ... representing a
diacritical mark. The second ... representing a basic Latin character." (see
section 4.1.3.b in T.61)

These two usages are clearly at odds with each other. The only solution I
can see is to define two sets of non-spacing mnemonics so that the cases
where they precede and the cases where they follow the character they modify
can be distinguished. Note that there probably won't be equivalents for
these additional mnemonics in Unicode and possibly not in 10646.

Well, 10646 says it covers all ECMA registered character sets,
and I think that UNICODE does the same.
So what should be read into this?
My understanding is that the 10646 character that combines
*after* the letter and the T.61 non-spacing diacritic that
comes *before* the letter are actually the same character;
how to interpret it depends on the character set.
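
To make the ordering difference concrete, here is a minimal Python sketch
of converting a "diacritic before letter" sequence into a "diacritic after
letter" sequence. It rests on assumptions that are not part of RFC-CHAR or
T.61: the function name prefix_to_postfix is made up for illustration, and
the T.61 diacritics are modelled with Unicode combining characters rather
than with the real T.61 bit combinations.

```python
import unicodedata


def prefix_to_postfix(text):
    """Move each non-spacing mark from before its base letter to after it.

    Assumption: the input models the T.61 convention, with every
    non-spacing mark placed immediately before the letter it modifies.
    """
    out = []
    pending = []  # marks seen but not yet attached to a base letter
    for ch in text:
        if unicodedata.combining(ch):
            # Non-spacing mark: hold it until the base letter arrives.
            pending.append(ch)
        else:
            # Base letter: emit it, then the marks that preceded it.
            out.append(ch)
            out.extend(pending)
            pending = []
    # Marks with no following base letter are kept at the end.
    out.extend(pending)
    return "".join(out)


# U+0301 is COMBINING ACUTE ACCENT.
t61_style = "\u0301e"                  # accent first, then the letter
unicode_style = prefix_to_postfix(t61_style)
print(unicode_style == "e\u0301")      # letter first, then the accent
```

The same code point appears in both strings; only its position relative to
the base letter changes, which is the sense in which the character set
decides the interpretation.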

I think you are reading too much meaning into the characters.
This would be like always assigning the BACKSPACE character the
meaning it has in ISO 646, where it can be used to make combined
characters, or like saying that some control characters always
mean something if they come in a special sequence.

RFC-CHAR does not go that far, currently.
RFC-CHAR only specifies what characters are at what codepoints.
The other thing is that RFC-CHAR does not cover 10646 nor UNICODE,
partly due to their unfinished state.
The third thing is that I plan to include a more
elaborate description of T.61 etc. with the allowed combinations
that it has. T.61 only allows certain combinations of
floating diacritics and letters; a combination of RING-ABOVE and
<i> is not valid, for instance.
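
A restriction like this could be expressed as a simple lookup table. The
Python sketch below is only an illustration: the table name
ALLOWED_COMBINATIONS, the function is_valid_combination and the listed
letters are assumptions chosen for the example, not the actual repertoire
from T.61.

```python
# Hypothetical, deliberately incomplete table: diacritic name mapped to
# the set of base letters it may combine with. The entries are
# illustrative assumptions, not copied from T.61.
ALLOWED_COMBINATIONS = {
    "RING-ABOVE": {"A", "a", "U", "u"},
}


def is_valid_combination(diacritic, base):
    """Return True if the (diacritic, base letter) pair is in the table."""
    return base in ALLOWED_COMBINATIONS.get(diacritic, set())


print(is_valid_combination("RING-ABOVE", "a"))  # listed in the table
print(is_valid_combination("RING-ABOVE", "i"))  # not listed, so rejected
```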

One additional minor glitch -- the definition of '", the mnemonic for
double acute accent, appears to be missing from the current draft.

This was a fault in the troff formatting; it will be there in the next
draft, coming real soon now...

