Re: [Encode] How to support (Apple's) compound Unicode characters?

Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp> writes:

On Monday, April 1, 2002, at 07:33 , Nick Ing-Simmons wrote:

Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp> writes:

  I think I have found the reason why some of the encodings were missing
from Tcl's *.enc, which later turned into *.ucm.
  Apple uses Unicode compound characters very extensively, which
doesn't go well with .ucm, not to mention *.enc.


encengine can convert UTF-8 sequences that represent sequences of
characters, but .ucm would need tweaking to allow
multiple <UNNNN>:

<UNNNN><UMMMM> \xYYYY


  I have recently found this undocumented feature but dared not use it.


I was not aware it was actually implemented ;-)

  I think it would look better if it were written as

<UNNNN+UMMMM> \xYY\xYY ....


I don't like the <UNNNN+UMMMM> part; it will make the parsing messier.
The \xYY\xYY is of course what I meant ;-)
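
For illustration only, here is a minimal Python sketch of how a line in the
proposed multi-codepoint form (<UNNNN><UMMMM> \xYY\xYY) might be parsed.
This is not the Encode or encengine code: the function name parse_ucm_line
is hypothetical, the byte values in the usage note are arbitrary
placeholders rather than entries from any Apple table, and the optional
trailing |N flag is assumed to follow the convention of ordinary .ucm
files.

```python
import re

# Assumed line shape: one or more <UXXXX> tokens, whitespace, one or more
# \xYY bytes, then an optional |N flag. Anything after '#' is a comment.
_CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,6})>")
_BYTE = re.compile(r"\\x([0-9A-Fa-f]{2})")
_LINE = re.compile(
    r"^((?:<U[0-9A-Fa-f]{4,6}>)+)"      # one or more Unicode code points
    r"\s+((?:\\x[0-9A-Fa-f]{2})+)"      # one or more encoded bytes
    r"(?:\s+\|(\d))?\s*$"               # optional |N flag
)

def parse_ucm_line(line):
    """Return (unicode_string, byte_string, flag), or None if no mapping."""
    line = line.split("#", 1)[0].strip()  # drop a trailing comment
    m = _LINE.match(line)
    if m is None:
        return None
    chars = "".join(chr(int(h, 16)) for h in _CODEPOINT.findall(m.group(1)))
    raw = bytes(int(h, 16) for h in _BYTE.findall(m.group(2)))
    flag = int(m.group(3)) if m.group(3) else 0
    return chars, raw, flag
```

Calling parse_ucm_line(r"<U0041><U0301> \xE7\x41 |0") yields a two-character
string (A followed by a combining acute accent) paired with a two-byte
sequence. Because each <UNNNN> token is self-delimiting, the regular
expression needs no extra separator rule, which is one way to read the
"messier parsing" objection to the <UNNNN+UMMMM> form.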


  It won't take much effort to fix it.  I think I can work it out myself.
Should we feed this back to IBM?


Why not?

We would have to be "sure" that Unicode was normalized as well.


  Right.  This is rather a tough part, but Apple is one of the loudest
advocates of Unicode, so I *think* their map is correct.
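
To make the normalization concern concrete, here is a small Python sketch
using the standard unicodedata module. It only shows why a table keyed on
compound sequences needs normalized input; the helper name
same_after_normalization is hypothetical, and the choice of NFD as the
default form is an assumption, not something the thread settles.

```python
import unicodedata

def same_after_normalization(a, b, form="NFD"):
    """True if a and b are equal once both are normalized to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

precomposed = "\u00e9"    # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT

# The raw strings differ, so a lookup keyed on one spelling would miss
# the other unless the input is normalized first.
raw_equal = precomposed == decomposed
normalized_equal = same_after_normalization(precomposed, decomposed)
```

Here raw_equal is False while normalized_equal is True, so a mapping entry
written as <U0065><U0301> would only match text that arrives in decomposed
form.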

Dan the Encode Maintainer

-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/