Peter> Uncomfortable to say the least. Could a surrogate scalar encoding
Peter> be done as an escaped encoding where the high and low pairs are put
Peter> into the .enc files as HHHHLLLL where both H and L =~ /[0-9A-F]/?
Peter> hence necessitating a shift to reading 8 characters (possibly
Peter> implemented using the "E" mechanism?).
Yes. A surrogate pair stored as-is is the UTF-16 encoding of the character.
If you combine the high and low halves according to the Unicode surrogate
formula, you get a single scalar value, which is the UTF-32 encoding.
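
For concreteness, here is a rough sketch of the surrogate formula applied to
the HHHHLLLL form proposed above. It is in Python only for illustration; the
function names and the 8-character field layout are assumptions made for this
example, not part of any existing .enc format or of the "E" mechanism.

```python
import re

# Hypothetical escaped form: 4 hex digits for the high surrogate followed
# by 4 hex digits for the low surrogate, upper case only (/[0-9A-F]/).
HHHHLLLL = re.compile(r"[0-9A-F]{8}")


def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into one Unicode scalar value."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("not a high surrogate: %04X" % high)
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("not a low surrogate: %04X" % low)
    # Each half carries 10 bits; the result is offset past the BMP.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def decode_escaped(field):
    """Decode an 8-character HHHHLLLL field into a scalar value."""
    if not HHHHLLLL.fullmatch(field):
        raise ValueError("expected 8 upper-case hex digits: %r" % field)
    return combine_surrogates(int(field[:4], 16), int(field[4:], 16))


print(hex(decode_escaped("D83DDE00")))  # prints 0x1f600
```

Reading the pair as two separate 16-bit values keeps it UTF-16; only the
combined result is the UTF-32 scalar.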
Peter> How firmly established is the Tcl scheme? Is it still being
Peter> hammered out? I do think that it would be nice to avoid yet
Peter> another gratuitous file format incompatibility if possible. So how
Peter> do the Tcl folks plan to handle surrogates or truly unrecognized
Peter> characters?
I don't know. I last used Tcl/Tk in the days of tcl7.?/tk4.? and haven't had
time to play with anything newer. I do prefer Perl :-)
-----------------------------------------------------------------------------
Mark Leisher
Computing Research Lab
New Mexico State University
Box 30001, Dept. 3CRL
Las Cruces, NM 88003

Cinema, radio, television, magazines are a school of inattention:
people look without seeing, listen without hearing.
    -- Robert Bresson