perl-unicode

Re: 5.8 roadmap and Encode

2002-03-03 04:49:41
On Sat, Mar 02, 2002 at 07:15:09PM +0000, Nick Ing-Simmons wrote:
> There are some other warnings running compile without -Q,
> e.g. the attached.

Yes, that's a known issue. GNU libiconv's test/ lists CP950, EUC-TW,
ISO-IR-165, BIG5-HKSCS as four encodings containing many-to-one
(irreversible) mappings.

> It seems that some of these encodings are not round-trip safe.
> One reason for preferring .ucm is that by declaring one of multiple
> mapped chars a fallback, one can get the "right" thing, e.g. for
> <U00F3>: is that 2B2E or 282E?

Understood. But if '.enc' will consistently default to the larger one,
it seems good enough to me, as there is no well-defined "right" behaviour
as far as I know.

(in this particular case, as the 2B plane is the larger half-width pinyin
 character, it's also arguably the right thing.)
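
To make the two policies above concrete, here is a minimal sketch in Python
(chosen only for illustration; this is not how Encode or enc2xs is
implemented). The two byte codes for U+00F3 come from this thread; the table
layout, the extra entry, and the names DECODE_TABLE and build_encode_map are
assumptions made up for the example.

```python
# Hypothetical excerpt of a decode table: byte code -> Unicode code point.
# 0x282E and 0x2B2E are the two candidates for U+00F3 named in the thread.
DECODE_TABLE = {
    0x2121: 0x3000,  # illustrative unambiguous entry, for contrast only
    0x282E: 0x00F3,
    0x2B2E: 0x00F3,
}


def build_encode_map(decode_table, fallbacks=frozenset()):
    """Invert a many-to-one decode table into an encode map.

    Byte codes listed in `fallbacks` still decode, but are never chosen
    when encoding (the .ucm-style explicit choice). Without any fallbacks,
    the numerically larger byte code wins, because codes are visited in
    ascending order and later entries overwrite earlier ones. A code point
    whose byte codes are all fallbacks gets no entry at all.
    """
    encode_map = {}
    for code in sorted(decode_table):
        if code in fallbacks:
            continue
        encode_map[decode_table[code]] = code
    return encode_map


larger_wins = build_encode_map(DECODE_TABLE)
explicit = build_encode_map(DECODE_TABLE, fallbacks={0x2B2E})
```

Here larger_wins maps U+00F3 to 0x2B2E, while explicit maps it to 0x282E
because 0x2B2E was declared a fallback. Either way the result is
deterministic; the difference is only whether the table author or the
tie-break rule makes the choice.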

> Test case?

Against the table itself, I suppose? Ok, I'll write one.
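
A test "against the table itself" could look roughly like the sketch below.
It is in Python purely for illustration, not the test that was eventually
written; the sample table and the function names are hypothetical, and the
"larger byte code wins" rule is assumed as the encode policy.

```python
# Hypothetical sample: byte code -> Unicode code point.
SAMPLE_TABLE = {
    0x2121: 0x3000,  # illustrative unambiguous entry
    0x282E: 0x00F3,  # the two candidates from the thread
    0x2B2E: 0x00F3,
}


def find_irreversible(decode_table):
    """Return {code point: [byte codes]} for many-to-one mappings."""
    by_char = {}
    for code, char in decode_table.items():
        by_char.setdefault(char, []).append(code)
    return {char: sorted(codes)
            for char, codes in by_char.items() if len(codes) > 1}


def round_trip_failures(decode_table):
    """Return byte codes that do not survive decode followed by encode,
    assuming the encoder always picks the largest byte code."""
    encode_map = {}
    for code, char in decode_table.items():
        encode_map[char] = max(code, encode_map.get(char, code))
    return sorted(code for code, char in decode_table.items()
                  if encode_map[char] != code)


collisions = find_irreversible(SAMPLE_TABLE)
failures = round_trip_failures(SAMPLE_TABLE)
```

For the sample table, collisions reports U+00F3 with both byte codes, and
failures contains only 0x282E, the code that loses the tie-break.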

/Autrijus/

