perl-unicode

Re: [perl #22111] perl::Encode doesn't handle UTF-8 NFD strings

2003-05-13 05:30:28
On 2003-05-06 at 15:58 +0300 Jarkko Hietaniemi sent off:
Well, see: from_to claims to convert from encoding1 to encoding2. encoding1 in this case is UTF-8. Non-composed UTF-8 is also perfectly valid UTF-8, and there is absolutely no reason why from_to($string,"utf8","latin1") should not work just because I used the NFD form and not the NFC form. Your example is just a way to work

You are assuming the equivalence of (pre)composed characters and
their decomposed forms.  Perl doesn't do this at any level.

You are not right.

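use utf8;                 # the literal below is UTF-8 source text
use Unicode::Normalize;   # exports NFD and NFC

$string = "äpfel";
$string_nfd = NFD($string);   # decomposed: "a" followed by U+0308
$string_nfc = NFC($string);   # precomposed: U+00E4
# Once both sides are in the same normalization form, they compare equal.
if (NFC($string_nfd) eq $string_nfc) {
        print "This will be printed!\n";
}
# eq compares code points, and NFD of an NFD string is still NFD.
if (NFD($string_nfd) eq $string_nfc) {
        print "This will *not* be printed!\n";
}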

I still say that this is a bug and Encode should be able to convert NFD("Äpfel") to latin1. If you say it shouldn't, that is like saying an English translator shouldn't be able to translate American English just because it has a few different words than the British use.
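
For illustration, here is a minimal sketch of the conversion I am asking for, done by hand: decode the UTF-8 octets, normalize to NFC with Unicode::Normalize, then encode to Latin-1. The helper name utf8_to_latin1 is hypothetical and not part of Encode; this is only how a caller could get the result today, not a claim about how Encode should implement it.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Unicode::Normalize qw(NFC NFD);

# Hypothetical helper (not part of Encode): convert UTF-8 octets,
# in either NFC or NFD form, to Latin-1 octets.
sub utf8_to_latin1 {
    my ($octets) = @_;
    my $chars = decode('UTF-8', $octets);    # octets -> Perl characters
    my $nfc   = NFC($chars);                 # compose "a" + U+0308 into U+00E4
    return encode('iso-8859-1', $nfc);       # characters -> Latin-1 octets
}

# "\x{E4}pfel" is "äpfel"; the escape avoids depending on the source encoding.
my $nfd_octets = encode('UTF-8', NFD("\x{E4}pfel"));   # 61 CC 88 70 66 65 6C
my $nfc_octets = encode('UTF-8', NFC("\x{E4}pfel"));   # C3 A4 70 66 65 6C

my $from_nfd = utf8_to_latin1($nfd_octets);
my $from_nfc = utf8_to_latin1($nfc_octets);

print "both forms give the same Latin-1 octets\n" if $from_nfd eq $from_nfc;
```

The only difference from a plain from_to($string,"utf8","latin1") is the NFC step in the middle, which is needed because Latin-1 has no code for the combining diaeresis U+0308 on its own.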