perl-unicode

Re: Converting string to UTF-16LE

2004-03-02 09:30:06

On Mon, 01 Mar 2004 20:55:14 +0000
Nick Ing-Simmons <nick(_at_)ing-simmons(_dot_)net> wrote:

Larry Wall <larry(_at_)wall(_dot_)org> writes:
On Wed, Feb 25, 2004 at 06:19:02PM +0100, Sebastian Lehmann wrote:
: For this example the search value will be "Ibañez". Because the search
: isn't case-sensitive, all letters should be uppercased, using the uc
: method.

I don't think this is your problem, but in general I think it's better
to canonicalize with lc() because it will try to undo both uppercase
and titlecase.

Since you are here ;-)

Why does ñ not uppercase to Ñ ?

Which bits of which Unicode.org files are used by uc()?

lib/unicore/To/Upper.pl includes a toupper mapping of ñ to Ñ properly.

00EE            00CE
00EF            00CF
00F0            00D0
00F1            00D1 <--HERE
00F2            00D2
00F3            00D3
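
The same simple mapping can be cross-checked from Perl itself. The following is only a sketch, assuming the core Unicode::UCD module is installed; it reports what the Unicode character database says and does not show which file uc() consults internally.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charinfo);

# Look up the database entry for U+00F1 (LATIN SMALL LETTER N WITH TILDE).
my $info = charinfo(0xF1);

# 'code' and 'upper' are hex strings; 'upper' is the simple uppercase mapping.
printf "U+%s %s -> upper U+%s\n",
    $info->{code}, $info->{name}, $info->{upper};
```

The 'upper' field should agree with the 00F1 line of the table above.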

If the string is not flagged as UTF-8 (!SvUTF8()) and the C locale is in use,
uc() does not map "\xF1" to any other character.

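# Byte string without the UTF8 flag: uc() may leave "\xF1" unchanged.
print uc(chr(0xF1)) eq chr(0xD1)
    ? "ok" : "not ok", " in ASCII\n";  # maybe not ok

# pack('U', ...) yields a UTF8-flagged string, so the Unicode mapping applies.
print uc(pack('U', 0xF1)) eq pack('U',0xD1)
    ? "ok" : "not ok", " in utf8\n";   # of course ok

# Appending a character above 0xFF forces the whole string to UTF-8.
print uc(chr(0xF1)."\x{FEFF}") eq chr(0xD1)."\x{FEFF}"
    ? "ok" : "not ok", " with ZWNBSP\n";   # ok, it's unicodified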

Regards,
SADAHIRO Tomoyuki