Chris Hall wrote 2008-03-11 18:48 (+0000):
I'm comfortable with the notion that perl characters are unsigned
integers that overlap UCS, and happen to be held internally as a
superset of UTF-8.
I wonder if perl is completely comfortable.
It isn't. There are some very unfortunate "features".
chr(n) throws various runtime warnings where 'n' isn't kosher UCS, and
"\x{h...h}" throws the same ones at compile time.
(...) I'm not sure I see the point of picking on a few values to warn
about.
I don't see the point, but Perl's warnings are arbitrary in several
ways. Abigail has a lightning talk about the "interpreted as function"
warning that illustrates this.
In any case, is chr(n) supposed to be utf8 or UTF-8? AFAIKS, it's
neither.
It's supposed to be neither on the outside. Internally, it's utf8.
If chr(-1) doesn't exist, then undef looks like a reasonable
return value -- returning "\x{FFFD}" makes chr(-1)
indistinguishable from chr(0xFFFD) -- where the first is
nonsense and the second is entirely proper.
0xFFFD is the Unicode equivalent of undef. I think it makes sense in
this case.
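
For reference, a minimal sketch of the behaviour under discussion, assuming
a perl where chr() of a negative number yields U+FFFD outside "use bytes"
(as perlfunc documents). The variable names are only illustrative.

```perl
use strict;
use warnings;

# chr() of a negative number gives U+FFFD (REPLACEMENT CHARACTER),
# so the result cannot be told apart from a deliberate chr(0xFFFD).
my $from_negative = do {
    no warnings;    # silence the warning that chr(-1) emits
    chr(-1);
};
my $replacement = chr(0xFFFD);

printf "chr(-1)     = U+%04X\n", ord $from_negative;
printf "chr(0xFFFD) = U+%04X\n", ord $replacement;
print  "indistinguishable\n" if $from_negative eq $replacement;
```

A caller who needs to tell the two apart has to check the argument before
calling chr(), because the return value alone carries no hint.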
Could you please report this bug with perlbug?
Done.
Cheers.
--
Met vriendelijke groet, Kind regards, Korajn salutojn,
Juerd Waalboer: Perl hacker <#####(_at_)juerd(_dot_)nl>
<http://juerd.nl/sig>
Convolution: ICT solutions and consultancy
<sales(_at_)convolution(_dot_)nl>