perl-unicode

Re: Interpretation of non-UTF8 strings

2004-08-16 08:30:10
Marcin 'Qrczak' Kowalczyk wrote:

On Mon, 16-08-2004, at 15:32 +0100, Nick Ing-Simmons wrote:


Once we had 

use encoding qw(locale);

But it did not work well, as not all locale implementations
provide an API to return the encoding.


Yes, authors of C APIs about encodings are to blame; not everyone
supports nl_langinfo(CODESET), and iconv is not everywhere. Anyway,
lib/open.pm tries to guess when it's not available, so the same can be
applied to the encoding pragma.
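
For illustration, here is a minimal sketch of the kind of guess that can be made from the environment when nl_langinfo(CODESET) is missing. It is not a copy of the algorithm in lib/open.pm; the helper names codeset_from_locale_name and guess_codeset_from_env are hypothetical, and the sketch assumes POSIX-style locale names of the form language_TERRITORY.codeset@modifier.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the codeset out of a POSIX-style locale name
# such as "en_GB.ISO-8859-15@euro". Returns undef when the name carries
# no explicit codeset (for example plain "en_GB" or "C").
sub codeset_from_locale_name {
    my ($name) = @_;
    return undef unless defined $name;
    return $name =~ /^[^.@]*\.([^@]+)/ ? $1 : undef;
}

# Hypothetical helper: the first non-empty variable among LC_ALL,
# LC_CTYPE and LANG decides the character-type locale, so only that
# one is inspected. The environment is passed in as a hash reference.
sub guess_codeset_from_env {
    my ($env) = @_;
    for my $var (qw(LC_ALL LC_CTYPE LANG)) {
        next unless defined $env->{$var} && length $env->{$var};
        return codeset_from_locale_name($env->{$var});
    }
    return undef;
}

my $guess = guess_codeset_from_env(\%ENV);
print( (defined $guess ? $guess : 'unknown'), "\n" );
```

A bare name such as en_GB yields no answer at all, which is exactly the case where any guess is unreliable.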

Yes, the guessing algorithm is even documented (horrors!).

(And even en_GB can be in ASCII, 8859-1, 8859-15 (with euro), UTF-8, ...)


If nl_langinfo(CODESET) is available, it will tell the correct encoding
without having to know what en_GB means. I would understand if Perl's
use encoding(locale) doesn't work on systems where it's hard to guess
the locale encoding. Personally I don't care about Unicode on systems
which don't support the necessary standard APIs, but I do care about
ones that do.
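
As a rough sketch of what asking the C library looks like from Perl: the core module I18N::Langinfo wraps nl_langinfo(), and langinfo(CODESET) returns the name of the encoding of the current locale. The wrapper name locale_codeset is hypothetical, and the eval is an assumption about how one might cope with platforms where the call is not available.

```perl
use strict;
use warnings;

# Hypothetical wrapper: return the codeset reported by the C library for
# the user's locale, or undef if the platform cannot tell us.
sub locale_codeset {
    my $codeset = eval {
        require POSIX;
        require I18N::Langinfo;
        # Adopt the locale from the environment; otherwise we stay in "C".
        POSIX::setlocale(POSIX::LC_CTYPE(), '');
        I18N::Langinfo::langinfo(I18N::Langinfo::CODESET());
    };
    return undef unless defined $codeset && length $codeset;
    return $codeset;
}

my $codeset = locale_codeset();
print( (defined $codeset ? $codeset : 'unknown'), "\n" );
```

Note that nothing here needs to know what en_GB means; the answer comes from whatever locale the system actually selected.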

And personally I don't care about people not caring about all systems :-)

--
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> http://www.iki.fi/jhi/
"There is this special biologist word we use for 'stable'.  It is 'dead'." -- Jack Cohen
