Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> writes:
Nick Ing-Simmons wrote:
Once we had
use encoding qw(locale);
But it did not work well, as not all locale implementations
provide an API to return the encoding.
(And even en_GB can be in ASCII, 8859-1, 8859-15 (with euro), UTF-8, ...)
True.
For the open :locale I opted for an easy (cheesy?) algorithm:
(1) if we have langinfo(), use the return value of langinfo(CODESET).
(2) if we do not have langinfo(), look at %ENV for locale variables
and look at the part after the dot, and use that value.
(3) Use the value from either (1) or (2) and if Encode recognizes that,
good. Otherwise give up.
Or something like that. (It's documented in the open pragma, somewhere).
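
A rough sketch of steps (1) to (3), for illustration only. This is not the
code in open.pm: the helper names are made up, and the list of environment
variables and the order they are checked in are assumptions.

```perl
use strict;
use warnings;
use Encode ();

# Step (1): ask langinfo(CODESET), if I18N::Langinfo is usable here.
# Returns undef when the module or the call is not available.
sub codeset_from_langinfo {
    my $codeset = eval {
        require I18N::Langinfo;
        I18N::Langinfo::langinfo(I18N::Langinfo::CODESET());
    };
    return (defined $codeset && length $codeset) ? $codeset : undef;
}

# Step (2): take the part after the dot from the first locale variable
# that has one. Variable list and order are assumptions of this sketch.
sub codeset_from_env {
    my ($env) = @_;
    for my $var (qw(LC_ALL LC_CTYPE LANG)) {
        my $value = $env->{$var};
        next unless defined $value;
        # "en_GB.ISO-8859-15@euro" yields "ISO-8859-15"; no dot, no match.
        return $1 if $value =~ /\.([^.\@]+)/;
    }
    return;
}

# Step (3): accept the name only if Encode recognizes it, else give up.
sub resolve_encoding {
    my ($name) = @_;
    return unless defined $name;
    my $enc = Encode::find_encoding($name);
    return $enc ? $enc->name : undef;
}

# Hypothetical driver tying the three steps together.
sub guess_locale_encoding {
    my ($env) = @_;
    my $codeset = codeset_from_langinfo();
    $codeset = codeset_from_env($env) unless defined $codeset;
    return resolve_encoding($codeset);
}
```

Calling guess_locale_encoding(\%ENV) gives either an Encode name or undef,
and undef is the "give up" case.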
And I was mis-remembering which module it was that had 'locale'.
As I just posted, I think it makes sense that the argument to
use encoding ()
is literal, as it affects the strings in the code below it - after all
the strings are in an encoding (determined by the author), while the
locale varies with how the user is running it.
So in my speech synthesis stuff I had:
use encoding qw(iso-8859-15);
And then it worked right even if I happened to run it in en_GB.utf8
that day.
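
A small sketch of why the fixed encoding matters, using Encode::decode
directly rather than the encoding pragma (that substitution is mine, for
illustration). The byte value is just an example: 0xA4 is the euro sign in
iso-8859-15 but the generic currency sign in iso-8859-1.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The same byte, as it might sit in a string literal in the source file.
my $byte = "\xA4";

# Decoding with an explicitly named encoding does not consult LANG or
# LC_ALL, so the result is the same whatever locale the user runs in.
my $as_latin9 = decode('iso-8859-15', $byte);   # U+20AC EURO SIGN
my $as_latin1 = decode('iso-8859-1',  $byte);   # U+00A4 CURRENCY SIGN

printf "iso-8859-15: U+%04X\n", ord $as_latin9;
printf "iso-8859-1:  U+%04X\n", ord $as_latin1;
```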
(By the way I re-encoded to UTF-8 and changed that to 'use utf8',
it all still works but needs much more memory and is slower.
Seems to be the way regexps work - so will probably switch back.)