perl-unicode

Re: My favorite bug to fix for 5.8.0

2002-03-10 11:30:29
Nick Ing-Simmons wrote on 2002-03-10 14:07 UTC:
> Can we Configure test for working langinfo(CODESET), and if good enough
> make :locale the default

I think that is a good approach. If an environment has a working
langinfo(CODESET) implementation, this usually means that locales are
well supported and widely used. Today this is at least the case under
Linux (since glibc 2.2) and under Solaris; I haven't touched anything
else recently.

It is also what the latest Tcl release does, what Emacs 22 will do, etc.
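
For illustration, the core of such a Configure probe might look roughly
like the sketch below. It assumes the POSIX spelling nl_langinfo() from
<langinfo.h> for what is called langinfo(CODESET) above; the function
name probe_codeset is made up for this example and is not part of Perl's
Configure.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical probe: returns the codeset name reported for the locale
 * selected by the environment, or NULL if nl_langinfo(CODESET) gives
 * nothing usable. */
const char *probe_codeset(void)
{
    const char *cs;

    /* Adopt the user's locale; if this fails we stay in the "C" locale. */
    setlocale(LC_CTYPE, "");

    cs = nl_langinfo(CODESET);
    if (cs == NULL || cs[0] == '\0')
        return NULL;   /* treat an empty answer as "not working" */
    return cs;
}
```

A probe program could call this from main() and exit with status 0 when
the result is non-NULL, so that Configure only needs to check whether the
program compiles, links and succeeds.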

On systems where langinfo(CODESET) does not work, you could consider using
the following emulator routine, which extracts encoding names from
environment variables:

  http://www.cl.cam.ac.uk/~mgk25/ucs/langinfo.c
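
To give an idea of the technique, here is a minimal sketch of such an
environment-based fallback. It is not the code from langinfo.c; the
function names are hypothetical, and it assumes the usual POSIX
precedence LC_ALL, LC_CTYPE, LANG and locale names of the form
language_TERRITORY.codeset@modifier.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the codeset part of a locale name such as
 * "de_DE.UTF-8@euro" into buf. Returns buf, or NULL if the name has no
 * codeset part or it does not fit. */
const char *codeset_from_locale_name(const char *name, char *buf, size_t size)
{
    const char *start, *end;
    size_t len;

    if (name == NULL || size == 0)
        return NULL;
    start = strchr(name, '.');
    if (start == NULL)
        return NULL;              /* e.g. "C" or "de_DE": no codeset given */
    start++;                      /* skip the '.' */
    end = strchr(start, '@');     /* an optional "@modifier" ends the codeset */
    len = end ? (size_t)(end - start) : strlen(start);
    if (len == 0 || len >= size)
        return NULL;
    memcpy(buf, start, len);
    buf[len] = '\0';
    return buf;
}

/* Hypothetical emulator: the first variable that is set and non-empty
 * decides, in the order LC_ALL, LC_CTYPE, LANG. */
const char *emulated_codeset(char *buf, size_t size)
{
    static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    size_t i;

    for (i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        const char *value = getenv(vars[i]);
        if (value != NULL && value[0] != '\0')
            return codeset_from_locale_name(value, buf, size);
    }
    return NULL;
}
```

A NULL result means that no encoding could be determined, and the caller
has to fall back on some default of its own.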

The return values of langinfo(CODESET) are unfortunately not yet
standardized. For UTF-8, fortunately, every known POSIX platform uses
"UTF-8" and nothing else, but there is some variation for older
encodings, and the normalization routine

  http://www.cl.cam.ac.uk/~mgk25/ucs/norm_charmap.c

takes care of this and turns every known langinfo(CODESET) value into
the corresponding MIME charset name.
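
The idea behind such a normalization can be sketched with a small alias
table. The following is only an illustration and is not taken from
norm_charmap.c: the function name and the handful of table entries are
my own choice for this example, and a real table has to cover many more
names.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative alias table only, not a complete list. */
static const struct {
    const char *alias;   /* value as returned by some platform */
    const char *mime;    /* corresponding MIME charset name */
} charset_aliases[] = {
    { "ANSI_X3.4-1968", "US-ASCII"   },  /* glibc in the "C" locale */
    { "646",            "US-ASCII"   },  /* Solaris in the "C" locale */
    { "ISO8859-1",      "ISO-8859-1" },
    { "ISO_8859-1",     "ISO-8859-1" },
    { "eucJP",          "EUC-JP"     },
};

/* Hypothetical normalizer: map a known alias to its MIME name and pass
 * every other name (for example "UTF-8") through unchanged. */
const char *mime_charset_sketch(const char *name)
{
    size_t i;

    if (name == NULL)
        return NULL;
    for (i = 0; i < sizeof charset_aliases / sizeof charset_aliases[0]; i++) {
        if (strcmp(name, charset_aliases[i].alias) == 0)
            return charset_aliases[i].mime;
    }
    return name;
}
```

Passing unknown names through unchanged keeps the sketch harmless for
values that are already MIME names.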

More on such things at

  http://www.cl.cam.ac.uk/~mgk25/unicode.html#activate

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>