perl-unicode

Re: My favorite bug to fix for 5.8.0

2002-03-11 12:08:57
Markus Kuhn <Markus(_dot_)Kuhn(_at_)cl(_dot_)cam(_dot_)ac(_dot_)uk> writes:
Nick Ing-Simmons wrote on 2002-03-11 12:08 UTC:
 http://www.cl.cam.ac.uk/~mgk25/ucs/langinfo.c

For perl I think that we would want to treat "C" and "POSIX" as meaning
iso-8859-1 rather than ASCII.
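
A minimal sketch of that policy, in Python rather than Perl and purely for illustration. The helper name effective_encoding is hypothetical and is not part of perl or of langinfo.c; it assumes the caller has already obtained the locale name and the codeset the C library reports for it (for example from nl_langinfo(CODESET)).

```python
def effective_encoding(locale_name, reported_codeset):
    """Hypothetical policy helper: pick the encoding to assume for I/O.

    Treats the "C" and "POSIX" locales as ISO 8859-1 regardless of what
    the C library reports; every other locale keeps its reported codeset.
    """
    if locale_name in ("C", "POSIX"):
        return "iso-8859-1"
    return reported_codeset
```

With this policy, effective_encoding("C", "ANSI_X3.4-1968") gives "iso-8859-1", while a locale such as "en_GB.UTF-8" keeps whatever codeset was reported for it.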

Decide yourself. But understand that sooner or later, the "C" and
"POSIX" locales will be extended from "ASCII" to "UTF-8", and then you
will be faced with a backwards incompatible change if you had ISO 8859-1
in "C" so far. Backwards compatibility with a future extension of "C"/
"POSIX" to "UTF-8" is the reason why under glibc 2.2, "C" is
explicitly ASCII and not "ISO 8859-1" at the moment.

But we have a pile of legacy stuff in US and UK with no locale set at all
(which defaults to "C" IIRC) - which has been happily processing iso-8859-1
HTML etc. for years. To suddenly have it barf on £3 or 49¢ is not
acceptable.
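
To make the disagreement concrete, here is a small sketch (Python, for illustration only; try_decode is a hypothetical helper) of what happens to such legacy data under each candidate meaning of "C". The sample bytes are the pound and cent amounts as they would be stored in ISO 8859-1, i.e. with 0xA3 and 0xA2.

```python
# "£3 or 49¢" as stored by a legacy ISO 8859-1 program.
legacy = b"\xa33 or 49\xa2"

def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# Decode the same bytes under each interpretation of the "C" locale.
results = {enc: try_decode(legacy, enc) for enc in ("ascii", "iso-8859-1", "utf-8")}
```

Only the iso-8859-1 entry holds decoded text; the ascii and utf-8 entries are None, because 0xA3 is outside ASCII and is not a valid lead byte in UTF-8. Pure ASCII input decodes identically under all three, which is why an ASCII-only "C" can later be extended to UTF-8 without breaking anything, whereas an ISO 8859-1 "C" cannot.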



Markus
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/