perl-unicode

Re: Reading/writing non-Unicode files with perl5.8?

2003-01-14 04:30:05
On Mon, Jan 13, 2003 at 03:35:37PM -0800, Deneb Meketa wrote:
> I'm a longtime 5.005/5.6.1 user.  I recently upgraded my
> Linux system to RH8.0 and got perl5.8 in the bargain.  I
> have many perl scripts that read or write non-Unicode files,
> mostly ANSI files.  Many of those scripts have broken,
> seemingly because of Unicode-forcing behavior in perl5.8.
>
> (It is possible that some other part of my system upgrade is
> responsible, like maybe my shell; if anyone knows of some
> kind of system-wide Unicode infestation that could be the
> cause of these problems, please let me know!)

RedHat 8.0 sets a UTF8 locale by default.
Under a UTF8 locale, perl5.8 switches to Unicode mode and treats file
input and output as UTF8, because perl assumes that you set the locale
deliberately.

> three bytes.  I understand the Unicode translation that is
> happening here, I just don't want it!
>
> What I'm reading is not a UTF-8 file - it's an ANSI file!
> Is there some way to tell perl to just read the bytes without
> translation?

Changing your locale to one that is not UTF8 should stop all the
translations. (Make sure that none of the environment variables LANG,
LANGUAGE, LC_ALL and LC_CTYPE contains a string matching /utf-?8/i.)
I don't know what sets these variables systemwide on RedHat, so I don't
know how to change them there.
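
The check described above can be sketched in Perl roughly as follows.
This is only an illustration: the sub names utf8_locale_vars and
read_bytes are made up for this example. The second sub is a different
technique from changing the locale: it asks for the ":raw" layer on a
single filehandle, which should give plain bytes for that handle
whatever the locale says. It is shown here as an assumption worth
testing on your own perl, not as something confirmed in this thread.

```perl
use strict;
use warnings;

# Return the names of the locale variables whose value mentions UTF-8.
# Takes a hash reference, so it can be run against %ENV or a test hash.
sub utf8_locale_vars {
    my ($env) = @_;
    return grep {
        defined $env->{$_} && $env->{$_} =~ /utf-?8/i
    } qw(LANG LANGUAGE LC_ALL LC_CTYPE);
}

# Read a whole file as bytes by asking for the :raw layer explicitly
# (hypothetical helper; assumes three-argument open with layers, 5.8+).
sub read_bytes {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;                 # slurp mode: read everything at once
    my $data = <$fh>;
    close($fh);
    return $data;
}

# Report which variables, if any, select a UTF-8 locale right now.
my @hits = utf8_locale_vars(\%ENV);
print @hits
    ? "UTF-8 locale selected by: @hits\n"
    : "No UTF-8 locale variables found\n";
```

To try a script under a non-UTF8 locale without touching the system
settings, it can be started as "LC_ALL=C perl script.pl", since LC_ALL
takes precedence over LC_CTYPE and LANG for that one process.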

My personal opinion is that it was premature of RedHat to make RedHat 8.0
*default* to using UTF8 locales, given the general state of UTF8 support
in most programs running on Linux. Others may disagree.

Nicholas Clark
