perl-unicode

Re: perlunicode comment - when Unicode does not happen

2003-12-28 09:30:04
Jungshik Shin <jshin(_at_)mailaps(_dot_)org> writes:

 Then, he should switch to en_GB.UTF-8. 

I probably will.

Besides, he implied that
he still uses ISO-8859-1 for files whose names can be covered by
ISO-8859-1, which is why I wrote about mixing up two encodings
in a single file system _under_ his control.

There is a tendency for programs to assume that the locale's encoding
is used for the contents of the file. In the UK there are a LOT of files
which are not UTF-8 but iso8859-1 or iso8859-15.
As the "RedHat cannot build perl/Tk" e-mail barrage proves, this is a
rash assumption. If I leave my locale as an 8859-1 one then octet == char
assumptions are "mostly harmless". If I switch to a UTF-8 locale and
a stupid program dies because I spelt naïve correctly in 8859-1
and that is a UTF-8 coding violation, I don't gain much.
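
To make the violation concrete, here is a minimal sketch, assuming the Encode
module that ships with perl 5.8. The variable names are only illustrative. It
takes the word written with its diaeresis as ISO-8859-1 octets and tries a
strict UTF-8 decode, which is roughly what a program does when it trusts a
UTF-8 locale for file contents.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# The word with i-diaeresis as ISO-8859-1 octets: the accented
# letter is the single octet 0xEF.
my $latin1 = "na\xEFve";

# A strict UTF-8 decode of those octets fails: 0xEF announces a
# three-octet sequence, but "v" and "e" are not continuation octets.
my $copy = $latin1;   # decode() with a CHECK value may modify its input
my $ok = eval { decode('UTF-8', $copy, FB_CROAK); 1 };
print $ok ? "valid UTF-8\n" : "UTF-8 coding violation\n";

# Decoding the same octets as ISO-8859-1 cannot fail, because every
# octet value maps to a character.
my $chars = decode('ISO-8859-1', $latin1);
printf "%d characters\n", length $chars;
```

Under an 8859-1 locale the second decode is what effectively happens, which is
why the octet == char assumption stays harmless there.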


 Moreover, why would you think that en_GB.UTF-8 locale gives him the
time and date format NOT suitable for him? You're making a mistake of
binding locale and encoding. Encoding should never be a part of the
locale definition. 

That is EXACTLY the point Jarkko and I are making. The locale setting
really tells you NOTHING about the encoding. 
So when presented with 

if (-d "\x{20ac}4") ...

how is "locale" supposed to help poor Joe in his en_US.utf8 locale looking
at a sub-dir created by Kurt in de_DE@euro, or was it Karl in de_DE.utf8?
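
A rough sketch of why Joe is stuck, again assuming the Encode module, and
assuming that de_DE@euro implies ISO-8859-15 (as it does on glibc systems)
while the .utf8 locales imply UTF-8. The same two-character name turns into
different octets on disk depending on who created the directory:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $name = "\x{20ac}4";   # EURO SIGN followed by "4"

# Encode the same name the way each locale's charset would store it.
my %hex;
for my $enc ('UTF-8', 'ISO-8859-15') {
    my $octets = encode($enc, $name);
    $hex{$enc} = join ' ',
        map { sprintf('%02X', ord($_)) } split //, $octets;
    printf "%-12s %s\n", $enc, $hex{$enc};
}
```

Karl's directory is the four octets E2 82 AC 34 and Kurt's is the two octets
A4 34, and nothing in the file system records which is which.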

 Before writing that, please read the man page of 'smbmount' and
'mount' if Linux system is available to you. They're not environment
variables.

I think you are on "our" side.

