perl-unicode

Re: Workaround to a unicode bug needed

2010-09-06 15:02:56
Dear Enrique,

This does not work either. Here are the results I get with the input:
»Tjuvgömmare!» säga skatorna och se ut som samvetet självt. »Vi äro 

1/ What the correct output should be (I get this output when I remove the û 
character from the tr// list):

Tjuvgömmare
!
säga
skatorna
och
se
ut
som
samvetet
självt
.
Vi
äro

2/ The output with "use utf8;" only (the accented characters are not 
recognized, and the quotes are transformed into non-UTF-8 characters that 
correspond to <C2> in Latin 1. The terminal can't interpret them and displays 
a question mark instead):
?
Tjuvg
mmare
!
?
s
ga
skatorna
och
se
ut
som
samvetet
sj
lvt
.
?
Vi
ro
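
One possible explanation for this output, sketched under assumptions: the 
script itself is not shown in this thread, so suppose it reads extrait.txt 
without any decoding layer. With "use utf8;" the letters in the tr// list 
are characters, while the input is still a string of UTF-8 bytes, so ö 
arrives as the two bytes <C3><B6> and never matches the character ö. The 
helper has_o_umlaut below is hypothetical and only illustrates the mismatch:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: does the string contain the character ö (U+00F6)?
# \x{F6} is what a literal ö in the source means under "use utf8;".
sub has_o_umlaut {
    my ($s) = @_;
    return $s =~ /\x{F6}/ ? 1 : 0;
}

my $bytes = "Tjuvg\xC3\xB6mmare";       # as read from a file with no input layer
my $chars = decode('UTF-8', $bytes);    # the same text, decoded to characters

printf "raw bytes: length %d, contains U+00F6: %d\n",
    length($bytes), has_o_umlaut($bytes);   # length 12, match 0
printf "decoded:   length %d, contains U+00F6: %d\n",
    length($chars), has_o_umlaut($chars);   # length 11, match 1
```

If this is what happens, the word is cut at the two unmatched bytes, which 
would give "Tjuvg" and "mmare" as above.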

3/ With 
use utf8;
binmode(STDOUT, ':utf8');
I get the following (this time the terminal can display the <C2> as Â, which 
is still not correct, and the accented characters are still stripped):
Â
Tjuvg
mmare
!
Â
s
ga
skatorna
och
se
ut
som
samvetet
sj
lvt
.
Â
Vi
ro
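
The Â is consistent with the stray <C2> byte being treated as the character 
U+00C2 and then encoded by the :utf8 output layer. A minimal sketch, assuming 
the quote » reached the script as its two undecoded UTF-8 bytes <C2><BB> 
(bytes_written is a hypothetical helper, not part of the original script):

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: hex dump of the bytes a UTF-8 output layer
# would write for the given string.
sub bytes_written {
    my ($str) = @_;
    my $out = encode('UTF-8', $str);
    return join ' ', map { sprintf '%02X', ord } split //, $out;
}

my $raw_quote = "\xC2\xBB";                  # » as undecoded UTF-8 bytes
my $lead_byte = substr($raw_quote, 0, 1);    # the stray <C2> on its own

printf "%s\n", bytes_written($lead_byte);                    # C3 82, UTF-8 for Â
printf "%s\n", bytes_written(decode('UTF-8', $raw_quote));   # C2 BB, UTF-8 for »
```

Decoding the input first is what keeps the quote a single character, so the 
output layer writes it back unchanged.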

4/ With binmode(STDOUT, ':utf8') only (the output is then a mix of quotes 
wrongly coded in Latin 1 or Latin 9, which the terminal displays, and accented 
characters whose UTF-8 bytes are interpreted as Latin 1 or Latin 9 
characters):

»Tjuvgömmare
!
»
säga
skatorna
och
se
ut
som
samvetet
självt
.
»Vi
äro
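
All four cases have one thing in common: nothing decodes the input. A minimal 
sketch of a tokenizer that decodes on input and encodes on output follows. It 
is not the original script (which uses tr// and is not shown here); the 
tokenize sub, its \p{L} pattern, and its punctuation list are assumptions made 
for illustration. For a real file, the decoding step would be an input layer 
such as open(my $fh, '<:encoding(UTF-8)', $path).

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical tokenizer: a token is a run of Unicode letters or one
# punctuation mark from a small assumed list; everything else is dropped.
sub tokenize {
    my ($text) = @_;
    return $text =~ /(\p{L}+|[.,;:!?])/g;
}

binmode(STDOUT, ':encoding(UTF-8)');    # encode characters on output

# The sample input as raw UTF-8 bytes, decoded before any matching.
my $bytes = "\xC2\xBBTjuvg\xC3\xB6mmare!\xC2\xBB s\xC3\xA4ga skatorna och se "
          . "ut som samvetet sj\xC3\xA4lvt. \xC2\xBBVi \xC3\xA4ro";
my $line  = decode('UTF-8', $bytes);

print "$_\n" for tokenize($line);
```

On the sample line this should give the thirteen tokens listed under 1/, from 
Tjuvgömmare to äro.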


Pierre
On 6 Sept. 2010, at 20:55, Enrique Nell wrote:

Hi

Have you tried adding binmode(STDOUT, ':utf8'); to your code?


Enrique


On Sep 6, 2010, at 8:08 PM, Pierre Nugues wrote:

Dear Karl,

Thank you for your help. I added the 'use utf8;' statement. However, this 
does not improve the output on my machine. I am using a Mac (10.6.4) with 
Perl 5.8.8 and 5.12.1. I have pasted the output I got below. You can see 
that the accented characters are not processed and the quote is wrongly 
encoded by Perl.

Can you tell me your Perl version and, if you are using Linux or a Mac, your 
locale? You can get it with the locale command.
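
The same details can also be collected from within Perl. The sketch below 
prints the interpreter version and the locale-related environment variables; 
locale_env is a hypothetical helper, and the variable names it looks for 
(LANG, LANGUAGE, LC_*) are the usual ones on Linux and Mac OS X, assumed here 
rather than taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the locale-related entries out of an
# environment hash and return them as sorted NAME=value strings.
sub locale_env {
    my (%env) = @_;
    my @names = grep { /^(?:LANG|LANGUAGE|LC_\w+)$/ } keys %env;
    return map { $_ . '=' . $env{$_} } sort @names;
}

printf "perl %vd on %s\n", $^V, $^O;    # interpreter version and OS name
print "$_\n" for locale_env(%ENV);      # e.g. LANG and LC_ALL, if set
```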

Thank you for your help!
Pierre

----
pierre:ch04 pierre$ perl token_perl_sav.pl extrait.txt 
?
Tjuvg
mmare
!
?
s
ga
skatorna
och
se
ut
som
samvetet
sj
lvt
.
?
Vi
ro
polisbetj
nter
,
vi
.
Hit
med
tjuvgodset
!
?
?
,
tyst
,
era
rackare
!
Jag
r
g
rdsfogden
.
?
?
Just
den
r
tta
!
?
h
na
de
.
---