Dear Enrique,
This does not work either. Here are the results I get with the input:
»Tjuvgömmare!» säga skatorna och se ut som samvetet självt. »Vi äro
1/ What the correct output should be (I get this output when I remove the û
character from the tr// list):
Tjuvgömmare
!
säga
skatorna
och
se
ut
som
samvetet
självt
.
Vi
äro
2/ The output with "use utf8;" (the accented characters are not recognized, and
the quotes are transformed into non-UTF-8 bytes that correspond to <C2> in
Latin 1. The terminal can't interpret them and displays a question mark
instead):
?
Tjuvg
mmare
!
?
s
ga
skatorna
och
se
ut
som
samvetet
sj
lvt
.
?
Vi
ro
3/ With
use utf8;
binmode(STDOUT, ':utf8');
I get the following (this time, the terminal displays the <C2> as Â, which is
not correct, and the accented characters are still stripped):
Â
Tjuvg
mmare
!
Â
s
ga
skatorna
och
se
ut
som
samvetet
sj
lvt
.
Â
Vi
ro
4/ With binmode(STDOUT, ':utf8') only (this time there is a mix of wrongly
encoded quotes, which the terminal displays as Latin 1 or Latin 9 characters,
and accented characters whose UTF-8 bytes are likewise shown as Latin 1 or
Latin 9 characters):
»Tjuvgömmare
!
»
säga
skatorna
och
se
ut
som
samvetet
självt
.
»Vi
äro
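
For reference, here is a minimal sketch of what seems to separate the cases
above: whether the input bytes are decoded into characters before tokenizing,
and whether the output is encoded exactly once. It rests on several
assumptions: the input is UTF-8, the tokenize function is a hypothetical
stand-in (a regex, not the tr// list used in token_perl_sav.pl), and unlike
the output in 1/ it keeps the » quotes as tokens.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for the tokenizer: a token is either a run of
# letters or a single character that is neither a letter nor whitespace.
sub tokenize {
    my ($text) = @_;
    my @tokens = $text =~ /(\p{L}+|[^\p{L}\s])/g;
    return @tokens;
}

# The raw UTF-8 bytes of "»Tjuvgömmare!» säga", as they arrive from a
# file handle that has no decoding layer.
my $bytes = "\xC2\xBBTjuvg\xC3\xB6mmare!\xC2\xBB s\xC3\xA4ga";

# Decode first: each accented letter becomes one character, so the
# words are not split in the middle.
my @decoded_tokens = tokenize(decode('UTF-8', $bytes));

# Encoding bytes that were never decoded treats each byte as a Latin 1
# character, so the two bytes of » (C2 BB) become four (C3 82 C2 BB).
my $double_encoded = encode('UTF-8', "\xC2\xBB");

# Encode exactly once, on the way out.
binmode(STDOUT, ':encoding(UTF-8)');
print "$_\n" for @decoded_tokens;
```

In a script that reads a file, the decoding step would more likely be an
':encoding(UTF-8)' layer on the input handle rather than an explicit call to
decode.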
Pierre
On 6 Sept. 2010, at 20:55, Enrique Nell wrote:
Hi
Have you tried adding binmode(STDOUT, ':utf8'); to your code?
Enrique
On Sep 6, 2010, at 8:08 PM, Pierre Nugues wrote:
Dear Karl,
Thank you for your help. I added the 'use utf8;' statement. However, this
does not improve the output on my machine. I am using a Mac (10.6.4) with
Perl 5.8.8 and 5.12.1. I have pasted the output I got below. You can see that
the accented characters are not processed and the quote is wrongly encoded
by Perl.
Can you tell me which Perl version you are using and, if you are on Linux or a
Mac, what your locale is? You can get it with the locale command.
Thank you for your help!
Pierre
----
pierre:ch04 pierre$ perl token_perl_sav.pl extrait.txt
?
Tjuvg
mmare
!
?
s
ga
skatorna
och
se
ut
som
samvetet
sj
lvt
.
?
Vi
ro
polisbetj
nter
,
vi
.
Hit
med
tjuvgodset
!
?
?
,
tyst
,
era
rackare
!
Jag
r
g
rdsfogden
.
?
?
Just
den
r
tta
!
?
h
na
de
.
---