perl-unicode

Re: Use of encoding/decoding and 3-param open

2007-11-13 11:55:07
Name is \x{c384}stan\x{c3a5} Bruk AB\n   UTF String:Name is ì

U+C384 and U+C3A5 are most definitely not what you're after.

The Unicode codepoint U+00C4 (LATIN CAPITAL LETTER A WITH DIAERESIS) is
the two bytes C3 and 84 when encoded as UTF-8, but you should never need
to enter those bytes manually. \x in Perl takes codepoint numbers, and
C384 is not the codepoint of the character that you want.

Likewise, the codepoint U+00E5 (LATIN SMALL LETTER A WITH RING ABOVE) is
not at all like U+C3A5, even though its UTF-8 encoding is C3 A5.

Please do yourself a big favor and learn about the difference between
Unicode and UTF-8.
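
To make the distinction concrete, here is a minimal sketch using the core Encode module. The variable names are only illustrative; the point is that the codepoint is a number naming a character, while UTF-8 is one way of turning that character into bytes.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A codepoint is a number identifying a character; \x{...} takes that number.
my $char = "\x{C4}";    # U+00C4 LATIN CAPITAL LETTER A WITH DIAERESIS

# Encoding turns the character into bytes for storage or transport.
my $bytes = encode('UTF-8', $char);

my $char_len = length($char);     # length in characters
my $byte_len = length($bytes);    # length in bytes
my $hex      = join(' ', map { sprintf('%02X', ord($_)) } split(//, $bytes));

# Decoding goes the other way: bytes in, character out.
my $roundtrip = decode('UTF-8', $bytes);

printf("U+%04X is %d character, %d bytes in UTF-8: %s\n",
    ord($char), $char_len, $byte_len, $hex);
```

The C3 and 84 only show up after encoding. They are never what you write in the string literal.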

(real name in string is 'Östanå Bruk AB'  (Swedish...)
What am I doing wrong?

Well, "use encoding" at least. Remove it. You don't need it, and it's
broken anyway.

use FileHandle;

FileHandle is superseded by IO::Handle.

my $utfstring = "Name is \x{c384}stan\x{c3a5} Bruk AB\n";
print 'Name is \x{c384}stan\x{c3a5} Bruk AB\n' . "   UTF String:$utfstring";

Just wrong. These numbers are not the right codepoints, and the result
is a Unicode string, not a utf8 string, so the name "utfstring" is bad
too.
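
A small sketch of what that literal really contains, assuming only the core Encode module. The names $wrong and $right are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# What the string literal above actually holds: ONE character, U+C384.
my $wrong = "\x{c384}";
# What was intended: the single character U+00C4.
my $right = "\x{c4}";

# Hex dump of the UTF-8 encoding of each character.
my $wrong_hex = uc(unpack('H*', encode('UTF-8', $wrong)));
my $right_hex = uc(unpack('H*', encode('UTF-8', $right)));

printf("\\x{c384}: codepoint U+%04X, UTF-8 bytes %s\n", ord($wrong), $wrong_hex);
printf("\\x{c4}:   codepoint U+%04X, UTF-8 bytes %s\n", ord($right), $right_hex);
```

U+C384 encodes to the three bytes EC 8E 84, while U+00C4 encodes to C3 84. A terminal that shows the byte EC as Latin-1 would possibly display it as "ì", which may be what happened in the output quoted at the top.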

my $namedstring = "Name is \N{LATIN CAPITAL LETTER A WITH DIAERESIS}stan\N{LATIN SMALL LETTER A WITH RING ABOVE} Bruk AB\n";

This results in a proper Unicode string.
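
As a quick check, the named form and the numeric form with the right codepoints give the same string. This is only a sketch; $named and $hexed are illustrative names, and the explicit "use charnames" line is there because \N{...} needs it on perls older than 5.16.

```perl
use strict;
use warnings;
use charnames ':full';    # required for \N{...} before Perl 5.16, harmless later

# Same text written two ways: by character name and by codepoint number.
my $named = "\N{LATIN CAPITAL LETTER A WITH DIAERESIS}stan\N{LATIN SMALL LETTER A WITH RING ABOVE} Bruk AB";
my $hexed = "\x{C4}stan\x{E5} Bruk AB";

my $same  = ($named eq $hexed) ? 1 : 0;
my $chars = length($named);    # counts characters, not bytes

print "same string: ", ($same ? "yes" : "no"), ", $chars characters\n";
```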

print 'Name is \N{LATIN CAPITAL LETTER A WITH DIAERESIS}stan\N{LATIN SMALL LETTER A WITH RING ABOVE} Bruk AB' . "  Named String: $namedstring";

You shouldn't print text without specifying the output encoding. Match
the encoding of your terminal for correct display, for example:

    binmode STDOUT, ":encoding(UTF-8)";

$rv = open (OUT1, ">", "sample1");

Encodingless open is not suited for text output.

$rv = open (OUT2, ">:utf8", "sample2");

Should work well. Remember that you shouldn't use :utf8 for input,
because it doesn't check that the data is valid UTF-8. In the general
case, :encoding(UTF-8) is safest.
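
To see the difference on disk, here is a rough sketch that writes the same string with and without an encoding layer and then reads the raw bytes back. The helper bytes_written() and the temporary directory are assumptions for the example, not part of the original script.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write $text to $dir/$name using the given open mode, then return the
# raw bytes that ended up in the file.
sub bytes_written {
    my ($dir, $mode, $name, $text) = @_;
    my $path = "$dir/$name";

    open(my $out, $mode, $path) or die "Cannot write $path: $!";
    print {$out} $text;
    close($out) or die "Cannot close $path: $!";

    open(my $in, '<:raw', $path) or die "Cannot read $path: $!";
    my $raw = do { local $/; <$in> };    # slurp the whole file as bytes
    close($in);
    return $raw;
}

my $text = "\x{C4}stan\x{E5} Bruk AB";    # 14 characters
my $dir  = tempdir(CLEANUP => 1);

my $plain = bytes_written($dir, '>',                 'sample1', $text);
my $utf8  = bytes_written($dir, '>:encoding(UTF-8)', 'sample2', $text);

printf("no layer:         %d bytes\n", length($plain));
printf(":encoding(UTF-8): %d bytes\n", length($utf8));
```

Without a layer, these particular characters happen to come out as single Latin-1 bytes, so the file is not UTF-8 at all. With the layer, each of the two non-ASCII characters becomes two bytes.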

$rv = open (OUT3, '>:encoding(iso-8859-1)', 'sample3');

Good.

print OUT3 $namedstring;

Also good. Does this not work as expected?
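
If you want to verify it yourself, something along these lines should do. It is a sketch that uses a temporary file in place of sample3 and lexical filehandles in place of OUT3; both are my substitutions.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $namedstring = "Name is \x{C4}stan\x{E5} Bruk AB\n";
my $path = tempdir(CLEANUP => 1) . '/sample3';

# Write the Unicode string, letting the layer encode it as ISO-8859-1.
open(my $out, '>:encoding(iso-8859-1)', $path) or die "Cannot write $path: $!";
print {$out} $namedstring;
close($out) or die "Cannot close $path: $!";

# Read the raw bytes: each character should occupy exactly one byte.
open(my $raw_in, '<:raw', $path) or die "Cannot read $path: $!";
my $raw = do { local $/; <$raw_in> };
close($raw_in);

# Read again with the matching layer to get the Unicode string back.
open(my $in, '<:encoding(iso-8859-1)', $path) or die "Cannot read $path: $!";
my $decoded = do { local $/; <$in> };
close($in);

printf("%d characters written as %d bytes\n", length($namedstring), length($raw));
```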
-- 
Met vriendelijke groet,  Kind regards,  Korajn salutojn,

  Juerd Waalboer:  Perl hacker  <#####(_at_)juerd(_dot_)nl>  <http://juerd.nl/sig>
  Convolution:     ICT solutions and consultancy <sales(_at_)convolution(_dot_)nl>
