Thank you, that was very helpful :)
-----Original Message-----
From: Bjoern Hoehrmann [mailto:derhoermi(_at_)gmx(_dot_)net]
Sent: Sunday, September 12, 2004 10:09 AM
To: Burak Gürsoy
Cc: perl-unicode
Subject: Re: Encode vs encoding
* Burak Gürsoy wrote:
Can someone *please* explain to me the
difference (apart from scope) between encoding and Encode::encode()?
encoding.pm is about how Perl should interpret your source file,
Encode.pm is about character encoding operations you may wish to
perform.
#!/usr/bin/perl -w
use strict;
my $char = "\xFE";
print ord $char; # prints 254
Without the encoding pragma, Perl assumes that $char is ISO-8859-1
encoded, so "\xFE" is the character U+00FE.
#!/usr/bin/perl -w
use strict;
use Encode;
my $char = "\xFE";
$char = encode 'ISO-8859-9', $char;
print ord $char; # prints 63
As above, U+00FE is not available in ISO-8859-9
and thus replaced by a question mark.
#!/usr/bin/perl -w
use strict;
use Encode;
my $char = "\xFE";
$char = encode 'ISO-8859-9', $char, Encode::FB_CROAK();
# dies with: "\x{00fe}" does not map to iso-8859-9
print ord $char;
As above, just that it does not replace the
offending character but croaks instead.
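If you would rather handle the failure than die, one option is to wrap the call in eval. This is only a minimal sketch: try_encode is a hypothetical helper name, not part of Encode. It works on a private copy of the string because encode() may modify its source argument when a CHECK value is given.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: returns the encoded bytes, or undef if some
# character has no mapping in the target encoding.
sub try_encode {
    my ($encoding, $string) = @_;    # $string is a private copy
    my $bytes = eval { encode($encoding, $string, Encode::FB_CROAK()) };
    return $bytes;                   # undef if encode() croaked
}

# U+00FE has no mapping in ISO-8859-9, so this prints "cannot encode".
my $bytes = try_encode('ISO-8859-9', "\xFE");
print defined $bytes ? "encoded\n" : "cannot encode\n";

# U+015F does have a mapping: it becomes the single byte 0xFE.
$bytes = try_encode('ISO-8859-9', "\x{15F}");
printf "0x%02X\n", ord $bytes;       # prints 0xFE
```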
#!/usr/bin/perl -w
use strict;
use encoding 'ISO-8859-9';
my $char = "\xFE";
print ord $char; # prints 351
You've told Perl to treat the source as ISO-8859-9 encoded,
and that covers string literals such as your $char: the byte
0xFE is read as the character U+015F (351).
How can I get that "351" with Encode.pm?
You need to decode the binary string into a character
string, using e.g. the Encode::decode routine:
perl -MEncode -e "print ord decode 'iso-8859-9'=>qq(\xFE)"
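The same thing as a small script, in case the one-liner's quoting gets in the way in your shell. A rough sketch only: code_point_of is a made-up helper name, and the ISO-8859-1 call is there just for comparison with the first example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode a byte string with the given encoding
# and return the code point of its first character.
sub code_point_of {
    my ($encoding, $bytes) = @_;
    return ord decode($encoding, $bytes);
}

print code_point_of('ISO-8859-9', "\xFE"), "\n";   # prints 351 (U+015F)
print code_point_of('ISO-8859-1', "\xFE"), "\n";   # prints 254 (U+00FE)
```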
Note: If I use the letter version (small s with a dot under it "ş")
instead of the "\x" escape, I get the same results...
For the same reasons.
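To see why the letter and the escape behave alike, it may help to look at the byte an ISO-8859-9 editor writes for that letter. The sketch below assumes the letter is U+015F and builds that byte with encode() instead of reading a real file; the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

my $letter   = "\x{15F}";                      # small s with cedilla
my $as_saved = encode('ISO-8859-9', $letter);  # what ends up in the file

# The file holds the single byte 0xFE, exactly what "\xFE" denotes.
printf "stored byte: 0x%02X\n", ord $as_saved;                   # 0xFE

# Read without decoding, that byte is treated as U+00FE.
print "undecoded: ", ord($as_saved), "\n";                       # 254

# Decoded as ISO-8859-9, it is the original letter again.
print "decoded:   ", ord(decode('ISO-8859-9', $as_saved)), "\n"; # 351
```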