perl-unicode

RE: Encode vs encoding

2004-09-12 01:30:05
Thank you, that was very helpful :)

-----Original Message-----
From: Bjoern Hoehrmann [mailto:derhoermi(_at_)gmx(_dot_)net]
Sent: Sunday, September 12, 2004 10:09 AM
To: Burak Gürsoy
Cc: perl-unicode
Subject: Re: Encode vs encoding


* Burak Gürsoy wrote:
Can someone *please* explain to me the difference (apart from
scope) between encoding and Encode::encode()?

encoding.pm is about how Perl should interpret your source file;
Encode.pm is about character encoding operations you may wish to
perform.

#!/usr/bin/perl -w
use strict;
my $char = "\xFE";
print ord $char; # prints 254

Perl assumes that $char is ISO-8859-1 encoded, so "\xFE" is U+00FE.

#!/usr/bin/perl -w
use strict;
use Encode;
my $char = "\xFE";
$char = encode 'ISO-8859-9', $char;
print ord $char; # prints 63

As above, but U+00FE is not available in ISO-8859-9
and is thus replaced by a question mark (code 63).
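
For contrast, here is a small sketch of the same call with one
character that ISO-8859-9 lacks and one that it has. The variable
names are mine, not from the original message, and it assumes a
stock Encode installation.

```perl
#!/usr/bin/perl -w
use strict;
use Encode qw(encode);

# Illustrative names only (not from the original message).
my $thorn     = "\x{FE}";   # U+00FE, not present in ISO-8859-9
my $s_cedilla = "\x{15F}";  # U+015F, stored as octet 0xFE in ISO-8859-9

# With no CHECK argument, unmappable characters are substituted.
my $substituted = encode('ISO-8859-9', $thorn);
my $mapped      = encode('ISO-8859-9', $s_cedilla);

print ord($substituted), "\n";  # prints 63 (the question mark)
print ord($mapped), "\n";       # prints 254
```

So the 63 says nothing about the octet you started with; it only
means the character had no counterpart in the target encoding.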

#!/usr/bin/perl -w
use strict;
use Encode;
my $char = "\xFE";
$char = encode 'ISO-8859-9', $char, Encode::FB_CROAK();
# dies with: "\x{00fe}" does not map to iso-8859-9
print ord $char;

As above, except that it does not replace the
offending character but croaks instead.
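
If you would rather handle the failure than let the script die, one
option is to wrap the call in eval. This is only a sketch: $octets
and $err are names I made up, and LEAVE_SRC is added so that Encode
does not modify $char while checking.

```perl
#!/usr/bin/perl -w
use strict;
use Encode qw(encode);

my $char = "\xFE";   # U+00FE, which ISO-8859-9 cannot represent

# FB_CROAK makes encode die on an unmappable character;
# LEAVE_SRC keeps $char itself untouched.
my $octets = eval {
    encode('ISO-8859-9', $char, Encode::FB_CROAK() | Encode::LEAVE_SRC());
};
my $err = $@;        # save the message before anything can reset it

if (defined $octets) {
    print "encoded as octet ", ord($octets), "\n";
} else {
    print "encode failed: $err";
}
```

Here the else branch runs, and $err holds the "does not map to"
message quoted in the snippet above.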

#!/usr/bin/perl -w
use strict;
use encoding 'ISO-8859-9';
my $char = "\xFE";
print ord $char; # prints 351

You've told Perl to treat the source as ISO-8859-9
encoded, which affects how string literals such as
your $char are interpreted: octet 0xFE becomes U+015F (351).

How can I get that "351" with Encode.pm?

You need to decode the binary string into a character
string, for example with the Encode::decode routine:

  perl -MEncode -e "print ord decode 'iso-8859-9'=>qq(\xFE)"
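
The one-liner above can also be written as a script. This is a
minimal sketch with variable names of my own choosing; it also
encodes the result again to show that the two calls are inverses
for this character.

```perl
#!/usr/bin/perl -w
use strict;
use Encode qw(decode encode);

# Raw octet 0xFE, as it would appear in ISO-8859-9 encoded input.
my $bytes = "\xFE";

# Interpret the octet as ISO-8859-9; the result is a character string.
my $char = decode('ISO-8859-9', $bytes);
my $code = ord $char;

# Encoding the character again gives back the original octet.
my $octets = encode('ISO-8859-9', $char);

print $code, "\n";          # prints 351
print ord($octets), "\n";   # prints 254
```

Unlike the encoding pragma, this affects only the strings you pass
to decode, not every literal in the file.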

Note: If I use the letter version (small s with a dot under it "ş")
instead of the "\x" escape, I get the same results...

For the same reasons.
