perl-unicode

Re: Encoding iso-8859-16

2005-08-19 04:31:18
Hi Nicholas

With reference to my previous mail about the Encode module and this test script:

use Encode;
$string = "a";
$enc_string = encode("iso-8859-16", $string);
print "\n String: $string\n";
print "\n enc_string: $enc_string\n";

a) How different are the ext/Encode/def_t.c and
ext/Encode/Byte/byte_t.c files on EBCDIC and ASCII platforms?
b) Why is it that, when I copied the above .c files from an ASCII platform
to EBCDIC, it worked for any codepage except the IBM-1047 codepage on the
EBCDIC platform?

I stepped through the code and saw that in encengine.c the value of e->seq
differs between the two platforms. I guess that the structure is not
set up properly. Please share any thoughts you have!
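
To narrow down whether the generated tables are wrong for one character or for many, a small check over several characters may help. This is only a sketch: the expected values rely on the fact that the 0x00-0x7F half of ISO-8859-16 is identical to ASCII, and the names %expected and $failures are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Expected ISO-8859-16 byte values for a few characters. The lower half of
# ISO-8859-16 matches ASCII, so these should hold on any platform.
my %expected = (
    'a' => 97,
    'z' => 122,
    'A' => 65,
    '0' => 48,
    ' ' => 32,
);

my $failures = 0;
for my $char (sort keys %expected) {
    # The literal is in the platform's native encoding; encode() should
    # map it to the ISO-8859-16 byte regardless of platform.
    my $got = ord(encode("iso-8859-16", $char));
    my $ok  = $got == $expected{$char};
    $failures++ unless $ok;
    printf "'%s' expected %3d got %3d %s\n",
        $char, $expected{$char}, $got, $ok ? "ok" : "MISMATCH";
}

print "mismatches: $failures\n";
```

If every line reports a mismatch, the whole table is probably built for the wrong platform; if only some do, the problem is more likely in individual entries.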

-Sastry

On 8/9/05, Sastry <ravisastryk(_at_)gmail(_dot_)com> wrote:
Hi Nicholas Clark
I agree that it is supposed to print the numerical equivalent 97.

I tried to see whether there is a bug in the Encode module.
Surprisingly, I noticed that two .c files, ext/Encode/def_t.c and
ext/Encode/Byte/byte_t.c, which are generated by enc2xs, differ
between the EBCDIC platform and an ASCII platform such as Linux.
I replaced those files on the EBCDIC platform with the ones from Linux,
which gave the expected result '97'.
Please let me know whether those .c files should be the same on both platforms!

-Sastry



On 8/9/05, Nicholas Clark <nick(_at_)ccl4(_dot_)org> wrote:
On Tue, Aug 09, 2005 at 10:58:48AM +0530, Sastry wrote:
Hi

I get 73 printed on the EBCDIC platform. I think it is supposed to print
129, as that is the numeric equivalent of 'a'.

-Sastry



On 8/8/05, Nicholas Clark <nick(_at_)ccl4(_dot_)org> wrote:

On your EBCDIC platform, what does this give?

It prints 73
use Encode;
$string = "a";
$enc_string = encode("iso-8859-16", $string);

print ord ($enc_string), "\n";

73. Odd.

It should print 97 on all platforms, because:

$string contains 1 byte, the byte that represents 'a' in the platform's
default character encoding.

The encode call should convert from the default encoding to iso-8859-16,
and 'a' in iso-8859-16 is 97.
Everywhere.

So $enc_string should be a single byte, 97, everywhere.
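
A minimal sketch of that expectation as a self-checking script follows. It assumes only the documented encode and decode functions of Encode; the variable $round_trip is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# 'a' is written as a literal, so it is stored in the platform's
# native encoding.
my $string = "a";

# Convert to ISO-8859-16; the result is a byte string.
my $enc_string = encode("iso-8859-16", $string);

printf "length: %d\n", length($enc_string);   # should be 1
printf "byte:   %d\n", ord($enc_string);      # should be 97 everywhere

# Decoding the bytes again should give back the original character.
my $round_trip = decode("iso-8859-16", $enc_string);
print "round trip ", ($round_trip eq $string ? "ok" : "FAILED"), "\n";
```

Printing the byte with ord rather than printing $enc_string itself avoids the terminal reinterpreting the byte in the platform's own codepage.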

Nicholas Clark


