perl-unicode

Re: bareword test on ebcdic.

2005-07-26 08:48:23


--- Nicholas Clark <nick(_at_)ccl4(_dot_)org> wrote:

On Tue, Jul 26, 2005 at 08:12:16AM -0700, rajarshi das wrote:

I basically want to know if there are alternate ways of representing
barewords (as I mentioned in question 2 above)?

No. By definition there can not be.
You're failing to grasp what is meant by "bareword".
There is only one representation.

Also, any pointers that you have regarding where to look to fix this?

Not much better than "in toke.c or utf8.c"

However, based on a comment I've spotted at the top of utfebcdic.h, I
*think* that the internal encoding of perl on an EBCDIC system is
UTF-EBCDIC rather than UTF-8. The byte sequence in the source file for
the bareword will need to be valid UTF-EBCDIC.

For the code points being tested ("\x{0442}\x{0435}\x{0441}\x{0442}"),
does the perl source file contain the correct byte sequence in UTF-EBCDIC?

Yes, it does. I ran this test:

    if ($hash{"\x{0442}\x{0435}\x{0441}\x{0442}"} eq
        $hash{eval '"\x{0442}\x{0435}\x{0441}\x{0442}"'}) {
        print "ok\n";
    }

and it ran fine, if that is what you mean by the source file containing
the correct byte sequence. Or am I mistaken?
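
One way to answer the byte-sequence question directly is to dump the bytes. The sketch below is an illustration only, not code from this thread. It assumes that utf8::encode yields the platform's internal encoding (UTF-8 on ASCII platforms, UTF-EBCDIC on EBCDIC ones), and the variable names are hypothetical. Note that the comparison quoted above spells the key with \x{...} escapes on both sides, so on its own it does not show which bytes the source file holds for a literal bareword.

```perl
use strict;
use warnings;

# The four code points under test, written as escapes so that this
# script does not depend on how its own source file is encoded.
my $key = "\x{0442}\x{0435}\x{0441}\x{0442}";

# Copy the string and convert the copy, in place, to the octets perl
# uses internally (assumed: UTF-8 on ASCII, UTF-EBCDIC on EBCDIC).
my $bytes = $key;
utf8::encode($bytes);

# Render the octets as hex for comparison with a hex dump of the test file.
my $hex = join ' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes;
print "$hex\n";
```

On an ASCII platform this prints D1 82 D0 B5 D1 81 D1 82, the UTF-8 encoding of the four code points. The output on an EBCDIC platform can then be compared with the bytes that actually appear at the bareword in the test file.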


Does the byte sequence in UTF-EBCDIC for those 4 code points differ from
the byte sequence in UTF-8?

Yes, the byte sequence for the 4 code points in UTF-EBCDIC differs from
the sequence in UTF-8.

Does the source file happen to have the UTF-8 byte sequence?

It has the UTF-EBCDIC byte sequence on the EBCDIC platform.

If so, *that* would explain the failures, and be the thing that needs
correcting. The test file would need if/else with a different test on
EBCDIC.

What would you suggest be put in the if/else?
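
As an illustration of what such an if/else could look like (a sketch under stated assumptions, not the fix adopted for the actual test file): the ord("A") == 193 check is a common idiom for detecting EBCDIC, the hard-coded bytes in the else branch are the UTF-8 encoding of the four code points, and the EBCDIC branch derives its bytes with utf8::encode on the assumption that this yields UTF-EBCDIC there. The names $is_ebcdic, $word_bytes and $got are hypothetical.

```perl
use strict;
use warnings;

# ord("A") is 65 on ASCII-based platforms and 193 on EBCDIC ones.
my $is_ebcdic = ord("A") == 193;

# Hash key built from escapes, independent of any source-file bytes.
my $key  = "\x{0442}\x{0435}\x{0441}\x{0442}";
my %hash = ($key => 123);

my $word_bytes;
if ($is_ebcdic) {
    # The UTF-EBCDIC octets depend on the code page, so derive them
    # from the code points instead of hard-coding them (assumption:
    # utf8::encode produces UTF-EBCDIC on this platform).
    $word_bytes = $key;
    utf8::encode($word_bytes);
}
else {
    # UTF-8 octets for U+0442 U+0435 U+0441 U+0442.
    $word_bytes = "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82";
}

# Compile a snippet in which those octets appear as a bareword hash
# subscript; "use utf8" tells perl the snippet's source is encoded.
my $got = eval "use utf8; \$hash{$word_bytes}";
print +(defined $got && $got == 123) ? "ok\n" : "not ok\n";
```

The point of the design is that the lookup key is always built from code points, while the bareword is presented to the parser as raw bytes, so a mismatch between the two shows up as "not ok" instead of passing silently.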

Nicholas Clark

Thanks,
Rajarshi.



