Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp> writes:
Dan, in case EBCDIC scares you (and it should :-), a quick intro:
basically, think of the whole low 256 characters as rearranged from
what they are in ASCII. For example, ord("A") is 0xC1, not 0x41. (The
pod/perlebcdic.pod has the full tables.)
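
A small Python sketch of that rearrangement. It assumes code page 037 (Python's `cp037` codec) as a stand-in for whichever EBCDIC code page a given perl is built for; the helper name `ebcdic_ord` is made up for the illustration.

```python
# Assumption: cp037 stands in for the platform's EBCDIC code page.
EBCDIC_CODEC = "cp037"


def ebcdic_ord(ch):
    """Return the byte value a single character has in the assumed code page."""
    encoded = ch.encode(EBCDIC_CODEC)
    if len(encoded) != 1:
        raise ValueError("expected a single-byte character")
    return encoded[0]


if __name__ == "__main__":
    # Compare the ASCII/Latin-1 value with the EBCDIC value for a few characters.
    for ch in ("A", "a", "0", " "):
        print(f"{ch!r}: ASCII 0x{ord(ch):02X}, EBCDIC 0x{ebcdic_ord(ch):02X}")
```

Other EBCDIC code pages place some characters differently, so the values printed here hold only for the assumed one.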
It sure does scare me. I have to confess UTF-EBCDIC was totally out
of my mind. But here I got a hint: like perl used to be, CJK
encodings are very, very ASCII-chauvinistic. Their variable-length
encoding relies heavily on the fact that ASCII leaves the MSB of the
byte unused. That way you can tell whether a given byte is a whole
(half-width) character or half of a full-width character.
That is fine. While they are in the CJK encodings they can stay
ASCII-oid. The problem comes when we convert to perl's internal form.
An ASCII 'A' in Shift-JIS or whatever will still become 0xC1 in
an EBCDIC perl, because that is "defined" to be EBCDIC perl's
view of U+0041.
So if tests convert CJK into "internal" and then just do ord(),
they will fail for the range 0..255. There are some XS functions
to map native<->Unicode numbers.
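
A rough Python model of such a native<->Unicode mapping, to show why a bare ord() comparison breaks below 256. The names `native_to_unicode` and `unicode_to_native` are made up for this sketch and are not perl's XS functions; `cp037` is again only an assumed code page, and passing values above 255 through unchanged is an assumption of the sketch.

```python
# Assumption: cp037 stands in for the EBCDIC code page of the perl in question.
EBCDIC_CODEC = "cp037"

# native byte value -> Unicode code point, for the low 256 values
_NATIVE_TO_UNICODE = {b: ord(bytes([b]).decode(EBCDIC_CODEC)) for b in range(256)}
# Unicode code point -> native byte value (the inverse table)
_UNICODE_TO_NATIVE = {u: b for b, u in _NATIVE_TO_UNICODE.items()}


def native_to_unicode(n):
    """Map a native (EBCDIC) ordinal to a Unicode code point."""
    return _NATIVE_TO_UNICODE.get(n, n)


def unicode_to_native(u):
    """Map a Unicode code point to a native (EBCDIC) ordinal."""
    return _UNICODE_TO_NATIVE.get(u, u)


if __name__ == "__main__":
    # An ASCII 'A' coming out of Shift-JIS is U+0041 ...
    code_point = ord(b"A".decode("shift_jis"))
    # ... but an EBCDIC-native ord() would report this value instead.
    native = unicode_to_native(code_point)
    print(f"U+{code_point:04X} -> native 0x{native:02X}")
    # A test that wants Unicode numbers has to map back before comparing.
    print(f"native 0x{native:02X} -> U+{native_to_unicode(native):04X}")
```

A test written this way compares `native_to_unicode(ord_value)` against the expected code point instead of the raw ordinal.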
The shadow of ASCII falls even on ISO-2022, an escape-based encoding
that is not supposed to be affected by the MSB and such (only \e was
supposed to matter). In ISO-2022, most 2-byte characters are laid out
on either a 96x96 or a 94x94 grid, where each axis is (7-bit ASCII
minus control characters) or (that minus space (0x20) and DEL (\x7F)).
Obviously this will not work on EBCDIC....
Nor should it.
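
To make the 94x94 point concrete, here is a Python sketch that strips the escape sequences from ISO-2022-JP output and looks at what is left. The helper name `iso2022jp_payload` is hypothetical, the sketch assumes every escape sequence the `iso2022_jp` codec emits is three bytes long, and `cp037` is once more only an assumed EBCDIC code page.

```python
ESC = 0x1B


def iso2022jp_payload(text):
    """Encode text as ISO-2022-JP and return the bytes minus escape sequences.

    Assumption: each escape sequence is three bytes (ESC $ B, ESC ( B, ...).
    """
    data = text.encode("iso2022_jp")
    payload = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESC:
            i += 3  # skip the three-byte designation sequence
        else:
            payload.append(data[i])
            i += 1
    return bytes(payload)


if __name__ == "__main__":
    payload = iso2022jp_payload("漢字")
    # Every byte of a 94x94 character sits in the ASCII graphic range.
    print([hex(b) for b in payload])
    print(all(0x21 <= b <= 0x7E for b in payload))
    # Read as ASCII the bytes are printable; read as EBCDIC they are not.
    print(payload.decode("ascii").isprintable())
    print(payload.decode("cp037").isprintable())
```

The byte values only form a tidy printable grid when they are read as ASCII, which is the dependence being pointed out here.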
This one may be tougher than we think....
FYI, I know that something called 12-bit EBCDIC kanji also exists. I
know only of its existence, but is it on our support list?
If OS390 (or ICU given its history) has tables we can probably support
them.
The test logs are attached: I would really appreciate it if you could
see some pattern in the failures.
I will do the best I can but I will be away for this weekend and I
won't be back online till Sunday at least.
--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen
Dan the Unstable according to Jack Cohen
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/