perl-unicode

Re: Encode's .enc files and a question

2000-10-25 04:20:12
On Tue, 24 Oct 2000, Peter Prymmer wrote:

I am curious about the viability of an EBCDIC based .enc file so
I took the Encode/iso8859-1.enc and came up with one that I
might call Encode/cp1047.enc.  Would this be the correct form/format?
If so I can prepare this and a cp37.enc and a posix-bc.enc file
as well:

# Encoding file: cp1047, single-byte
S
003F 0 1
00
00000001000200030037002D002E002F001600050015000B000C000D000E000F
0010001100120013003C003D0032002600180019003F0027001C001D001E001F
0040005A007F007B005B006C0050007D004D005D005C004E006B0060004B0061
00F000F100F200F300F400F500F600F700F800F9007A005E004C007E006E006F
007C00C100C200C300C400C500C600C700C800C900D100D200D300D400D500D6
00D700D800D900E200E300E400E500E600E700E800E900AD00E000BD005F006D
0079008100820083008400850086008700880089009100920093009400950096
00970098009900A200A300A400A500A600A700A800A900C0004F00D000A10007
0020002100220023002400250006001700280029002A002B002C0009000A001B
00300031001A0033003400350036000800380039003A003B00040014003E00FF
004100AA004A00B1009F00B2006A00B500BB00B4009A008A00B000CA00AF00BC
0090008F00EA00FA00BE00A000B600B3009D00DA009B008B00B700B800B900AB
006400650062006600630067009E006800740071007200730078007500760077
00AC006900ED00EE00EB00EF00EC00BF008000FD00FE00FB00FC00BA00AE0059
004400450042004600430047009C004800540051005200530058005500560057
008C004900CD00CE00CB00CF00CC00E1007000DD00DE00DB00DC008D008E00DF

I didn't read up on the format, but I would guess that this maps from
EBCDIC position to Unicode in this way: take the EBCDIC code point and
treat it as an index into an array of four-character Unicode code points.
In which case, your table looks rather unlikely, since the last line
should then start "0030003100320033" -- that is, F0 .. F9 should map to
U+0030 .. U+0039, the digits.
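
A minimal sketch of that check in Python, assuming the guessed reading of the format is right (each table row holds sixteen four-hex-digit entries, indexed by byte value). The names parse_rows, digits_ok and ROW_F0 are made up for illustration; the row itself is copied from the last line of the table above.

```python
# Sketch only: assumes each row is sixteen 4-hex-digit entries and that the
# entry at index N is the Unicode code point for byte value N.

def parse_rows(rows):
    """Flatten rows of 4-digit hex entries into a list of integers."""
    table = []
    for row in rows:
        row = row.strip()
        table.extend(int(row[i:i + 4], 16) for i in range(0, len(row), 4))
    return table

# Last row of the posted table, i.e. byte values F0 .. FF.
ROW_F0 = "008C004900CD00CE00CB00CF00CC00E1007000DD00DE00DB00DC008D008E00DF"

def digits_ok(row_f0):
    """True if entries F0 .. F9 of the row are U+0030 .. U+0039."""
    entries = parse_rows([row_f0])
    return entries[:10] == list(range(0x30, 0x3A))

# What the row would have to start with under this reading.
expected_row_start = "".join(f"{cp:04X}" for cp in range(0x30, 0x34))

if __name__ == "__main__":
    print("expected start:", expected_row_start)
    print("actual start:  ", ROW_F0[:16])
    print("digits at F0 .. F9:", digits_ok(ROW_F0))
```

The header lines ("S", "003F 0 1", "00") are not interpreted here; only the sixteen data rows matter for this check.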

I don't remember the code points for letters, but I'm fairly sure the
digits fall in the range F0 .. F9 in all flavours of EBCDIC. You have
U+0030 at position 90.
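
As a rough cross-check of the digit positions, here is a small Python sketch using two EBCDIC codecs that ship with Python's standard library, cp037 and cp500. It only covers those two code pages, not every flavour, and it does not assume that a cp1047 codec is available. The names DIGIT_BYTES and digits_for are illustrative.

```python
# Sketch only: decode bytes F0 .. F9 with two EBCDIC codecs from the
# Python standard library and see which characters come out.

DIGIT_BYTES = bytes(range(0xF0, 0xFA))

def digits_for(codec):
    """Decode the bytes F0 .. F9 using the named codec."""
    return DIGIT_BYTES.decode(codec)

results = {codec: digits_for(codec) for codec in ("cp037", "cp500")}

if __name__ == "__main__":
    for codec, text in results.items():
        print(codec, repr(text))
```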

Cheers,
Philip
-- 
Philip Newton <newton(_at_)newton(_dot_)digitalspace(_dot_)net>