perl-unicode

Re: [Encode] euc-jp vs euc-jisx0213

2002-04-30 00:29:39
On Monday, April 29, 2002, at 07:38 , SADAHIRO Tomoyuki wrote:
I doubt whether users of 'euc-jp' will
assume it to be a combination with JIS X 0213.

They don't have to, because 'euc-jp' behaves exactly the same as before so long as the text stays within ASCII and JIS X 0201/0208/0212.

Such a mixing would prevent warning/croaking
for appearance of code points that are not defined
originally (meaning w/o X 0213), wouldn't it?

That was my biggest concern, but I have decided to go ahead and let euc-jp (partially) support JIS X 0213. The reason is simple: Encode::JP is already too big to carry a separate table for each variety of euc-jp. In such cases, we should settle for the most 'comprehensive' version.

Even the term 'euc-jp' is too ambiguous for many. At first it didn't include G3, and some say the variants must be clearly marked as something like 'euc-jp-classic' (no 0212 support) vs 'euc-jp-modern' and so forth (then our current euc-jp should be marked as 'euc-jp-postmodern' :). It would be nice if we could go that way, as with 7bit-JIS/ISO-2022-JP/ISO-2022-JP-1, but for euc-jp we have to have a whole ucm for each.

This is definitely a todo for Perl 5.8.1 and up, and I have already come up with a solution: the future Encode (Encode II) will support a "CES-generator"; that is, you will be able to express euc-jp not as one big table but as a combination of tables. That will also reduce the duplication found in vendor mappings. It will be a complete rewrite of encengine.c.
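To make the "combination of tables" idea concrete, here is a minimal sketch. It is written in Python purely for illustration and is not the Encode II design or anything in encengine.c. The names `Table`, `Rule`, `decode`, `EUC_JP_CLASSIC` and `EUC_JP_MODERN` are hypothetical, and each table is cut down to one or two entries. The only facts assumed are the usual EUC-JP layout: ASCII in G0, JIS X 0208 in G1, JIS X 0201 kana in G2 behind SS2 (0x8E), and JIS X 0212 in G3 behind SS3 (0x8F).

```python
from typing import Dict, List, Tuple

# A charset table maps 7-bit code bytes to a character.
# Truncated to a few entries here; real tables hold thousands.
Table = Dict[bytes, str]
JISX0201_KANA: Table = {b"\x31": "\uff71"}                        # halfwidth katakana A
JISX0208: Table = {b"\x24\x22": "\u3042", b"\x46\x7c": "\u65e5"}  # hiragana A, kanji "day"
JISX0212: Table = {b"\x30\x21": "\u4e02"}                         # first kanji of JIS X 0212

# A scheme is an ordered list of rules: (shift prefix, bytes per char, table).
Rule = Tuple[bytes, int, Table]
EUC_JP_CLASSIC: List[Rule] = [
    (b"\x8e", 1, JISX0201_KANA),  # G2, selected by SS2
    (b"", 2, JISX0208),           # G1, no prefix (catch-all, must come last)
]
# The "modern" variant adds G3 and reuses the very same table objects.
EUC_JP_MODERN: List[Rule] = [(b"\x8f", 2, JISX0212)] + EUC_JP_CLASSIC


def decode(data: bytes, scheme: List[Rule]) -> str:
    """Decode EUC-style bytes using a scheme built from shared tables."""
    out = []
    i = 0
    while i < len(data):
        if data[i] < 0x80:  # G0: ASCII passes through unchanged
            out.append(chr(data[i]))
            i += 1
            continue
        # First rule whose prefix matches; the empty prefix always matches.
        prefix, width, table = next(r for r in scheme if data.startswith(r[0], i))
        start = i + len(prefix)
        raw = data[start:start + width]
        key = bytes(b & 0x7F for b in raw)  # strip the high bit to get the table key
        if len(raw) != width or min(raw) < 0xA1 or key not in table:
            raise ValueError(f"unmapped sequence at offset {i}")
        out.append(table[key])
        i = start + width
    return "".join(out)


sample = b"A\xa4\xa2\x8e\xb1\x8f\xb0\xa1"
print(decode(sample, EUC_JP_MODERN))
```

With the classic scheme the same input raises `ValueError` at the 0x8F byte, which is the kind of warning/croaking a separate 'euc-jp-classic' would give, and the two variants cost one extra list rather than one extra full table.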

But that requires not only code but also an extension of the UCM format, so give me more time (and Perl 5.8.0!)

Dan the Encode Maintainer
