On Monday, April 29, 2002, at 07:38, SADAHIRO Tomoyuki wrote:
> I doubt whether users of 'euc-jp' will
> assume it to be a combination with JIS X 0213.
They don't have to, because 'euc-jp' behaves exactly the same as before
so long as the characters are in ASCII/JISX(0201|0208|0212).
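For reference, here is a minimal Python sketch (not Encode code) of how an
EUC-JP byte stream splits into its four code sets. The function name
euc_jp_codesets and the labels are made up for illustration. Note that the
byte structure alone cannot tell a JIS X 0208 code point from a JIS X 0213
addition, since both arrive as plain two-byte G1 sequences.

```python
def euc_jp_codesets(data: bytes) -> list[str]:
    """Label each character of an EUC-JP byte string with its code set.

    Hypothetical helper for illustration; it checks byte structure only
    and does not look anything up in a mapping table.
    """
    def trail(pos: int, lo: int, hi: int) -> None:
        # Every trailing byte must exist and fall in the expected range.
        if pos >= len(data) or not lo <= data[pos] <= hi:
            raise ValueError(f"bad or missing byte at offset {pos}")

    labels = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                      # G0: one byte, ASCII
            labels.append("ASCII")
            i += 1
        elif b == 0x8E:                   # SS2 + one byte: G2, JIS X 0201 kana
            trail(i + 1, 0xA1, 0xDF)
            labels.append("JIS X 0201 kana")
            i += 2
        elif b == 0x8F:                   # SS3 + two bytes: G3, JIS X 0212
            trail(i + 1, 0xA1, 0xFE)
            trail(i + 2, 0xA1, 0xFE)
            labels.append("JIS X 0212")
            i += 3
        elif 0xA1 <= b <= 0xFE:           # two bytes: G1, JIS X 0208
            trail(i + 1, 0xA1, 0xFE)
            labels.append("JIS X 0208")
            i += 2
        else:
            raise ValueError(f"unexpected byte 0x{b:02X} at offset {i}")
    return labels
```

Feeding it b"A\xa4\xa2\x8e\xb1\x8f\xb0\xa1" yields one label per code set,
in the order ASCII, JIS X 0208, JIS X 0201 kana, JIS X 0212.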
> Such a mixing would prevent warning/croaking
> for appearance of code points that are not defined
> originally (meaning w/o X 0213), wouldn't it?
That was my biggest concern, but I have decided to go ahead and let euc-jp
(partially) support JIS X 0213. The reason is simple: Encode::JP is
already too big to carry a separate table for each euc-jp variant. In such
cases, we should settle for the most 'comprehensive' version.
Even the term 'euc-jp' is too ambiguous for many. At first it didn't
include G3, and some say the variants must be clearly marked as something
like 'euc-jp-classic' (no 0212 support) vs. 'euc-jp-modern' and so forth
(then our current euc-jp should be marked as 'euc-jp-postmodern' :). It
would be nice if we could go that way, as with
7bit-JIS/ISO-2022-JP/ISO-2022-JP-1, but for euc-jp we would have to have a
whole ucm for each.
This is definitely a todo for Perl 5.8.1 and up, and I have already come
up with a solution: the future Encode (Encode II) will support a
"CES-generator"; that is, you can express euc-jp not as one big table
but as a combination of tables. That will also reduce the duplicates
found in vendor mappings. It will be a complete rewrite of encengine.c.
But that requires not only code but also an extension of the UCM format,
so give me more time (and Perl 5.8.0!).
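
To illustrate the "combination of tables" idea, here is a rough Python
sketch. It is only a guess at the shape of such a thing, not the planned
Encode II design: make_euc_decoder, the three tiny tables, and the variant
names are all hypothetical, and each table holds a single sample entry
instead of a full mapping.

```python
# Per-charset tables keyed by 7-bit JIS codes; one sample entry each.
JISX0208 = {(0x24, 0x22): "\u3042"}   # HIRAGANA LETTER A
JISX0201_KANA = {0x31: "\uff71"}      # HALFWIDTH KATAKANA LETTER A
JISX0212 = {(0x30, 0x21): "\u4e02"}   # first kanji of JIS X 0212


def make_euc_decoder(g1, g2=None, g3=None):
    """Build an EUC decoder from separate charset tables (hypothetical).

    g1 is the two-byte main set, g2 the SS2 set, g3 the SS3 set. Leaving
    g2 or g3 as None yields a variant that rejects those sequences.
    """
    def decode(data: bytes) -> str:
        out = []
        i = 0
        try:
            while i < len(data):
                b = data[i]
                if b < 0x80:                           # G0: ASCII
                    out.append(chr(b))
                    i += 1
                elif b == 0x8E and g2 is not None:     # SS2 + 1 byte
                    out.append(g2[data[i + 1] & 0x7F])
                    i += 2
                elif b == 0x8F and g3 is not None:     # SS3 + 2 bytes
                    out.append(g3[(data[i + 1] & 0x7F, data[i + 2] & 0x7F)])
                    i += 3
                elif 0xA1 <= b <= 0xFE:                # G1: 2 bytes
                    out.append(g1[(b & 0x7F, data[i + 1] & 0x7F)])
                    i += 2
                else:
                    raise ValueError(f"unexpected byte 0x{b:02X} at offset {i}")
        except (KeyError, IndexError):
            # Unmapped code point or truncated sequence.
            raise ValueError(f"cannot decode sequence at offset {i}") from None
        return "".join(out)
    return decode


# Two variants share the same tables instead of duplicating them.
euc_jp_classic = make_euc_decoder(JISX0208, JISX0201_KANA)
euc_jp_modern = make_euc_decoder(JISX0208, JISX0201_KANA, JISX0212)
```

Here euc_jp_modern decodes b"\x8f\xb0\xa1" while euc_jp_classic raises
ValueError on it, yet both reuse the same JIS X 0208 and JIS X 0201 tables.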
Dan the Encode Maintainer