I'll answer this one.
On 2002.02.02, at 03:28, Yves Arrouye wrote:
That is understandable if they use different tables. The question is
which one is the "right" EUC-JP, and which one do users want? ICU, as
well as iconv, could have two tables with the different mappings. The
question then is how to label them, and whether the labeling should be
compatible between the two.
I don't know which one is 'right', but the most practical and widely
used form of euc-jp is as follows:
\x00     - \x7f      Maps to US-ASCII
\xa1a1   - \xfefe    Maps to JISX-0208 (aka Zenkaku)
\x8ea1   - \x8edf    Maps to JISX-0201 (aka Hankaku)
In addition, the extended form of euc-jp also includes:
\x8fa1a1 - \x8ffefe  Maps to JISX-0212
That's what iconv, Tcl's *.enc, and my humble Jcode take euc-jp to
be.
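To make the byte ranges above concrete, here is a rough sketch in Python (not taken from Jcode, iconv, or Tcl) that splits a byte string into segments by lead byte. The function name classify_euc_jp and the labels it returns are my own for illustration. It only checks the ranges listed above and does not look anything up in a mapping table.

```python
from typing import List, Tuple


def _is_high(b: int) -> bool:
    """True for a byte in 0xA1-0xFE, the range used by JISX-0208/0212 bytes."""
    return 0xA1 <= b <= 0xFE


def classify_euc_jp(data: bytes) -> List[Tuple[str, bytes]]:
    """Split EUC-JP bytes into (charset label, raw bytes) pairs.

    Hypothetical helper: it follows the ranges in the list above and
    raises ValueError on anything outside them.
    """
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 0x7F:
            label, size = "US-ASCII", 1    # \x00 - \x7f
        elif lead == 0x8E:
            label, size = "JISX-0201", 2   # \x8ea1 - \x8edf
        elif lead == 0x8F:
            label, size = "JISX-0212", 3   # \x8fa1a1 - \x8ffefe
        elif _is_high(lead):
            label, size = "JISX-0208", 2   # \xa1a1 - \xfefe
        else:
            raise ValueError(f"unexpected lead byte 0x{lead:02x} at offset {i}")

        chunk = data[i:i + size]
        if len(chunk) < size:
            raise ValueError(f"truncated sequence at offset {i}")

        # Check the bytes that follow the lead byte against the listed ranges.
        trail = chunk[1:]
        if lead == 0x8E:
            ok = all(0xA1 <= b <= 0xDF for b in trail)
        else:
            ok = all(_is_high(b) for b in trail)
        if not ok:
            raise ValueError(f"bad trailing byte in sequence at offset {i}")

        out.append((label, chunk))
        i += size
    return out
```

Feeding it a buffer with one sequence of each kind, for example b"A\xa4\xa2\x8e\xb1\x8f\xb0\xa1", should give four segments labelled US-ASCII, JISX-0208, JISX-0201 and JISX-0212 in that order.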
I find the same statement confusing. Are you saying that uconv's UTF-8
is ill-formed? Nick, would you mind emailing me (and just me, not the
list) your table.euc sample file?
Go get Jcode.pm via http://search.cpan.org/search?dist=Jcode and look
under the t/ directory. You will find table.euc and x0212.euc there.
Dan