Re: ICU's uconv vs Linux iconv and UTF-8

Marco,

  Thank you for elaborating on my points.

On 2002.02.02, at 01:40, Marco Cimarosti wrote:

<< The entire former contents of this directory are obsolete and have been
moved to the OBSOLETE directory.  The latest information may be found
in the Unihan.txt file in the latest Unicode Character Database.
August 1, 2001. >>

And don't bother to download the 23 Mb
<http://www.unicode.org/Public/UNIDATA/Unihan.txt> file, because it contains
only mappings for kanji.

Yes. That's point #0. Unihan.txt is no replacement for MAPPINGS. Maybe I can come up with a script which generates a table out of it, but this kind of attitude is far from nice.

  And Unihan.txt also lacks 8-bit mappings like JIS X 0201.
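
  For what it's worth, such a script could look roughly like the Python sketch below. It assumes Unihan.txt lines are tab-separated as "U+XXXX<TAB>field<TAB>value" and that the kJis0 field holds the JIS X 0208 ku/ten position as four decimal digits; the function names (kuten_to_jis, unihan_to_table, format_table) are made up for illustration. It only covers kanji, so it does not restore what MAPPINGS used to provide.

```python
def kuten_to_jis(kuten: str) -> int:
    """Convert a 4-digit ku/ten string (e.g. '1676') to a 2-byte JIS code."""
    row, cell = int(kuten[:2]), int(kuten[2:])
    # Each byte of the JIS code is the ku or ten number plus 0x20.
    return ((row + 0x20) << 8) | (cell + 0x20)


def unihan_to_table(lines, field="kJis0"):
    """Collect (JIS code, Unicode code point) pairs for one Unihan field."""
    table = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3 or parts[1] != field:
            continue  # not the field we are interested in
        codepoint = int(parts[0][2:], 16)  # strip the leading "U+"
        table.append((kuten_to_jis(parts[2]), codepoint))
    return sorted(table)


def format_table(table):
    """Render pairs as '0xJJJJ<TAB>0xUUUU' lines, one mapping per line."""
    return "\n".join(f"0x{jis:04X}\t0x{ucs:04X}" for jis, ucs in table)


if __name__ == "__main__":
    sample = [
        "# sample excerpt in the assumed Unihan.txt layout",
        "U+4E00\tkJis0\t1676",
        "U+4E00\tkTotalStrokes\t1",
        "U+4E9C\tkJis0\t1601",
    ]
    print(format_table(unihan_to_table(sample)))
```

  To run it against the real file, pass an open file object instead of the sample list, e.g. unihan_to_table(open("Unihan.txt", encoding="utf-8")).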

So, go directly to
<http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/>, where you can
find the old data, along with a note about mapping errors:


  But this time, they are right about being OBSOLETE.

Below is some analysis by Asmus Freytag of specific problems raised by T.
Kubota in this document:
        http://www.debian.or.jp/~kubota/unicode-symbols.html


  An English version is also available at

        http://www.debian.or.jp/~kubota/unicode-symbols.html.en

  And let me quote the part which is significant.

ASCII and JIS X 0201 Roman
When converting EUC-JP and Shift_JIS, handling of 0x5c and 0x7e can be a problem. Since both encodings have long history and Japanese people have lot of experience how to handle them, I now introduce it.
Solution is very simple. Just regard YEN SIGN and REVERSE SOLIDUS as a different glyphs of the same character. Then, distinction between ASCII and JIS X 0201 Roman can be neglected.
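
  To make the two readings of 0x5c and 0x7e concrete, here is a minimal Python sketch. It assumes the strict JIS X 0201 Roman reading maps 0x5c to U+00A5 YEN SIGN and 0x7e to U+203E OVERLINE, while the "same character, different glyph" reading quoted above simply treats both bytes as ASCII. The function name decode_roman and its unify_with_ascii flag are hypothetical, not part of iconv or ICU.

```python
# JIS X 0201 Roman differs from ASCII at exactly these two byte values.
JISX0201_ROMAN_DIFF = {
    0x5C: "\u00a5",  # YEN SIGN instead of REVERSE SOLIDUS
    0x7E: "\u203e",  # OVERLINE instead of TILDE
}


def decode_roman(data: bytes, unify_with_ascii: bool = True) -> str:
    """Decode 7-bit bytes as ASCII or as strict JIS X 0201 Roman.

    unify_with_ascii=True follows the quoted suggestion: 0x5c and 0x7e
    are decoded to the ASCII characters, and the yen sign is left to
    the font as a glyph variant. False applies the strict mapping.
    """
    out = []
    for byte in data:
        if byte > 0x7F:
            raise ValueError(f"not a 7-bit byte: 0x{byte:02X}")
        if not unify_with_ascii and byte in JISX0201_ROMAN_DIFF:
            out.append(JISX0201_ROMAN_DIFF[byte])
        else:
            out.append(chr(byte))
    return "".join(out)


if __name__ == "__main__":
    raw = b"C:\\tmp ~user"
    print(decode_roman(raw))                          # ASCII reading
    print(decode_roman(raw, unify_with_ascii=False))  # strict reading
```

  With the unified reading a path separator stays U+005C after conversion, which is why the distinction between ASCII and JIS X 0201 Roman can be ignored; with the strict reading the same bytes no longer round-trip as a path.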


  Has anyone at the Unicode Consortium seen this one?

Dan
