At 10:31 am -0700 23/6/06, Jianyang Tai wrote:
I have run into a problem with the Encode module when converting
some Japanese content from shift-jis to utf-8. Basically I am using
the from_to subroutine to do the job. Everything works well except for
the circled-number characters (8740 ~ 8754). The Unicode range
for those characters is 2460 ~ 2473. However, from_to doesn't
convert them correctly. For 8740 (1 inside a little circle), what I
got was "FFFD 0040".
Does anyone have any idea what the problem is? Is this a known issue,
or is there something wrong with the original shift-jis text? Any
advice is much appreciated.
Those characters do not exist in standard shift-jis (JIS X 0208). They
do exist in GB18030 and in the Mac OS Japanese, Korean and Chinese
(both) character sets.
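
If the text was produced on Windows, the bytes are probably in CP932,
Microsoft's extension of shift-jis, which does assign 8740 to the circled
digit one. That is an assumption about the source data, not something the
original message confirms. A minimal sketch under that assumption,
comparing Encode's strict "shiftjis" with its "cp932" (the codepoints()
helper is only for illustration):

```perl
use strict;
use warnings;
use Encode qw(decode from_to);

# Hypothetical helper: render a decoded string as a list of code points.
sub codepoints {
    my ($str) = @_;
    return join ' ', map { sprintf 'U+%04X', ord } split //, $str;
}

# The two bytes the original poster had trouble with.
my $bytes = "\x87\x40";

# Strict shift-jis: lead byte 0x87 has no mapping, so Encode substitutes
# a replacement character and then reads 0x40 as a plain '@'.
my $strict = decode('shiftjis', $bytes);
print "shiftjis: ", codepoints($strict), "\n";

# CP932 (assumed to be the real source encoding) maps 0x8740 to U+2460.
my $cp932 = decode('cp932', $bytes);
print "cp932:    ", codepoints($cp932), "\n";

# The same conversion in place with from_to, as in the original question.
my $data = "\x87\x40";
from_to($data, 'cp932', 'utf-8');
print "utf-8:    ", join(' ', map { sprintf '%02X', ord } split //, $data), "\n";
```

If the cp932 line shows U+2460, then naming the source encoding "cp932"
instead of "shiftjis" in the from_to call should be enough.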
JD