Re: Acceptance of Unicode (UTF8) in Far East

On Thursday, May 16, 2002, at 03:04 AM, Mark Lewellen wrote:

Hi all-
  I have a question directed mostly at those involved in the Far East.
Since Unicode is often implemented in UTF8, and UTF8 uses 3 bytes
for Chinese characters (instead of the 2 bytes in Chinese and Japanese
GB, Big5, JIS), UTF8 documents solely in these languages will be 50%
larger.  This appears to be a large stumbling block to universal
acceptance of UTF8.  Is there much resistance to UTF8 in the
Far East, are there work-arounds to the problem, and are many
people even aware of the problem?
Mark

Size of data is not a big deal these days with data compression and faster networks. So far as I can see, very few people dislike UTF-8 because of the size bloat. Most objections to Unicode are more about politics and culture.
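To put rough numbers on the size question, here is a minimal Python sketch that encodes the same text in a legacy double-byte encoding and in UTF-8, then compresses both with zlib. The sample sentence, the repeat count, and the helper name encoded_sizes are assumptions made for illustration. Repeated text also flatters the compressor, so read the compressed figures as a direction rather than a benchmark.

```python
import zlib


def encoded_sizes(text, legacy_codec):
    """Return raw and zlib-compressed byte counts for a legacy codec and UTF-8."""
    legacy = text.encode(legacy_codec)
    utf8 = text.encode("utf-8")
    return {
        "legacy": len(legacy),
        "utf8": len(utf8),
        "legacy_zlib": len(zlib.compress(legacy)),
        "utf8_zlib": len(zlib.compress(utf8)),
    }


# Assumed sample: one Japanese sentence (full-width characters only),
# repeated so that the compressor has something to work with.
sample = "日本語の文章はユニコードで保存できます。" * 200

sizes = encoded_sizes(sample, "shift_jis")
for name, count in sizes.items():
    print(f"{name:12s} {count:6d} bytes")
```

For text made up only of such characters, the raw UTF-8 figure is 1.5 times the Shift_JIS figure, which is the 50% mentioned in the question. How much of that gap survives compression depends on the data.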

Whether you like it or not, Unicodization is advancing steadily because it is already blessed by Windows and MacOS (X). And you have virtually no choice but to use Unicode when you program in Java. But the Unicodization of applications has only begun. UTF-8 mail and web pages are still rare, mainly because of a lack of tools (well, as a matter of fact many of these tools do support Unicode but simply don't make UTF-8 the default when they send or save data).

And even if the tools are there, it may still take a long time before data gets converted to UTF-8. Unless you need to save more than 3 languages, legacy encodings do suffice, and many may still choose to save new data in legacy encodings for legacy applications.

To me it is okay for you to save your data in whichever encoding you choose, so long as I can read it. That's why I became the maintainer of the Encode module, a standard part of Perl 5.8 that enables you to do so.
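As a rough illustration of reading data in one encoding and saving it in another, here is a minimal sketch in Python rather than Perl, so it does not show the Encode module's own interface. The helper name transcode and the sample strings are assumptions. The last part shows the case where a legacy encoding stops being enough: text that mixes scripts the encoding does not cover.

```python
def transcode(data, src, dst="utf-8"):
    """Decode bytes from the src encoding and re-encode them as dst."""
    return data.decode(src).encode(dst)


# Assumed sample: Japanese text stored in the legacy EUC-JP encoding.
legacy_bytes = "日本語".encode("euc_jp")

# Legacy to UTF-8 and back again; the text survives both trips.
utf8_bytes = transcode(legacy_bytes, "euc_jp")
round_trip = transcode(utf8_bytes, "utf-8", "euc_jp")
print(len(legacy_bytes), len(utf8_bytes), round_trip == legacy_bytes)

# Mixing Japanese and Korean exceeds what EUC-JP can represent,
# so the encode step raises instead of silently losing characters.
try:
    "日本語 한국어".encode("euc_jp")
except UnicodeEncodeError as err:
    print("cannot store in EUC-JP:", err.reason)
```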


Dan the Encode Maintainer