Re: Troubles with UTF-8

--On 23 December 2005 11:36 +0100 "Tom.Petch" <sisyphus(_at_)dial(_dot_)pipex(_dot_)com> wrote:

A) Character set.  UTF-8 implicitly specifies the use of Unicode/IS10646
which contains 97,000 - and rising - characters.  Some (proposed)
standards limit themselves to 0000..007F, which is not at all
international, others to 0000-00FF, essentially Latin-1, which suits many
Western languages but is not truly international.  Is 97,000 really
appropriate or should there be a defined subset?

I think Ned has answered most of your other points... I'll chime in on this one...

My opinion: ALL attempts at defining a "useful" character set of any size between 128 and "all you can eat" for use internationally have been dismal failures. They have been used in some niche; sooner or later there's a need to work outside that box, and gateways or other forms of self-torture result. (Alvestrand's equality: gateways = pain).

At the moment, the only reasonable candidate for an "all you can eat" character set is the Unicode charset. All other alternatives, including the bizarrely byzantine character set switching schemes of ISO 2022, are basically dead in the marketplace.


So there are only two real choices for charset left: ASCII and Unicode.

ASCII is unsuitable for any language except the technologists' simplified version of English. So if you want text, and want it to work internationally, there's only one choice left.


Subsets are a mistake.
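
To make the subset problem concrete, here is a minimal sketch in Python. The helper name `smallest_repertoire` and the sample strings are illustrative assumptions, not part of any standard; the two ranges are the ones quoted above (0000..007F and 0000..00FF). It only reports which of those repertoires is the smallest that can hold a given string, and shows that UTF-8 handles all of them without a gateway.

```python
# Illustrative sketch only: helper name and sample strings are assumptions.

def smallest_repertoire(text: str) -> str:
    """Name the smallest of the three repertoires discussed that holds text."""
    top = max(map(ord, text), default=0)  # highest code point in the string
    if top <= 0x7F:
        return "ASCII (0000..007F)"
    if top <= 0xFF:
        return "Latin-1 (0000..00FF)"
    return "Unicode"


samples = ["plain English", "blåbærsyltetøy", "Ελληνικά", "日本語"]

for s in samples:
    utf8 = s.encode("utf-8")           # always succeeds for these strings
    assert utf8.decode("utf-8") == s   # and round-trips without loss
    print(f"{s!r}: {smallest_repertoire(s)}, {len(utf8)} UTF-8 bytes")

# Text that falls outside a subset cannot be carried in it at all:
try:
    "日本語".encode("latin-1")
except UnicodeEncodeError as exc:
    print("outside the Latin-1 subset:", exc.reason)
```

Any system that fixes on one of the two smaller ranges has to reject or transform the last two samples, which is exactly where the gateways come from.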

                           Harald


