perl-unicode

Re: 5.8 roadmap and Encode

2002-03-01 08:52:13
Markus Kuhn <Markus(_dot_)Kuhn(_at_)cl(_dot_)cam(_dot_)ac(_dot_)uk> writes:
Jarkko Hietaniemi wrote on 2002-03-01 03:04 UTC:
On Thu, Feb 28, 2002 at 08:51:45PM +0200, Jarkko Hietaniemi wrote:
I think we should aim at the very least to keep up with
a Certain Language:

http://java.sun.com/j2se/1.4/docs/guide/intl/encoding.doc.html

Based on a quick check, we are missing the following compared
with the J2SE 1.4 list:

Not all, but many of the charsets on this list are bogus or at least
hardly ever needed. They exist only on paper in some obscure IBM
document. Please don't start an I-have-more-encodings-than-you-do war.
There are fewer than ~30 encodings commonly used today.

I would appreciate a list of them.


If an encoding is neither mentioned in

 http://www.unicode.org/Public/MAPPINGS/

or in the MIME registry

That one has been _my_ particular "must have" list.

or in case it has Han characters in

 http://www.unicode.org/Public/UNIDATA/Unihan.txt

then chances are good that nobody actually needs it in a standard
library. Absence from these registries means that there isn't a
real-world user community that cares about that encoding.

Overly long lists of encodings just confuse users, unless each encoding
comes with an abstract that clarifies where this conversion table came
from and how it differs from similar ones.

Lack of a lengthy list of bogus encodings should not be a reason for
delaying the release of 5.8, imho ...

Amazingly, Unicode has triggered a strange interest in lots of long
forgotten and never used encodings, because only now, since conversion
between everything becomes technically possible, people all of a sudden
develop strong hunter-and-gatherer instincts to find ever more and more
Unicode conversion tables. Just the opposite of what Unicode was about,
isn't it?

Markus
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/


