Jarkko:
> > It might be more useful if the default for the non-utf-8 characters
> > were the system-defined default character encoding of the process ...
> I can understand the request but the problem is that for this to work
> the legacy eight-bit mappings must first be implemented. ...
I understand. But it might be nice to define the API in terms of converting
from the system character encoding, even if the API only supports ISO Latin-1
in the first release.
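
As a rough illustration of that API shape (in Python rather than Perl, purely
as a sketch): the conversion defaults to the process's system encoding, but
only ISO Latin-1 is accepted for now. The function name to_utf8 and the
SUPPORTED set are hypothetical and not part of any proposed Perl interface.

```python
import codecs
import locale

# Hypothetical "first release" set: only ISO Latin-1 is implemented.
# codecs.lookup() normalizes spellings such as "latin-1" and "ISO-8859-1".
SUPPORTED = {codecs.lookup("iso-8859-1").name}


def to_utf8(data, encoding=None):
    """Convert legacy 8-bit bytes to UTF-8 bytes.

    If no encoding is given, fall back to the system default encoding
    of the process, as the API would be defined to do.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    try:
        name = codecs.lookup(encoding).name
    except LookupError:
        raise NotImplementedError(f"unknown encoding {encoding!r}")
    if name not in SUPPORTED:
        raise NotImplementedError(f"encoding {encoding!r} is not supported yet")
    return data.decode(name).encode("utf-8")
```

Callers on a Latin-1 system would get the default behaviour for free, and
callers elsewhere would get a clear error until more mappings are added,
without the interface itself having to change later.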
> > Perhaps it would be difficult to use ICU as a utility ...
> ... The biggest problem is that
> the ICU will not be everywhere.
Since it would be invisible outside of Perl internals, could it be an optional
component? In other words, you don't need it unless you care about character
encodings outside of an easy-to-implement base set.
But in any case, for the near term it might be useful to use their character
set name alias file "convrtrs.txt" and their mapping table plain-text 'source'
files (*.ucm).
"convrtrs.txt" is a useful alias list and is in an easy-to-process plain-text
format.
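
To show how little processing such a list needs, here is a minimal Python
sketch. It assumes a simplified layout: one entry per line, the canonical
name first, aliases after it separated by whitespace, and '#' starting a
comment. The real file may carry extra syntax that this does not handle, and
the sample entries below are made up for illustration.

```python
def parse_aliases(text):
    """Map each name (lowercased) to the canonical name that starts its line."""
    aliases = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        names = line.split()
        if not names:
            continue  # blank or comment-only line
        canonical = names[0]
        for name in names:
            # The first definition wins if an alias appears twice.
            aliases.setdefault(name.lower(), canonical)
    return aliases


# Made-up sample in the assumed layout, not copied from the real file.
SAMPLE_ALIASES = """\
# canonical name followed by its aliases
ISO_8859-1 ISO-8859-1 latin1 l1 cp819
UTF8 UTF-8
"""

alias_table = parse_aliases(SAMPLE_ALIASES)
print(alias_table["latin1"])  # ISO_8859-1
```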
The mapping table files are useful because the selection is much larger than
the selection on unicode.org and they are all in the same format, unlike the
ones on unicode.org. Also, I believe them to be of high quality - at least as
good as the ones on unicode.org.
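
Having everything in one format means a single small reader covers every
table. The Python sketch below assumes the usual UCM layout: header lines of
the form <key> value, then a CHARMAP ... END CHARMAP section whose lines look
like <UXXXX> \xNN |P. It keeps only entries flagged |0 (or unflagged), on the
assumption that those are the roundtrip mappings. The sample table is
invented for illustration and is not a real ICU file.

```python
import re

# One mapping line: <UXXXX>, one or more \xNN bytes, optional |P flag.
MAPPING = re.compile(
    r"<U([0-9A-Fa-f]{4,6})>\s+((?:\\x[0-9A-Fa-f]{2})+)(?:\s+\|(\d))?"
)


def parse_ucm(text):
    """Return (header, table); table maps byte sequences to characters."""
    header, table = {}, {}
    in_charmap = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if line == "CHARMAP":
            in_charmap = True
        elif line == "END CHARMAP":
            break
        elif in_charmap:
            m = MAPPING.match(line)
            if m is None or (m.group(3) or "0") != "0":
                continue  # skip malformed lines and non-roundtrip entries
            hex_bytes = re.findall(r"\\x([0-9A-Fa-f]{2})", m.group(2))
            table[bytes(int(h, 16) for h in hex_bytes)] = chr(int(m.group(1), 16))
        else:
            m = re.match(r"<(\w+)>\s+(.*)", line)
            if m:
                header[m.group(1)] = m.group(2).strip('"')
    return header, table


# Invented sample table, for illustration only.
SAMPLE_UCM = r"""
<code_set_name> "sample-latin1"
<mb_cur_max> 1
<subchar> \x1A
CHARMAP
<U0041> \x41 |0
<U00E9> \xE9 |0
<U20AC> \x80 |1
END CHARMAP
"""

header, table = parse_ucm(SAMPLE_UCM)
print(header["code_set_name"], len(table))  # sample-latin1 2
```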
=Ed