On Sun, Jan 07, 2001 at 10:46:02AM +0000, nick(_at_)ing-simmons(_dot_)net wrote:
Keld,
As you may be aware, we are adding support for UTF-8 encoded Unicode
to perl5. This is finally coming together. So now we need a mechanism
to translate other encodings into and out of Unicode.
I was not aware of that. Could you give me a pointer to the spec?
Do you mean Unicode or do you mean ISO 10646?
Initially I just grabbed what Sun/Scriptics/Ajuba/... had used for Tcl
(because it was to hand). I have also looked at GNU iconv, IBM ICU
and XFree86 4.*.
None so far has been ideal for embedding in perl itself. Either
the origin is not documented, or they come with extra things we do not
need, or they are monolithic.
I have a prototype of our own "engine" which can translate one
single/multi-byte encoding to another, but it needs good tables
to drive it.
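
To make the idea of a table-driven translation "engine" concrete, here is a minimal sketch in Python. It is not the perl prototype mentioned above; the function name `transcode`, the two tiny tables, and the longest-match lookup are all assumptions chosen for illustration. It translates bytes in one encoding to another by going through Unicode code points.

```python
def transcode(data, src_table, dst_table, max_len=4):
    """Translate bytes between two encodings via Unicode code points.

    src_table maps source byte sequences to code points;
    dst_table maps code points to destination byte sequences.
    Both tables are hypothetical stand-ins for real charmap data.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        # Try the longest candidate sequence first, then shorter ones.
        for n in range(min(max_len, len(data) - i), 0, -1):
            code_point = src_table.get(bytes(data[i:i + n]))
            if code_point is not None:
                break
        else:
            raise ValueError(f"no mapping for input at offset {i}")
        if code_point not in dst_table:
            raise ValueError(f"U+{code_point:04X} not in target encoding")
        out += dst_table[code_point]
        i += n
    return bytes(out)


# Illustrative tables: a two-entry multi-byte source and a UTF-8 target.
SRC = {b"\x41": 0x0041, b"\x82\xa0": 0x3042}
DST = {cp: chr(cp).encode("utf-8") for cp in SRC.values()}

result = transcode(b"A\x82\xa0", SRC, DST)
```

With real tables, `SRC` and `DST` would be loaded from charmap files rather than written by hand; the loop itself does not change.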
So I have been looking for "authoritative" tables, and starting
a web search from your name in RFC 1345, I came across:
ftp://dkuug.dk/cultreg
in particular
and then
ftp://dkuug.dk/i18n
The tables there seem to be suitable for my/our purposes.
So I have a few questions:
0. Is use/redistribution of these tables in OpenSource projects
permitted?
Yes, it is.
1. Is the format formally defined anywhere?
It seems straightforward enough.
The format is defined in the POSIX-2 standard, ISO/IEC 9945-2:1993
(a.k.a. IEEE 1003.2).
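
As a rough illustration of how such a charmap file can be read, here is a small Python sketch. The sample text is made up for this example and is not copied from the dkuug.dk tables; the function name `parse_charmap` is hypothetical. It assumes each mapping line carries a `<Uxxxx>` token and an escaped byte sequence such as `/x41`, and it ignores range lines and other features of the full format.

```python
import re

# Illustrative sample only; not taken from the cultreg or i18n directories.
SAMPLE_CHARMAP = """\
<code_set_name> EXAMPLE
<comment_char> %
<escape_char> /
% comment lines start with the declared comment character
CHARMAP
<U0041>     /x41         LATIN CAPITAL LETTER A
<U00E9>     /xe9         LATIN SMALL LETTER E WITH ACUTE
<U3042>     /x82/xa0     HIRAGANA LETTER A
END CHARMAP
"""

UCS_RE = re.compile(r"<U([0-9A-Fa-f]{4,8})>")
BYTE_RE = re.compile(r"x([0-9A-Fa-f]{2})")


def parse_charmap(text):
    """Return a dict mapping encoded byte sequences to code points."""
    comment_char, escape_char = "#", "\\"  # defaults when not declared
    in_body = False
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(comment_char):
            continue
        if not in_body:
            # Header section: pick up declarations, wait for CHARMAP.
            if line.startswith("<comment_char>"):
                comment_char = line.split()[1]
            elif line.startswith("<escape_char>"):
                escape_char = line.split()[1]
            elif line == "CHARMAP":
                in_body = True
            continue
        if line == "END CHARMAP":
            break
        fields = line.split()
        ucs = next((f for f in fields if UCS_RE.fullmatch(f)), None)
        enc = next((f for f in fields if f.startswith(escape_char + "x")), None)
        if ucs is None or enc is None:
            continue  # skip lines this sketch does not understand
        code_point = int(UCS_RE.fullmatch(ucs).group(1), 16)
        raw = bytes(int(h, 16) for h in BYTE_RE.findall(enc))
        table[raw] = code_point
    return table


charmap = parse_charmap(SAMPLE_CHARMAP)
```

Inverting the resulting dictionary gives the code point to bytes direction needed for encoding.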
2. Are the data actively maintained?
Yes, by me, and from submissions I get. I am a little slow at times, though.
3. Are the charmaps in cultreg and i18n "identical"?
No, the i18n ones are more up to date. But the cultreg ones are official ISO.
They are closely synchronized, however.
I also welcome suggestions as to other resources that may be
available - particularly for Asian encodings and IPA.
I do not have a good suggestion for Asian encodings and IPA.
Kind regards
keld