perl-unicode

Source data for perl encodings

2001-01-07 03:48:40
Keld,

As you may be aware, we are adding support for UTF-8 encoded Unicode
to perl5. This is finally coming together, so now we need a mechanism
to translate other encodings into and out of Unicode.

Initially I just grabbed what Sun/Scriptics/Ajuba/... had used for Tcl
(because it was to hand). I have also looked at GNU iconv, IBM ICU
and XFree86 4.*.
None so far has been ideal for embedding in perl itself: either
the origin is not documented, or they come with extra things we do not
need, or they are monolithic.

I have a prototype of our own "engine" which can translate one
single/multi-byte encoding to another, but it needs good tables
to drive it.
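
To illustrate what "table-driven" means here, a minimal sketch in Python
follows. It is not the prototype itself: the function names, the sample
tables and the longest-match strategy are all assumptions made for
illustration, and the tables hold only a handful of entries. Each table
maps a byte sequence to a Unicode code point; translation goes through
Unicode by decoding with the source table and encoding with the inverse
of the destination table.

```python
# Hypothetical sketch of a table-driven encoding translator.
# Each table maps a byte sequence (1 or 2 bytes here) to a code point.

def make_table(extra):
    """Build a sample table: 7-bit ASCII plus a few extra entries."""
    table = {bytes([b]): b for b in range(0x80)}
    table.update(extra)
    return table

# Sample entries only; real tables would be loaded from charmap files.
LATIN1 = make_table({b"\xe9": 0x00E9})       # e-acute, single byte
CP437 = make_table({b"\x82": 0x00E9})        # e-acute at a different byte
SJIS = make_table({b"\x82\xa0": 0x3042})     # HIRAGANA LETTER A, two bytes

def invert(table):
    """Turn a bytes -> code point table into code point -> bytes."""
    return {cp: seq for seq, cp in table.items()}

def translate(data, src, dst, max_len=2):
    """Translate bytes from encoding src to encoding dst via Unicode."""
    encode = invert(dst)
    out = bytearray()
    i = 0
    while i < len(data):
        # Try the longest candidate sequence first so that a lead byte
        # is not mistaken for a complete single-byte character.
        for n in range(min(max_len, len(data) - i), 0, -1):
            chunk = data[i:i + n]
            if chunk in src:
                cp = src[chunk]
                if cp not in encode:
                    raise ValueError("U+%04X has no mapping in target" % cp)
                out += encode[cp]
                i += n
                break
        else:
            raise ValueError("unmapped byte 0x%02X at offset %d" % (data[i], i))
    return bytes(out)
```

With these sample tables, translate(b"caf\xe9", LATIN1, CP437) gives
b"caf\x82". The quality of the result depends entirely on the tables,
which is why their origin and maintenance matter.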

So I have been looking for "authoritative" tables and, starting
a web search from your name in rfc1345, came across:

ftp://dkuug.dk/cultreg
in particular
and then
ftp://dkuug.dk/i18n

The tables there seem to be suitable for my/our purposes.
So I have a few questions:

0. Is use/redistribution of these tables in OpenSource projects
   permitted?
1. Is the format formally defined anywhere?
   It seems straightforward enough.
2. Are the data actively maintained?
3. Are the charmaps in cultreg and i18n "identical"?

I also welcome suggestions as to other resources that may be
available, particularly for Asian encodings and IPA.

-- 
Nick Ing-Simmons
