perl-unicode

Re: Source data for perl encodings

2001-01-08 09:32:15

Mark Leisher <mleisher(_at_)crl(_dot_)nmsu(_dot_)edu> writes:

    Ed> I haven't seen your engine, but I've created such engines and worked
    Ed> on engines others have created. They aren't easy to do. Just
    Ed> supporting a basic set of Internet encodings will be a difficult
    Ed> undertaking. Maybe you've got it all worked out, in which case hats
    Ed> off to you. Otherwise I would recommend a strategy whereby you
    Ed> implement a core set of single-byte conversions (for, say, Western
    Ed> Europe, Central Europe, and Cyrillic languages) with an internal
    Ed> engine and plan to incorporate an optional ICU hook-up for anything
    Ed> else. That way you don't have to maintain and distribute large Asian
    Ed> encoding tables.

What would be nice is some variation of Bruno Haible's libiconv that allows
dynamic loading of mapping tables.  Then Perl would have reasonable conversion
capability at about 1/16 the size of ICU.

Well, I'd go beyond this and say that it would be nice if Perl used
the system iconv when available. iconv isn't the greatest interface,
but it is generally pretty workable, and if people use the system
capabilities, then you avoid an explosion of duplicated mapping tables.

The question here, of course, is whether:

 - Perl behaving the same on every system

or 

 - Perl behaving the same as other programs on a given system

is more important. It might be necessary to use the system iconv only
on a known set of systems; I believe the GNU iconv data is kept in
sync with libiconv.
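
To make the trade-off concrete, here is a rough sketch of the selection
policy, written in Python purely for illustration. Everything named in it
is an assumption and not part of Perl or iconv: TRUSTED_PLATFORMS,
CORE_TABLES, choose_decoder and python_codec_lookup are hypothetical, and
Python's own codec registry merely stands in for a system converter such
as iconv.

```python
import codecs
from typing import Callable, Dict, FrozenSet, Optional

Decoder = Callable[[bytes], str]

# Hypothetical allow-list of platforms whose system converter is trusted.
TRUSTED_PLATFORMS: FrozenSet[str] = frozenset({"linux"})

# Hypothetical built-in core tables. ISO-8859-1 is the trivial case:
# byte value N maps to the code point with the same number.
CORE_TABLES: Dict[str, Dict[int, str]] = {
    "iso-8859-1": {b: chr(b) for b in range(256)},
}


def python_codec_lookup(name: str) -> Optional[Decoder]:
    """Stand-in for a system converter lookup (iconv, ICU, ...)."""
    try:
        info = codecs.lookup(name)
    except LookupError:
        return None
    # CodecInfo.decode returns (text, bytes_consumed); keep only the text.
    return lambda data: info.decode(data)[0]


def choose_decoder(
    encoding: str,
    platform: str,
    system_lookup: Callable[[str], Optional[Decoder]],
    trusted: FrozenSet[str] = TRUSTED_PLATFORMS,
) -> Decoder:
    """Pick a decoder: system converter on trusted platforms, built-in
    tables otherwise, optional backend for everything else."""
    name = encoding.lower()

    # 1. On a known system, behave like other programs on that system.
    if platform in trusted:
        decoder = system_lookup(name)
        if decoder is not None:
            return decoder

    # 2. Otherwise use the built-in table so results match everywhere.
    table = CORE_TABLES.get(name)
    if table is not None:
        return lambda data: "".join(table[b] for b in data)

    # 3. No built-in table: hand off to the optional backend if it has one.
    decoder = system_lookup(name)
    if decoder is None:
        raise LookupError("no converter available for " + encoding)
    return decoder
```

With this, choose_decoder("ISO-8859-1", "some-other-os", python_codec_lookup)
uses the built-in table, while the same call with platform "linux" goes
to the stand-in system converter. Which branch should come first is
exactly the question above.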

Regards,
                                        Owen