Mark Leisher <mleisher(_at_)crl(_dot_)nmsu(_dot_)edu> writes:
Ed> I haven't seen your engine, but I've created such engines and worked
Ed> on engines others have created. They aren't easy to do. Just
Ed> supporting a basic set of Internet encodings will be a difficult
Ed> undertaking. Maybe you've got it all worked out, in which case hats
Ed> off to you. Otherwise I would recommend a strategy whereby you
Ed> implement a core set of single-byte conversions (for, say, Western
Ed> Europe, Central Europe, and Cyrillic languages) with an internal
Ed> engine and plan to incorporate an optional ICU hook-up for anything
Ed> else. That way you don't have to maintain and distribute large Asian
Ed> encoding tables.
What would be nice is some variation of Bruno Haible's libiconv that allows
dynamic loading of mapping tables. Then Perl would have reasonable conversion
capability at about 1/16 the size of ICU.
Well, I'd go beyond this and say that it would be nice if Perl would use
the system iconv when available - iconv isn't the greatest interface,
but it is generally pretty workable, and if people use the system
capabilities, then you avoid an explosion of tables.
The question here, of course, is which of these is more important:
- Perl behaving the same on every system
or
- Perl behaving the same as other programs on a given system
It might be necessary to only use the system iconv
on a known set of systems; I believe the GNU iconv data is kept in
sync with libiconv.
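
To make the trade-off concrete, here is a rough sketch of the layered approach discussed in this thread: a small built-in set of single-byte tables, with everything else handed to an optional external converter. It is written in Python only for brevity and is not a proposal for Perl's actual implementation. The names CORE_ENCODINGS, build_table, decode, and the `external` hook are all hypothetical; the hook stands in for whatever ICU or system iconv binding would be plugged in, and the tables are simply borrowed from Python's own codecs.

```python
from typing import Callable, Dict, List, Optional

# Assumed core set: one encoding each for Western Europe, Central Europe,
# and Cyrillic. The choice of these three is illustrative only.
CORE_ENCODINGS = ("iso-8859-1", "iso-8859-2", "koi8-r")


def build_table(encoding: str) -> List[str]:
    """Build a 256-entry byte -> character table.

    The data is borrowed from Python's codecs; a real engine would ship
    its own small tables instead.
    """
    return [bytes([b]).decode(encoding, errors="replace") for b in range(256)]


CORE_TABLES: Dict[str, List[str]] = {
    name: build_table(name) for name in CORE_ENCODINGS
}

# Hypothetical hook type: (data, encoding name) -> decoded text.
ExternalDecoder = Callable[[bytes, str], str]


def decode(data: bytes, encoding: str,
           external: Optional[ExternalDecoder] = None) -> str:
    """Decode with a built-in table if one exists, else defer to the hook."""
    table = CORE_TABLES.get(encoding.lower())
    if table is not None:
        # Same result on every system: the table travels with the program.
        return "".join(table[b] for b in data)
    if external is None:
        raise LookupError(
            f"no built-in table for {encoding!r} and no external converter"
        )
    # Result depends on whatever the host system or add-on library provides.
    return external(data, encoding)
```

The built-in path gives the "same on every system" behaviour, while the hook path gives the "same as other programs on this system" behaviour, so the open question is really which encodings belong on which side of that line.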
Regards,
Owen