On Mon, 10 Dec 2001 16:49:18 +0000, jhiver(_at_)mkdoc(_dot_)com (Jean-Michel Hiver) wrote:
> The way I got around this was to build a lossy table mapping
> ISO-8859-15 to US-ASCII, and then apply a few simple regexes so
> that a sentence like "Le rêve du café" gets turned into
> "le-reve-du-cafe".

Sounds a bit like Sean M. Burke's Text::Unidecode.
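
For readers who want to see the table-plus-regex idea concretely, here is a minimal sketch in Python. It is not the original poster's code and not Text::Unidecode's interface; the names LOSSY_TABLE and slugify are made up for illustration, and the table covers only a handful of ISO-8859-15 characters.

```python
import re

# Hypothetical, deliberately incomplete lossy table: a few ISO-8859-15
# characters mapped to lowercase ASCII approximations.
LOSSY_TABLE = {
    "à": "a", "â": "a", "ä": "a",
    "ç": "c",
    "é": "e", "è": "e", "ê": "e", "ë": "e",
    "î": "i", "ï": "i",
    "ô": "o", "ö": "o",
    "ù": "u", "û": "u", "ü": "u",
    "œ": "oe", "€": "eur",
}


def slugify(text, table=LOSSY_TABLE):
    """Turn a sentence into a lowercase, hyphen-separated ASCII slug."""
    lowered = text.lower()
    # Step 1: lossy table lookup, character by character.
    mapped = "".join(table.get(ch, ch) for ch in lowered)
    # Step 2: drop whatever is still outside US-ASCII (the lossy part).
    ascii_only = mapped.encode("ascii", "ignore").decode("ascii")
    # Step 3: simple regexes: collapse runs of non-alphanumerics into
    # one hyphen, then trim hyphens from both ends.
    return re.sub(r"[^a-z0-9]+", "-", ascii_only).strip("-")


print(slugify("Le rêve du café"))  # le-reve-du-cafe
```

Characters missing from the table are silently dropped in step 2, which is what makes the mapping lossy; a fuller table would reduce that.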
> Ideally I would like to write a CPAN Unicode::Transliterate module
> modular enough to dynamically import transliteration tables from
> any charset to any other charset, possibly depending on the
> language (for example, the Japanese word 'roku' might actually
> sound better if written 'lok' when read in French).

Text::Unidecode doesn't directly support that, but you could overwrite
some of its translation table entries if you wanted to.
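
As a rough illustration of what overwriting entries per language could look like, here is a small Python sketch. It does not show how Text::Unidecode stores or exposes its tables; BASE_TABLE, LANGUAGE_OVERRIDES, table_for and transliterate are hypothetical names, and the German override (ü written as "ue") is just a familiar example.

```python
# Hypothetical default table, used when no language is given.
BASE_TABLE = {"ä": "a", "ö": "o", "ü": "u", "ß": "ss", "é": "e"}

# Hypothetical per-language overrides layered on top of the defaults.
LANGUAGE_OVERRIDES = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue"},
}


def table_for(lang=None):
    """Return a copy of the base table with the language's overrides applied."""
    table = dict(BASE_TABLE)
    table.update(LANGUAGE_OVERRIDES.get(lang, {}))
    return table


def transliterate(text, lang=None):
    """Replace each character found in the table; leave the rest untouched."""
    table = table_for(lang)
    return "".join(table.get(ch, ch) for ch in text)


print(transliterate("Müller"))        # Muller
print(transliterate("Müller", "de"))  # Mueller
```

Copying the base table before updating it keeps the defaults intact, so one language's overrides never leak into another's.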
Cheers,
Philip