On Fri, Jan 17, 2003 at 01:25:59PM -0000, J Dinesh wrote:
> I am developing an xml2xml conversion tool.
> The XML document contains UTF-8, Symbol font and Dingbats font character
> values.
> I need to convert UTF-8, Symbol font and Dingbats font characters to entities.

I did something similar for converting UTF-8 into HTML entities. I
decided to do it the "proper perl 5.8" way: make it into a plugin for
the wonderful Encode module.
What I did was:
- Write a simple script which reads the HTML entity definitions and
writes a .ucm file. See "perldoc enc2xs" for the format.
- Use enc2xs to turn it into a module.
- Just use Encode in any way you want to do the encoding. My use case
was also the reason that the fallback mechanisms to &#NNN; (decimal) and
&#xHHHH; (hexadecimal) notation got added to Encode.
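
To illustrate the last point, here is a minimal sketch of the numeric fallbacks using only the stock Encode module, not the attached plugin. The target encoding (iso-8859-1) and the sample string are assumptions chosen for the demonstration; FB_HTMLCREF and FB_XMLCREF are the CHECK values Encode exports for decimal and hexadecimal character references.

```perl
use strict;
use warnings;
use Encode qw(encode FB_HTMLCREF FB_XMLCREF);

# Sample input (an assumption for this sketch): the euro sign U+20AC
# has no representation in ISO-8859-1, so the fallback kicks in.
my $text = "price: \x{20AC}5";

# Decimal character references for anything the encoding cannot hold.
my $decimal = encode("iso-8859-1", $text, FB_HTMLCREF);

# Hexadecimal character references for the same characters.
my $hex = encode("iso-8859-1", $text, FB_XMLCREF);

print "$decimal\n";   # price: &#8364;5
print "$hex\n";       # same text with a hexadecimal reference for U+20AC
```

Characters the target encoding can represent pass through untouched; only the unmappable ones are replaced by references.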
The whole thing is small enough that I'll just attach it. It contains
the HTML entity files and my script to parse them (ent2ucm). I did do
some manual editing of the result.
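
For readers without the attachment, the following is a rough sketch of what a script in the spirit of ent2ucm might look like. It is not the attached script. The helper names (parse_entities, ucm_line), the encoding name "html-entities-sketch", and the two sample declarations are hypothetical. The header keywords follow the UCM layout described in "perldoc enc2xs", but whether enc2xs accepts this exact output without further editing has not been verified, and a usable encoding would also need mappings for the plain ASCII characters.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: turn an entity name and code point into one UCM
# mapping line. The byte sequence is the ASCII spelling of "&name;".
sub ucm_line {
    my ($name, $codepoint) = @_;
    my $bytes = join '',
        map { sprintf('\\x%02X', ord($_)) } split //, "&$name;";
    return sprintf('<U%04X> %s |0 # &%s;', $codepoint, $bytes, $name);
}

# Hypothetical helper: pull (name, code point) pairs out of declarations
# of the form  <!ENTITY name CDATA "&#NNN;" ...>
sub parse_entities {
    my ($dtd_text) = @_;
    my @pairs;
    while ($dtd_text =~ /<!ENTITY\s+(\w+)\s+CDATA\s+"&#(\d+);"/g) {
        push @pairs, [$1, $2];
    }
    return @pairs;
}

# Two sample declarations in the style of the HTML entity files.
my $sample = <<'END';
<!ENTITY nbsp   CDATA "&#160;" -- no-break space -->
<!ENTITY eacute CDATA "&#233;" -- latin small letter e with acute -->
END

my @pairs   = parse_entities($sample);
my $longest = max(map { length($_->[0]) + 2 } @pairs);  # "&" + name + ";"

# Emit a UCM file: header, then one mapping line per entity.
print qq{<code_set_name> "html-entities-sketch"\n};
print "<mb_cur_min> 1\n";
print "<mb_cur_max> $longest\n";
print "<subchar> \\x3F\n";
print "CHARMAP\n";
print ucm_line(@$_), "\n" for @pairs;
print "END CHARMAP\n";
```

The output would then be passed to enc2xs as described in the steps above, probably after some manual editing.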
--
Bart.
Attachment: Encode-HTMLEntities-0.01.tar.gz