Re: Unicode aware module

The original case I wanted to apply this to was the
HTML::Entities::decode() function in one of my modules.  It "expands"
entities like "&aring;", "&#233;" and "&#x263A;" into the characters
they represent.

What I wanted this function to do was to expand numeric entities
greater than 255 only when the caller was in utf8 context.  Otherwise
it should just leave them alone.  For entities in the 128-255 range I
wanted them to expand to 2-byte utf8 chars when the caller was in utf8
context, but to plain latin1 chars when it was not.  Named entities
corresponding to chars outside the latin1 range should be treated like
the numeric entities above 255.
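
A rough sketch of the decoding rule described above.  It is written in
Python rather than Perl so that the caller's context can be shown as an
explicit flag; it is not the HTML::Entities implementation.  The names
decode_entities and utf8_context are hypothetical, and the entity table
is borrowed from Python's standard library as a stand-in.

```python
import re
from html.entities import name2codepoint

# Matches &#233;  &#x263A;  and named entities such as &aring;
_ENTITY = re.compile(
    rb"&(?:#(\d{1,7})|#[xX]([0-9a-fA-F]{1,6})|([A-Za-z][A-Za-z0-9]*));"
)


def decode_entities(data: bytes, utf8_context: bool) -> bytes:
    """Expand entities in `data` according to the caller's context.

    utf8_context is a hypothetical flag standing in for "the caller is
    in utf8 context".
    """

    def expand(match):
        dec, hexa, name = match.groups()
        if dec is not None:
            cp = int(dec)
        elif hexa is not None:
            cp = int(hexa, 16)
        else:
            cp = name2codepoint.get(name.decode("ascii"))

        # Unknown names and invalid code points are left untouched.
        if cp is None or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return match.group(0)

        if utf8_context:
            # Everything expands; 128-255 becomes a 2-byte sequence.
            return chr(cp).encode("utf-8")
        if cp <= 0xFF:
            # Outside utf8 context: a single plain latin1 byte.
            return bytes([cp])
        # Outside utf8 context and not representable in latin1:
        # leave the entity alone.
        return match.group(0)

    return _ENTITY.sub(expand, data)
```

With this sketch, "&#233;" becomes the two bytes C3 A9 when the flag is
set and the single byte E9 when it is not, while "&#x263A;" is expanded
only when the flag is set.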

The HTML::Entities::encode() function should also work on utf8 strings
when the caller is in utf8 context, but assume latin1 strings
otherwise.
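
The encoding direction can be sketched the same way, again in Python
with the context as an explicit flag.  The names encode_entities and
utf8_context are hypothetical, and the choice of which characters to
encode (the four markup characters plus everything non-ASCII) is an
assumption made for the example, not the module's actual rule.

```python
from html.entities import codepoint2name


def encode_entities(data: bytes, utf8_context: bool) -> bytes:
    """Replace unsafe characters in `data` with entities.

    The input bytes are read as UTF-8 when utf8_context is true and
    as latin1 otherwise.  Malformed UTF-8 raises UnicodeDecodeError.
    """
    text = data.decode("utf-8" if utf8_context else "latin-1")
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in '&<>"' or cp > 0x7F:
            name = codepoint2name.get(cp)
            # Prefer a named entity, fall back to a numeric one.
            out.append(f"&{name};" if name else f"&#{cp};")
        else:
            out.append(ch)
    # The result contains only ASCII characters.
    return "".join(out).encode("ascii")
```

The same two bytes C3 A5 come out as "&aring;" when the flag is set
and as "&Atilde;&yen;" when it is not, which is why the function has
to know which context its caller is in.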

Another thing I don't really get is how to write a Unicode-aware
module that still works with perl5.004.

Regards,
Gisle
