perl-unicode

Re: How to handle unicode strings in utf8 and pre-utf8 pragma perls

2003-05-31 03:30:05

If I understand Nicholas Clark's suggestion, it would mean that for any 
perl version prior to 5.8.0, the script won't compile unless "if.pm" 
has been installed from CPAN.

The fact that "if.pm" exists and is usable on older perl5 versions is
really good news, but it still might be a hurdle for some users who
depend on remote web-server sys-admins (or other uncontrollable forces)
for perl support...

In any case, one work-around for handling utf8 text in a version-neutral 
way would be to store the text in a file rather than hard-coding it in 
the perl script, and then decide how to read the file depending on the 
version; e.g.

 open( DAYS, "day_names.utf8" ) or die "can't open day_names.utf8: $!";
 binmode( DAYS, ":utf8" ) if ( $] >= 5.008 );  # the :utf8 layer needs 5.8.0 or later
 @day_names = <DAYS>;
 close DAYS;

I'm not sure whether 5.6 will treat the data as utf8 characters when it
is read from a file like this (5.6 does not support the ":utf8" layer, 
as in "binmode FH, ':utf8'"). That depends on what you do with the data 
elsewhere in your script, but there's a good chance that it will work.

You can also attach this text content at the end of your script, in a 
__DATA__ segment, and set DATA as the file handle in the code sample 
shown above (rather than DAYS).
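
Here is a rough sketch of how the file and __DATA__ variants could share
one helper. I have not tried it on every old perl. The sub name
read_utf8_lines is made up for illustration, and the string eval is there
on the assumption that perls older than 5.6 accept only a one-argument
binmode and would otherwise refuse to compile the two-argument call.

```perl
use strict;

# Hypothetical helper (not part of the original suggestion): read every line
# from an already-opened filehandle, turning on the :utf8 layer only on perls
# that have it.
sub read_utf8_lines {
    my ($fh) = @_;    # a glob reference such as \*DAYS or \*DATA

    if ( $] >= 5.008 ) {
        # String eval defers parsing of the two-argument binmode to run time,
        # so older perls never see it (assumption: they would reject it at
        # compile time).
        eval q{ binmode( $fh, ":utf8" ); 1 } or die "binmode failed: $@";
    }

    my @lines = <$fh>;
    chomp @lines;
    return @lines;
}

# File name taken from the example above; the -e guard just lets the
# sketch run when the file is absent.
my $file = "day_names.utf8";
if ( -e $file ) {
    open( DAYS, $file ) or die "can't open $file: $!";
    my @day_names = read_utf8_lines( \*DAYS );
    close DAYS;
    print scalar(@day_names), " day names read\n";
}
```

To read from a __DATA__ segment instead, pass \*DATA to the same sub and
skip the open/close.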

Of course even using __DATA__, it can get tedious and hard to maintain
if you have a lot of little string constants scattered throughout.

        Dave G.

(P.S.: for some reason, three of the characters in your first string
didn't map to proper Cyrillic code points for me: \u04e9 and the two
occurrences of \u04af -- I don't know the language, but were those 
typos?)