perl-i18n

Re: how to use int'l data in perl scripts?

2002-03-01 09:25:30
I don't know the details of what you're trying to do, so I'll just tell you what I have done to deal with multiple character sets. It may help you; it may not.

If all you want to do is recognize strings in other character sets and save them or pass them along unchanged, try replacing
use utf8;
with
use bytes;
Under "use bytes;" Perl treats every string as a plain sequence of bytes, so it works for UTF-8 strings as well as for strings in other character sets.
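To illustrate the byte-oriented view, here is a minimal sketch. The helper name byte_length and the sample strings are made up for this example. Note that the documentation for the bytes pragma in current Perl releases discourages using it for anything other than debugging, so treat this as an illustration of the idea rather than a recommendation.

```perl
use strict;
use warnings;

# Hypothetical helper: count the bytes in a string, whatever encoding
# those bytes happen to be in.
sub byte_length {
    my ($string) = @_;
    use bytes;                # lexically scoped: affects only this sub
    return length($string);   # byte semantics: counts bytes, not characters
}

# The word "cafe" with an e-acute, once as UTF-8 bytes (the accented
# letter takes two bytes) and once as ISO-8859-1 bytes (one byte).
my $utf8_bytes   = "caf\xc3\xa9";
my $latin1_bytes = "caf\xe9";

print byte_length($utf8_bytes), "\n";    # prints 5
print byte_length($latin1_bytes), "\n";  # prints 4
```

Either string can be stored or passed along untouched; Perl never needs to know which character set the bytes belong to.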

I think it is best to run regular expressions against UTF-8 strings, because then you can use general property classes such as \p{IsAlpha}. For those regular expressions I switch to "use utf8;" for that one statement and then switch back to "use bytes;".
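A rough sketch of matching with a property class follows. It does not toggle pragmas the way I described; instead it assumes the Encode module (bundled with Perl 5.8 and later) and decodes the UTF-8 bytes into characters before matching. The function name is_all_alpha is invented for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if every character in a UTF-8 encoded
# byte string is alphabetic in any script.
sub is_all_alpha {
    my ($utf8_bytes) = @_;
    my $chars = decode('UTF-8', $utf8_bytes);   # bytes -> Perl characters
    return $chars =~ /\A\p{IsAlpha}+\z/ ? 1 : 0;
}

print is_all_alpha("caf\xc3\xa9"), "\n";                 # Latin letters: prints 1
print is_all_alpha("\xd0\x9c\xd0\xb8\xd1\x80"), "\n";    # Cyrillic letters: prints 1
print is_all_alpha("caf\xc3\xa9 1"), "\n";               # space and digit: prints 0
```

The same pattern covers every script because \p{IsAlpha} is defined on Unicode characters, which is why the matching is done on decoded UTF-8 rather than on raw bytes.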

I use Text::Iconv to convert data between UTF-8 and other character sets. On the project I'm working on, we always do our processing in UTF-8 and convert to or from other character sets only when saving or returning data, and only when absolutely necessary (e.g. HTTP file downloads to OSes that only understand certain character sets for file names and file contents). We want our inner modules to be as generic as possible, and UTF-8 solves that problem better than anything else, since it can represent text in all languages. Some people might think this is too much work, but for our complex framework it's the only approach that works.
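The convert-at-the-edges idea could look roughly like this. Text::Iconv->new and its convert method are the module's documented interface; the wrapper names to_utf8 and from_utf8 are hypothetical, and the charset names you can use depend on the iconv library installed on your system.

```perl
use strict;
use warnings;
use Text::Iconv;

# Hypothetical wrapper: convert incoming bytes from a legacy charset to UTF-8.
sub to_utf8 {
    my ($bytes, $from_charset) = @_;
    my $converter = Text::Iconv->new($from_charset, 'UTF-8');
    my $utf8 = $converter->convert($bytes);   # undef if conversion fails
    die "cannot convert from $from_charset to UTF-8\n" unless defined $utf8;
    return $utf8;
}

# Hypothetical wrapper: convert UTF-8 back to a legacy charset on the way out.
sub from_utf8 {
    my ($utf8, $to_charset) = @_;
    my $converter = Text::Iconv->new('UTF-8', $to_charset);
    my $bytes = $converter->convert($utf8);   # undef if conversion fails
    die "cannot convert from UTF-8 to $to_charset\n" unless defined $bytes;
    return $bytes;
}

# Incoming ISO-8859-1 data is converted once, processed as UTF-8,
# and converted back only when the receiver requires it.
my $internal = to_utf8("caf\xe9", 'ISO-8859-1');
my $outgoing = from_utf8($internal, 'ISO-8859-1');
```

Keeping the conversions in two small functions means the inner modules only ever see UTF-8.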

For web forms: to always get UTF-8 from form posts, we serve our web pages in UTF-8 and put the following <meta> tag inside the <head> element:
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
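As a sketch of how a script might serve such a page, the following builds a form page that declares UTF-8 both in the HTTP header and in the <meta> tag. When both are present, browsers give the HTTP header precedence, so it is worth keeping the two in agreement. The function name utf8_form_page, the field name q, and the handler URL are all made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return a complete CGI response (header plus HTML)
# for a form page declared as UTF-8.
sub utf8_form_page {
    my ($action) = @_;   # URL of the form handler (an assumption)
    return join "",
        "Content-Type: text/html; charset=UTF-8\r\n\r\n",
        "<html><head>\n",
        qq{<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">\n},
        "</head><body>\n",
        qq{<form method="post" action="$action">\n},
        qq{<input type="text" name="q">\n<input type="submit">\n},
        "</form></body></html>\n";
}

print utf8_form_page('/cgi-bin/search.pl');
```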

Thanks,
Mary

Mags Doheny wrote:

Hi,

I need to get my Perl scripts to recognize strings encoded in other
charsets; the utf8 pragma does the trick for Unicode; does anyone know
of other pragmas available for, say, the ISO-8859-x charsets?

Thanks,
Mags/



