perl-unicode

Re: Extending the scope of a PERLIO Layer across packages

2004-07-20 12:30:11
Frank Krout <Frank(_dot_)Krout(_at_)eurorscg(_dot_)com> writes:
I'm trying to support a legacy multilingual website that has been upgraded
to Perl 5.8 and now uses PerlIO to encode HTML output properly. (STDOUT is
mapped via binmode.)


I have had this marked as needing a detailed/reasoned reply now for over a year,
so I guess you are not going to get one.

Here is a non-detailed one ;-)

Things are working as designed: the default IO layers are lexically scoped.
This wasn't easy to make happen, but it is essential.
Perl has a tendency to open files over time to AUTOLOAD things,
and even when AUTOLOAD as such isn't in use, parts of perl itself
are loaded on demand (e.g. the Unicode and encoding tables, and IO layers!).

To have "your" ShiftJIS encoding forced on someone's foo.al
or a SWASH table would likely break something.
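
To make the scoping concrete, here is a minimal sketch (not from the original
exchange). It assumes Encode's shiftjis encoding is installed and that no -C
switch or PERL_UNICODE setting adds default layers of its own; the scratch
file and variable names are purely illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write two Shift-JIS encoded characters (four bytes) to a scratch file.
my ($tmp, $path) = tempfile();
binmode $tmp;
print {$tmp} "\x93\xFA\x96\x7B";
close $tmp;

my ($decoded, $raw);
{
    # The pragma only affects open() calls compiled inside this block.
    use open IN => ':encoding(shiftjis)';
    open my $in, '<', $path or die "open: $!";
    $decoded = <$in>;
    close $in;
}

# Outside the block the default layer is no longer in effect,
# so the same file is read back as undecoded bytes.
open my $in, '<', $path or die "open: $!";
$raw = <$in>;
close $in;

printf "inside the block: %d characters, outside: %d bytes\n",
    length($decoded), length($raw);
```

Any file that perl or another module opens on your behalf is compiled outside
that block, which is why it is left alone.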




In an effort to only change a few globally included packages, rather than
all the scripts or all the content, I'd like to find a way to associate a
charset (other than utf8) dynamically with all file I/O when building web pages
from encoded templates.

The following works fine within the scope of the including package:

I suggest writing a perl script that adds what works to all the scripts.
You can of course make it a bit modular, e.g.:

use OurStdOpen qw($encode_layers);
use open IN => $encode_layers;

Or (avoiding the pesky open pragma):

use OurStdOpen qw($encode_layers);
open FILE,"<$encode_layers","cp_email_JPJA_sendemail.txt.sjis";

or 

use OurStdOpen qw(our_std_open);
our_std_open FILE,"cp_email_JPJA_sendemail.txt.sjis";
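
For illustration, one possible shape for such a module is sketched below. Only
the names OurStdOpen, $encode_layers and our_std_open come from the examples
above; the body is an assumption, including the choice of a (*$) prototype and
Symbol::qualify_to_ref to accept a bareword handle such as FILE.

```perl
# OurStdOpen.pm: hypothetical helper module. The body is an assumption,
# only the package and export names appear in the message above.
package OurStdOpen;

use strict;
use warnings;
use Exporter ();
use Symbol qw(qualify_to_ref);

our @ISA       = ('Exporter');
our @EXPORT_OK = qw($encode_layers our_std_open);

# The single place where the site-wide input encoding is chosen.
our $encode_layers = ':encoding(shiftjis)';

# Open $path for reading with the site-wide layers applied.
# The (*$) prototype lets callers pass a bareword handle; the handle
# name is then resolved in the caller's package, not in this one.
sub our_std_open (*$) {
    my ($handle, $path) = @_;
    my $fh = qualify_to_ref($handle, scalar caller);
    return open($fh, "<$encode_layers", $path);
}

1;
```

Each script still has to say "use OurStdOpen ..." itself, so the encoding stays
confined to the code that asked for it, but the layer string lives in one file.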



use open IN => ":encoding(shiftjis)";
open FILE,"cp_email_JPJA_sendemail.txt.sjis";

or something like

open (FILE,"<:utf8","cp_email_JPJA_sendemail.txt.utf8");

but I'd like to put something like use open IN => ":encoding(shiftjis)"; in a
module that all application scripts and CGIs would include.

I've tried perllocale without success to at least produce UTF8 mapped file
handles unilaterally:

use  POSIX  qw (locale_h);
use locale;
setlocale(LC_ALL,"en_US.UTF-8");
$ENV{LC_ALL} = $ENV{LANG} = 'UTF-8';

From perlunicode.html:

Usually locale settings and Unicode do not affect each other, but there are
a couple of exceptions:

* If your locale environment variables (LANGUAGE, LC_ALL, LC_CTYPE,
  LANG) contain the strings 'UTF-8' or 'UTF8' (case-insensitive matching), the
  default encodings of your STDIN, STDOUT, and STDERR, and of any subsequent
  file open, are considered to be UTF-8.
* Perl tries really hard to work both with Unicode and the old
  byte-oriented world. Most often this is nice, but sometimes Perl's
  straddling of the proverbial fence causes problems.

As I said I can't seem to get this tweaked right to work. At least one could
standardize on utf8 content and then output in whatever encoding was
required.

Any thoughts would be greatly appreciated.


Here are system particulars:

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
 Platform:
   osname=solaris, osvers=2.6, archname=sun4-solaris-thread-multi
   uname='sunos seal 5.6 generic_105181-29 sun4u sparc sunw,ultra-5_10 '
   config_args='-Dcc=gcc -Dinstallprefix=/usr/local -Dprefix=/usr/local
-Dusethreads -des'
   hint=recommended, useposix=true, d_sigaction=define
   usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
   useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
   use64bitint=undef use64bitall=undef uselongdouble=undef
   usemymalloc=n, bincompat5005=undef
 Compiler:
   cc='gcc', ccflags ='-D_REENTRANT -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
   optimize='-O',
   cppflags='-D_REENTRANT -I/usr/local/include'
   ccversion='', gccversion='2.8.1', gccosandvers='solaris2.7'
   intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
   d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
   ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
   alignbytes=8, prototype=define
 Linker and Libraries:
   ld='gcc', ldflags =' -L/usr/local/lib '
   libpth=/usr/local/lib /usr/lib /usr/ccs/lib
   libs=-lsocket -lnsl -ldl -lm -lposix4 -lpthread -lc
   perllibs=-lsocket -lnsl -ldl -lm -lposix4 -lpthread -lc
   libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
   gnulibc_version=''
 Dynamic Linking:
   dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
   cccdlflags='-fPIC', lddlflags='-G -L/usr/local/lib'


Characteristics of this binary (from libperl): 
 Compile-time options: MULTIPLICITY USE_ITHREADS USE_LARGE_FILES
PERL_IMPLICIT_CONTEXT
 Built under solaris
 Compiled at May 16 2003 11:49:23
 %ENV:
   PERLLIB="/www/a/lib"
 @INC:
   /www/a/lib
   /usr/local/lib/perl5/5.8.0/sun4-solaris-thread-multi
   /usr/local/lib/perl5/5.8.0
   /usr/local/lib/perl5/site_perl/5.8.0/sun4-solaris-thread-multi
   /usr/local/lib/perl5/site_perl/5.8.0
   /usr/local/lib/perl5/site_perl


Thanks,

Frank K.

  • Re: Extending the scope of a PERLIO Layer across packages, Nick Ing-Simmons <=