One Small Doubt
The only doubt I have about this problem being caused by the base Perl
and its configuration comes from having the MIME::Lite and MIME::Base64
modules available. I would expect both of these to have access to the encode
features, but neither is used in this code module. They are used in other
modules elsewhere in the CGI, but they have no connection to the troublesome module.
The Pound Sterling
The pound is definitely odd; from memory, the PC originally allowed the £ to
replace the #, so you could have one or the other. Then the 8-bit code pages
changed things so you could have both, and the pound moved into the range beyond
the ASCII-defined characters: 0x9C in codepage 850 and 0xA3 in ISO-8859-1.
A somewhat checkered history.
Now in my Red Hat environment "LANG=en_GB.UTF-8" is set, and I think this is
causing Perl to render the £ in the two-byte format 0xC2A3, whereas in the source
the one-byte 0xA3 is used and understood. So the input/source is not encoded
but the output is encoded, and I don't really understand why.
So far the £ seems to be the only character affected in this way.
However, now that I have the no encoding; pragma in force, everything is rendered as
one-byte characters.
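
To make the two byte sequences concrete, here is a minimal sketch using the core Encode module. It assumes the source byte is ISO-8859-1 (Latin-1), which matches the 0xA3 value mentioned above; the variable names are only illustrative, and this shows the conversion itself rather than whatever triggers it in the CGI.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The pound sign as it sits in a Latin-1 source file: one byte, 0xA3.
my $latin1_bytes = "\xA3";

# Decode the byte into a Perl character string holding U+00A3.
my $chars = decode("ISO-8859-1", $latin1_bytes);

# Encode that character as UTF-8, as a UTF-8 output layer would.
my $utf8_bytes = encode("UTF-8", $chars);

my $latin1_hex = uc(unpack("H*", $latin1_bytes));
my $utf8_hex   = uc(unpack("H*", $utf8_bytes));

print "Latin-1: $latin1_hex\n";   # Latin-1: A3
print "UTF-8:   $utf8_hex\n";     # UTF-8:   C2A3
```

The same character, U+00A3, is one byte in Latin-1 and two bytes in UTF-8, so a UTF-8 layer on the output handle would be enough to turn 0xA3 into 0xC2A3.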
I love Perl but I am not sure that this part is very transparent. I would have
expected the norm to be to follow the input/source and only translate on
instruction. Equally, as the use bytes; pragma is supposed to force characters to
be rendered as "almost binary", I expected it to stop the two-byte rendering.
I think this area of 5.8, whilst better than 5.6, may still need some
clarification before the average user can understand it easily.
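
As far as I can tell, the bytes pragma changes how operations such as length() see a string inside its lexical scope, while the bytes actually written are decided by the layer on the filehandle. The sketch below illustrates that split. It writes to in-memory filehandles so the result can be inspected as hex; written_bytes is a hypothetical helper invented for this example, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text through the given PerlIO layer into an
# in-memory file and return the bytes that were written, as upper-case hex.
sub written_bytes {
    my ($layer, $text) = @_;
    my $buffer = "";
    open(my $fh, ">$layer", \$buffer)
        or die "cannot open in-memory handle: $!";
    print {$fh} $text;
    close($fh);
    return uc(unpack("H*", $buffer));
}

my $pound = "\x{A3}";

# The same character leaves Perl as one byte or two, depending on the layer.
print "raw layer:   ", written_bytes(":raw", $pound), "\n";              # A3
print "UTF-8 layer: ", written_bytes(":encoding(UTF-8)", $pound), "\n";  # C2A3

# "use bytes" only changes what length() and friends count.
my $text       = "\x{A3}\x{2022}";   # pound plus bullet, held as UTF-8 internally
my $char_count = length($text);
my $byte_count = do { use bytes; length($text) };
print "characters: $char_count, internal bytes: $byte_count\n";          # 2 and 5
```

If this reading is right, an explicit binmode on STDOUT is the more direct way to pin down the output encoding than either pragma.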
Frank
John Delacour <JD(_at_)BD8(_dot_)COM> 10/10/03 00:25:07 >>>
At 4:05 pm +0100 9/10/03, Frank Smith wrote:
I have now forced Perl to produce unencoded output by the use of:
no coding;
which has worked wonders.
no encoding, I presume you mean. That makes no difference here.
On the other hand if I run this
use encoding "utf8", STDOUT => "MacRoman" ;
print "\x{2022}" ;
I get the one-byte Mac bullet instead of the
three-byte utf8 character I would get with just
print "\x{2022}" ;
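
The one-byte versus three-byte difference can also be checked without the encoding pragma, which has since been deprecated. This is a small sketch with the core Encode module, assuming its "MacRoman" encoding name; the variable names are illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $bullet = "\x{2022}";   # BULLET

# Encode the same character for a MacRoman target and for a UTF-8 target.
my $mac_bytes  = encode("MacRoman", $bullet);
my $utf8_bytes = encode("UTF-8", $bullet);

my $mac_hex  = uc(unpack("H*", $mac_bytes));
my $utf8_hex = uc(unpack("H*", $utf8_bytes));

print "MacRoman: $mac_hex\n";    # MacRoman: A5
print "UTF-8:    $utf8_hex\n";   # UTF-8:    E280A2
```
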
There seems to be something odd about the "£".
Perl on my machine prints it in one byte whatever
I do. Maybe something to do with locale settings.
JD