perl-unicode

Re: CGI and UTF

2003-01-05 09:30:04
On Sun, 5 Jan 2003, Jarkko Hietaniemi wrote:
> I repeat: all your filehandles are still 'binary' unless you either
> explicitly (binmode)

Fine.

> or implicitly (locale) command them not to be.

Not fine without a warning. This is 'action at a distance', the same
reason un'local'ized usage of the 'special' variables is nearly always a
Bad Idea (tm). It causes breakage whose cause can be hard to find. Perl
needs a mandatory warning if the locale changes my filehandles to text
mode and I haven't made some kind of _explicit_ declaration that I want
that behavior to happen.

The change is of a bad 'type': an incompatible change in Perl semantics
without so much as a warning being issued by either the compiler or the
runtime - except to make the code fall over dead many lines away from the
actual breakage. If the string is invalid UTF8, why didn't Perl complain
_when I read it_ instead of dozens of lines away when I tried to use that
string for something else? That is _broken_.
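
To illustrate what "complain when I read it" could look like, here is a minimal sketch, assuming Perl 5.8 with the core Encode module. The helper name decode_utf8_strict is hypothetical and not part of this thread's code. It decodes raw bytes explicitly and dies on the spot if they are not well-formed UTF-8, instead of failing later.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode raw bytes as UTF-8 and die immediately
# if they are malformed, rather than failing many lines later.
sub decode_utf8_strict {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on invalid input;
    # LEAVE_SRC keeps decode() from modifying $bytes.
    return Encode::decode('UTF-8', $bytes,
                          Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# Well-formed UTF-8: "caf" followed by the two-byte encoding of e-acute.
my $ok = decode_utf8_strict("caf\xC3\xA9");

# The first bytes of a JPEG file are not valid UTF-8, so this dies
# inside the eval, right at the point of decoding.
my $failed = !eval { decode_utf8_strict("\xFF\xD8\xFF\xE0"); 1 };
```

The point of the sketch is only that the failure is raised where the bytes are decoded, which is the behavior I would expect from the read itself.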

> If you try to push Unicode (data marked as UTF-8, such as characters
> beyond 255) on such a filehandle, you'll get a 'Wide character' warning.

But it _reads_ binary data through a UTF8 layer silently. No warnings. Try
the code I posted on an actual jpg file with a UTF-8 locale set in the
environment. The first complaint is when the code falls over dead in the
'jpegsize' sub - many lines of code away from the <fh> read.
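
For comparison, the explicit route looks roughly like this. It is a sketch only, assuming Perl 5.8 with PerlIO layers. The helper name read_raw and the sample bytes are made up for illustration and are not the code I posted earlier. Calling binmode() on the handle asks for raw bytes no matter what the locale would otherwise imply.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: slurp a file as raw bytes, whatever the locale
# or default layers would otherwise do to the handle.
sub read_raw {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);    # explicit: no :utf8 or CRLF translation on this handle
    local $/;        # slurp mode: read the whole file in one go
    my $data = <$fh>;
    close($fh);
    return $data;
}

# Demo: write a few JPEG-like bytes to a temporary file and read them back.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "\xFF\xD8\xFF\xE0\x00\x10JFIF";
close($tmp_fh);

my $bytes = read_raw($tmp_path);
```

With the binmode() in place the bytes come back unchanged. My complaint is that leaving it out gives no warning at the read.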

-- 
Jerry

"If the code and the comments disagree, then both are probably wrong."
                                        -- Norm Schryer, Bell Labs 

