perl-unicode

Re: BOM and principle of least surprise

2004-05-16 16:30:05
Jarkko Hietaniemi (jhi(_at_)iki(_dot_)fi) writes:
To be able to do that, it would have to understand byte-order marks
(which it doesn't). I think there was a suggestion that you could
specify an 

In 5.8.5 it will.


Will such an option include the possibility to say that I want Perl to
determine the encoding from the byte-order mark?

No.  The patch I submitted peeks at the beginning of a Perl script and
if it either sees a BOM or something that looks like raw BOMless UTF-16
(every other byte zero, every other not) of either endianness, Perl will
understand.
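
[Illustration only, not the actual patch: a minimal sketch of the kind of
check described above, written in Python for brevity. The function name
sniff_script_encoding and the choice to inspect only the first four bytes
are assumptions made for this example; the real logic inside perl may
differ.]

```python
import codecs
from typing import Optional


def sniff_script_encoding(head: bytes) -> Optional[str]:
    """Guess an encoding from the first bytes of a script, or return None.

    Hypothetical helper: it mirrors the described heuristic, not perl's code.
    """
    # Case 1: an explicit byte-order mark at the very start.
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"

    # Case 2: BOM-less UTF-16, recognised by "every other byte zero,
    # every other not". Assumption: checking four bytes is enough.
    if len(head) >= 4:
        a, b, c, d = head[:4]
        if a == 0 and b != 0 and c == 0 and d != 0:
            return "utf-16-be"  # zero byte first: big-endian
        if a != 0 and b == 0 and c != 0 and d == 0:
            return "utf-16-le"  # zero byte second: little-endian

    # Nothing recognised: treat the script as plain bytes.
    return None
```

Note that the BOM-less case can only work when the script starts with
characters below U+0100 (plain ASCII, for instance), since only those
produce a zero byte in every pair.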

I think I understood that the change was only for the script as such. Let's
forget input files for the moment.

So Perl 5.8.5 will be able to read a UTF-16 file?

And if it sees a UTF-8 BOM, that will imply a "use utf8"?
 
Will this require that I specify an option to Perl, or will this be
the default behaviour?

-- 
Erland Sommarskog, Stockholm, sommar(_at_)algonet(_dot_)se