perl-unicode

Re: BOM and principle of least surprise

2004-05-11 09:30:05

Using something like the utf8 pragma to determine the encoding of character
literals is not a good idea. As soon as someone saves the file in a different
encoding, the literals are misinterpreted. And as long as Perl does not act
on byte-order marks, how would it be able to read a script that has
been saved in UTF-16LE, which is the normal way of saving Unicode data
on Windows?
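
For illustration, acting on a byte-order mark could look roughly like the
sketch below. It is only a sketch in plain Perl, not a description of what
perl itself does when it loads a script. The name sniff_bom is hypothetical,
and the function only inspects the leading bytes of a buffer.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl or any module): guess an encoding
# name from the leading bytes of a byte string. Returns nothing if no BOM
# is found.
sub sniff_bom {
    my ($bytes) = @_;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    # The UTF-32LE BOM starts with the same two bytes as the UTF-16LE BOM,
    # so it has to be tested first. A UTF-16LE file whose first character
    # after the BOM is U+0000 would be misreported here.
    return 'UTF-32LE' if $bytes =~ /\A\xFF\xFE\x00\x00/;
    return 'UTF-32BE' if $bytes =~ /\A\x00\x00\xFE\xFF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return;
}

# Example: the first bytes of a script saved as UTF-16LE with a BOM.
my $guess = sniff_bom("\xFF\xFEp\x00r\x00i\x00n\x00t\x00");
print defined $guess ? "$guess\n" : "no BOM\n";
```

A file without a BOM yields no answer at all here, so a check like this
can only ever be one part of deciding how to read a script.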


I haven't tried this myself...

I have just tried it, and it seems that it's not as trivial as I thought...

We do have a test for UTF-16 detection in scripts, but the test seems to
be rather limited -- my simple first test of writing out a script in
UTF-16LE and then trying to run it with Perl didn't work :-(
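
A test along those lines can be sketched as follows. This is a
reconstruction under assumptions, not the exact script used above: it writes
a one-line script as UTF-16LE with a leading BOM to a temporary file and runs
it with the same perl binary. What it prints depends on the perl version, so
no expected result is given.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# The script to be saved: a single print statement.
my $source = qq{print "ok\\n";\n};

# Write it out as UTF-16LE. The UTF-16LE encoding does not emit a BOM on
# its own, so U+FEFF is written explicitly as the first character.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
binmode($fh, ':raw:encoding(UTF-16LE)');
print {$fh} "\x{FEFF}", $source;
close($fh) or die "close failed: $!";

# Run the file with the perl that is executing this sketch and report
# whatever comes back, including anything written to STDERR.
my $output = `"$^X" "$path" 2>&1`;
my $status = $? >> 8;
print "exit status: $status\n";
print "output: $output";
```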


I thought the issue was about Perl not automatically guessing the
UTF-16 encoding of input data.


That is a related but separate issue.