Re: BOM and principle of least surprise

perl-unicode

2004-05-11 09:30:05

from [Jarkko Hietaniemi]

> Using a thing like utf8 to determine the encoding of character literals
> is not a good idea. Suddenly someone saves the file in a different
> encoding, and guess what happens. And as long as Perl does not act
> on byte-order marks, how would it be able to read a script that has
> been saved in UTF-16LE, which is the normal way of saving Unicode data
> on Windows?
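
To make the "guess what happens" concrete, here is a minimal sketch of how one character literal changes on disk when the file is saved in a different encoding. It is written in Python purely for illustration and says nothing about what Perl itself does; the variable names are made up for this example.

```python
# Illustration only: one character literal, saved in two different encodings.
literal = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

saved_as_latin1 = literal.encode("latin-1")  # one byte on disk
saved_as_utf8 = literal.encode("utf-8")      # two bytes on disk

# A reader that assumes UTF-8 cannot decode the Latin-1 byte at all.
try:
    saved_as_latin1.decode("utf-8")
    latin1_read_as_utf8 = "decoded"
except UnicodeDecodeError:
    latin1_read_as_utf8 = "error"

# A reader that assumes Latin-1 silently sees two characters instead of one.
utf8_read_as_latin1 = saved_as_utf8.decode("latin-1")
```

The second case is the nastier one: nothing fails, the literal just quietly becomes a different string.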



> I haven't tried this myself...


I just now tried and it seems that it's not as trivial as I thought...

We do have a test for UTF-16 detection in scripts, but the test seems to
be rather limited -- my simple first test of writing out a script in
UTF-16LE and then trying to run it with Perl didn't work :-(
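
For reference, acting on a byte-order mark amounts to something like the following. This is a rough sketch in Python, not a description of how Perl's script reader works; `sniff_bom`, `decode_script`, and the Latin-1 fallback are assumptions made for the example.

```python
import codecs

# UTF-32 is checked first because the UTF-32LE BOM (FF FE 00 00) begins
# with the UTF-16LE BOM (FF FE). A UTF-16LE file whose first character is
# U+0000 would be misdetected; this sketch accepts that ambiguity.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes, default: str = "latin-1"):
    """Return (encoding, bom_length) guessed from a leading BOM.

    The fallback encoding is an assumption of this sketch.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return default, 0


def decode_script(data: bytes) -> str:
    """Strip the BOM, if any, and decode the rest accordingly."""
    encoding, skip = sniff_bom(data)
    return data[skip:].decode(encoding)


# A tiny script as an editor might save it in UTF-16LE with a BOM.
script = codecs.BOM_UTF16_LE + 'print "hello\\n";'.encode("utf-16-le")
```

With this, `sniff_bom(script)` reports `utf-16-le` with a two-byte BOM, and `decode_script(script)` gives back the source text. Input without a BOM falls through to the default, which is exactly the case where guessing starts.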

> I thought the issue was about Perl not automatically guessing the
> UTF-16 encoding of input data.



That is a related but separate issue.
