Erland Sommarskog wrote on 17.01.2011 at 13:57 (-0000):
I'm on Windows and I have this small script:
use strict;
open F, '>:encoding(UTF-16LE)', "slask2.txt";
print F "1\n2\n3\n";
close F;
When I open the output in a hex editor I see
31 00 0D 0A 00 32 00 0D 0A 00 33 00 0D 0A 00
In other words (od -c):
1 \0 \r \n \0 2 \0 \r \n \0 3 \0 \r \n \0
I would expect to see:
31 00 0D 00 0A 00 32 00 0D 00 0A 00 33 0D 00 0A 00
Guess you would even expect:
… 33 00 0D 00 0A 00
That is, I expect \n to be translated to 0D 00 0A 00, now it is
translated to three bytes.
It looks like a bug to me. I'm getting the same result as you for:
* ActivePerl 5.10.1
* ActivePerl 5.12.1
* Strawberry 5.12.0
All three distributions show correspondingly wrong results for UTF-16BE.
And also for UTF-16, which just adds the BOM.
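The three-byte pattern is consistent with the newline translation being
applied to the already encoded bytes instead of to the characters. I have
not verified that this is what happens inside Perl, so treat the following
as a sketch of that assumption only. It works on strings in memory with
Encode, so it does not depend on the platform; hex_dump is a helper made up
for this illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex pairs.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

my $text = "1\n2\n3\n";

# Expected: translate "\n" to "\r\n" on characters, then encode.
(my $crlf_text = $text) =~ s/\n/\r\n/g;
my $expected = encode('UTF-16LE', $crlf_text);

# Assumed faulty order: encode first, then expand the single 0x0A byte.
my $observed = encode('UTF-16LE', $text);
$observed =~ s/\x0A/\x0D\x0A/g;

print "expected: ", hex_dump($expected), "\n";
print "observed: ", hex_dump($observed), "\n";
```

If the assumption holds, the "observed" line reproduces the od output
quoted above, and the "expected" line is what the file should contain.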
Perl/Cygwin 5.10.1 does fine because its OS is "cygwin", so it doesn't
translate "\n" to CRLF.
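If the default :crlf layer sitting underneath :encoding is indeed the
cause, a workaround that is often suggested is to spell out the layer order
yourself. A minimal sketch, which I have not run on the three Windows
builds listed above:

```perl
use strict;
use warnings;

my $file = 'slask2.txt';

# :raw removes the platform's default layers (including :crlf on Windows).
# :encoding(UTF-16LE) then encodes characters to bytes, and the trailing
# :crlf sits on top of it, so "\n" becomes "\r\n" before encoding.
open my $fh, '>:raw:encoding(UTF-16LE):crlf', $file
    or die "Cannot open $file: $!";
print {$fh} "1\n2\n3\n";
close $fh or die "Cannot close $file: $!";
```

With this layer order every "\n" should end up as 0D 00 0A 00 in the file.
Note that the explicit :crlf also applies on systems that do not translate
newlines by default.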
--
Michael Ludwig