Hi,
I'm new to this list, and I've tried searching the archives but I
couldn't find anything like this. I'm using Perl v5.8.7, and I'm
currently tearing my hair out trying to get the :encoding and :crlf
layers to play nicely with each other.
My problem is that I'm developing a system which, as part of its job,
needs to be able to read and write files in most encodings. I'm using
:encoding for this - so far, so good.
For readability and compatibility reasons, these files should have CR/LF
line endings, although this problem is equally applicable with or
without them. So, I figure the :crlf layer works for this.
Unfortunately, getting :crlf and :encoding to do the Right Thing with
each other is like pulling hens' teeth. Here's an example of what I was
doing at first:
open(FILE, ">:crlf:encoding(UTF-8)", "some-file.txt");
All seemed to work fine until I tested outputting as UTF-16 instead of
UTF-8 - at which point I discovered that the encoding layer wasn't
encoding the inserted CRs, which screws up the UTF-16 file.
D'oh! Okay, so swap the layers:
open(FILE, ">:encoding(UTF-16):crlf", "some-file.txt");
Seems like everything should work there, but now I get problems trying
to print some characters. For example, trying to print a \x{A3} (a
British pound sign) results in:
"Malformed UTF-8 character (unexpected continuation byte 0xa3, with no
preceding start byte) in null operation at ./utf16-test.pl line 6."
...and the output file contains a null character where the sign should
be. Strangely, using a literal £ UTF-8 sequence (i.e. C2 A3) in the Perl
file works fine. Here's the file that generates the above error:
---
#!/usr/bin/perl
open(FILE, ">:encoding(UTF-16):crlf", "test");
print FILE "Test \x{A3}45!\n";
print FILE "Test!\n";
close(FILE);
---
Yes, line 6 is the close() line. Removing :crlf from the layers fixes
the problem, so I'm wondering if this is a bug in the implementation of
:crlf. I'd really like to have some sort of transparent CR/LF conversion
though, as it makes things a lot easier.
Is this a known problem?
- Ciaran.