perl-unicode

use open ":locale" causes "-|" to ignore explicit encodings

2016-05-26 01:45:07
Hi! I had some encoding (actually decoding) issues when trying to read
from a pipe, and I would like to check with some more experienced
users if I'm doing something wrong, if it's intended behavior, or if
it's a problem in perl.

The problem is that if I "use open ':locale'", the explicit encoding
in the following code is ignored:
   open FH, '-| :encoding(ISO-8859-1)', 'cat', 'testfile';


Simplified but working test code to reproduce the issue:
------------------------------------------
1  use open ':std', ':locale';
2  open FH1, '<  :encoding(ISO-8859-1)', 'testfile';
3  open FH2, '-| :encoding(ISO-8859-1)', 'cat', 'testfile';
4  <FH1>;
5  <FH2>;
------------------------------------------

Running this code when 'testfile' contains ISO-8859-1 characters
causes line 5 to emit warnings about not being able to map to Unicode.
The "<" version of the open is to demonstrate that the encoding works
there. It correctly opens the file in ISO-8859-1 mode.
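
One way to see which layers actually end up on each handle is the
built-in PerlIO::get_layers. The following is only a diagnostic
sketch, not part of the test code above: it writes its own one-line
'testfile' (the sample content is an assumption) and assumes a 'cat'
command is available. No output is listed because it depends on the
locale and the perl version.

```perl
use strict;
use warnings;
use open ':std', ':locale';

my $file = 'testfile';    # same file name as in the test code above

# Write one ISO-8859-1 line ("caf" followed by byte 0xE9) as raw bytes,
# so that the sketch does not depend on an existing file.
open(my $out, '>:raw', $file) or die "Cannot write $file: $!";
print {$out} "caf\xE9\n";
close($out) or die "Cannot close $file: $!";

# The same two opens as in the test code, with lexical filehandles.
open(my $fh1, '<  :encoding(ISO-8859-1)', $file)
    or die "Cannot open $file: $!";
open(my $fh2, '-| :encoding(ISO-8859-1)', 'cat', $file)
    or die "Cannot run cat: $!";

# List the layer stack of each handle for comparison.
my @file_layers = PerlIO::get_layers($fh1);
my @pipe_layers = PerlIO::get_layers($fh2);
print "file: @file_layers\n";
print "pipe: @pipe_layers\n";
```

If the two lists differ, that shows where the explicit encoding on the
pipe handle was lost or replaced.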


Without the "use open" line, the following code works:
------------------------------------------
open FH2, '-| :encoding(ISO-8859-1)', 'cat', 'testfile';
<FH2>;
------------------------------------------

To show that a pipe-open normally honors the encoding layer, the
following code (again without "use open") breaks, because 'testfile'
is not valid UTF-8:
------------------------------------------
open FH2, '-| :encoding(UTF-8)', 'cat', 'testfile';
<FH2>;
------------------------------------------

So, is there a way around this? I really want to "use open ':locale'"
to make sure everything is translated correctly on STDIN/STDOUT, but I
also want to read from a pipe with a specific encoding. Using
'binmode' doesn't work either; the setting is ignored. I have a
feeling I can solve it by wrapping the open/read in a block and
changing the ":std" setting temporarily, but that feels like a kludge.
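
A minimal sketch of that scoping idea, relying on the default layers
set by "use open" being lexically scoped: a helper (the name
read_latin1_pipe is hypothetical) is defined outside the scope of the
pragma, so its pipe-open is compiled like the working example above.
The ':std' part is applied to STDIN/STDOUT/STDERR when the pragma is
loaded and is not undone at the end of the block. The sample file
content is an assumption, and a 'cat' command is assumed to exist.

```perl
use strict;
use warnings;

# Hypothetical helper. It is compiled outside the scope of any
# "use open", so no default layers are added to the open below.
sub read_latin1_pipe {
    my @cmd = @_;
    open(my $fh, '-| :encoding(ISO-8859-1)', @cmd)
        or die "Cannot run @cmd: $!";
    my @lines = <$fh>;
    close($fh) or die "Command failed: @cmd";
    return @lines;
}

# Sample ISO-8859-1 input, written as raw bytes ("caf" plus byte 0xE9).
open(my $out, '>:raw', 'testfile') or die "Cannot write testfile: $!";
print {$out} "caf\xE9\n";
close($out) or die "Cannot close testfile: $!";

{
    # Default layers from "use open" apply only inside this block;
    # the standard handles are switched to the locale encoding.
    use open ':std', ':locale';
    print for read_latin1_pipe('cat', 'testfile');
}
```

This still feels like a workaround rather than a fix, since it only
avoids having the pragma in effect where the pipe is opened.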

I would appreciate any comments on this!

Kind regards,
Anders Andersson
