I have found out how to create a UTF8 string: insert a character with a code
above 255 (a BOM should do it) and then strip it off later. Hacky, but it works.
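For concreteness, here is a minimal sketch of that hack. It assumes Perl 5.8 or later, where utf8::is_utf8 is available to inspect the flag (5.6.0 does not have it), and force_utf8 is a hypothetical name used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of the input with the internal UTF8 flag on.
sub force_utf8 {
    my ($s) = @_;
    $s = "\x{FEFF}" . $s;   # prepending a code point above 255 forces the upgrade
    substr($s, 0, 1, '');   # strip the marker again; the flag stays set
    return $s;
}

my $str = force_utf8("caf\xe9");
# Assumption: utf8::is_utf8 exists (Perl 5.8.1 and later).
printf "%d chars, UTF8 flag %s\n", length($str), utf8::is_utf8($str) ? "on" : "off";
```

The character count is unchanged by the round trip; only the internal representation differs.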
But how do I change the way a string is interpreted?
use utf8;
# other code
sub pretty
{
    my ($str) = @_;
    # $str =~ tr///CC;  # This crashes Perl 5.6.0 (ActivePerl)
    # use bytes;        # This does nothing
    $str =~ s/([\xc0-\xff][\x80-\xbf]+)/'\x{'.sprintf("%04x", unpack("U", $1)).'}'/oge;
    $str;
}
Inside pretty(), $str is still interpreted as UTF8 (SvUTF8 is set), so the
pattern matches characters rather than the underlying bytes.
Any suggestions?
And a follow-up question:
How do I make a UTF8 string containing codes 127<x<256 without having to insert
a BOM in the front and then strip it off?
Martin Hosken
PS. Apologies for the vague previous question.