perl-unicode

Bug in Encode::encode('MIME-Q', $iso_8859_1_string)

2005-11-23 13:57:34
Hi,

this seems to be a bug:
a)
perl -MHTML::Entities -MEncode -e '$a="abc&Auml;&ndash;def";
print encode("MIME-Q", HTML::Entities::decode($a)), "\n";'

Result:
=?UTF-8?Q?abc=C3=84=E2=80=93def?=

b)
perl -MHTML::Entities -MEncode -e '$a="abc&Auml;def";
print encode("MIME-Q", HTML::Entities::decode($a)), "\n";'

Result:
=?UTF-8?Q?abc=C4def?=

In a) the string contains "&ndash;" to force UTF-8 (the result from
HTML::Entities::decode will not fit into ISO-8859-1).

In b) the result of HTML::Entities::decode is ISO-8859-1, not UTF-8.
The result of b) is wrong: Encode::encode() doesn't seem to take the
charset of the string into account in this case. I think the correct result is
=?UTF-8?Q?abc=C3=84def?=
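
To double-check the expected value, here is a minimal sketch that builds the Q-encoded word by hand instead of going through the MIME-Q encoder. The helper q_encode_utf8 is hypothetical (it is not part of Encode), and it deliberately escapes every byte that is not an ASCII letter or digit, which is stricter than necessary but enough for these two test strings.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper, not part of Encode: Q-encode a character string by hand.
sub q_encode_utf8 {
    my ($text) = @_;
    # Characters -> UTF-8 octets, regardless of the internal UTF-8 flag
    my $octets = encode('UTF-8', $text);
    # Conservative: escape every octet that is not an ASCII letter or digit
    $octets =~ s/([^A-Za-z0-9])/sprintf('=%02X', ord $1)/ge;
    return "=?UTF-8?Q?$octets?=";
}

my $latin1 = "abc\x{C4}def";            # case b): fits into ISO-8859-1
my $wide   = "abc\x{C4}\x{2013}def";    # case a): en dash forces UTF-8

for my $str ($latin1, $wide) {
    printf "utf8 flag %-3s  %s\n",
        (utf8::is_utf8($str) ? 'on' : 'off'), q_encode_utf8($str);
}
```

Both strings come out with =C3=84 for the A-umlaut here, whether or not the internal UTF-8 flag is set, which is what I would expect from encode("MIME-Q", ...) as well.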

FYI, when using MIME-B encoding, the results are
=?UTF-8?B?YWJjw4TigJNkZWY=?= (with decoded '&ndash;') and
=?UTF-8?B?YWJjw4RkZWY=?=     (without).

I believe both are correct.
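
A quick way to confirm the MIME-B payloads is to decode the Base64 part and look at the resulting code points. This is only a sketch; payload_code_points is a hypothetical helper name, and the two payloads are copied from the results above.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Hypothetical helper: Base64 payload -> list of code points as text.
sub payload_code_points {
    my ($payload) = @_;
    my $octets = decode_base64($payload);     # Base64 -> raw UTF-8 octets
    my $text   = decode('UTF-8', $octets);    # octets -> character string
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $text;
}

for my $payload ('YWJjw4TigJNkZWY=', 'YWJjw4RkZWY=') {
    print "$payload => ", payload_code_points($payload), "\n";
}
```

The first payload should contain U+00C4 followed by U+2013, the second only U+00C4, so in both cases the A-umlaut was encoded as two UTF-8 bytes.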

Cheers,
-Sven
PS: I'm using perl 5.8.7
