URL:
<http://savannah.nongnu.org/bugs/?26577>
Summary: Changed semantic for unpack breaks UTF-8
Project: MHonArc
Submitted by: formorer
Submitted on: Thu 14 May 2009 12:38:50 GMT
Category: Mail Parsing
Severity: 3 - Normal
Item Group: Incorrect Behavior
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Operating System: Linux
Perl Version: 5.10
Component Version: 2.6.16
Fixed Release:
_______________________________________________________
Details:
Hi,
With Perl 5.10, the semantics of unpack's U0 template modifier changed from
character-based to byte-based processing [1]. As a result, _utf8_to_sgml in
CharEnt.pm fails for multibyte characters. The fix is pretty simple:
-$char = unpack('U0U*',$1);
+$char = unpack('C0U*',$1);
Of course, this took me some time to find out ;). You should wrap it in
a Perl version check, but I guess that should be no problem.
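A minimal sketch of what such a version check could look like, assuming the
template is chosen once from the running interpreter's version ($]). The
helper name utf8_bytes_to_codepoint and the variable $UNPACK_TEMPLATE are
hypothetical and not part of MHonArc; the real change would go inside
_utf8_to_sgml in CharEnt.pm:

```perl
use strict;
use warnings;

# Assumption: perls before 5.10 need the old 'U0U*' template, while
# 5.10 and later need 'C0U*' to decode raw UTF-8 bytes.
# $] is the interpreter version as a number (5.010000 for 5.10.0).
my $UNPACK_TEMPLATE = ($] >= 5.010) ? 'C0U*' : 'U0U*';

# Hypothetical helper: take the bytes of one UTF-8 encoded character
# and return its Unicode code point.
sub utf8_bytes_to_codepoint {
    my ($bytes) = @_;
    # List assignment keeps only the first unpacked value.
    my ($char) = unpack($UNPACK_TEMPLATE, $bytes);
    return $char;
}

# "\xC3\xA9" is the UTF-8 encoding of U+00E9 (e with acute accent).
printf "U+%04X\n", utf8_bytes_to_codepoint("\xC3\xA9");
```

Picking the template once at load time keeps the check out of the
substitution that runs for every matched character.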
Thanks
Alex - Debian listmaster
[1]
http://search.cpan.org/dist/perl-5.10.0/pod/perl5100delta.pod#Packing_and_UTF-8_strings
_______________________________________________________
Reply to this item at:
<http://savannah.nongnu.org/bugs/?26577>
_______________________________________________
Message sent via/by Savannah
http://savannah.nongnu.org/