mhonarc-dev

[bug #26577] Changed semantic for unpack breaks UTF-8

2009-05-14 08:39:08

URL:
  <http://savannah.nongnu.org/bugs/?26577>

                 Summary: Changed semantic for unpack breaks UTF-8
                 Project: MHonArc
            Submitted by: formorer
            Submitted on: Thu 14 May 2009 12:38:50 GMT
                Category: Mail Parsing
                Severity: 3 - Normal
              Item Group: Incorrect Behavior
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
        Operating System: Linux
            Perl Version: 5.10 
       Component Version: 2.6.16
           Fixed Release: 

    _______________________________________________________

Details:

Hi, 

With Perl 5.10, the semantics of unpack's U0 template changed from
character-based to byte-based processing [1]. As a result, _utf8_to_sgml in
CharEnt.pm fails for multibyte characters. The fix is pretty simple:

-$char = unpack('U0U*',$1);
+$char = unpack('C0U*',$1);

Of course, this took me some time to find out ;). You should wrap it in
a Perl version check, but I guess that should be no problem.
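
A minimal sketch of such a version check follows. It is not taken from
CharEnt.pm: the helper name utf8_bytes_to_codepoints is made up for
illustration, and it assumes the input is an undecoded string of UTF-8
bytes, as in the regex capture above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MHonArc: returns the Unicode code
# points encoded in a string of raw UTF-8 bytes.
sub utf8_bytes_to_codepoints {
    my ($bytes) = @_;

    # $] holds the running Perl version as a number (e.g. 5.010000).
    # Perl 5.10 and later need 'C0' before 'U*'; older releases
    # (5.8.x) need 'U0'.
    my $template = ($] >= 5.010) ? 'C0U*' : 'U0U*';

    return unpack($template, $bytes);
}

# The three bytes E2 82 AC are the UTF-8 encoding of the euro sign.
my @codepoints = utf8_bytes_to_codepoints("\xE2\x82\xAC");
printf "U+%04X\n", $_ for @codepoints;
```

With the right template for the running Perl, the three bytes should come
back as the single code point U+20AC rather than as three separate values.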

Thanks

Alex - Debian listmaster

[1]
http://search.cpan.org/dist/perl-5.10.0/pod/perl5100delta.pod#Packing_and_UTF-8_strings





    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?26577>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/

---------------------------------------------------------------------
To sign-off this list, send email to majordomo(_at_)mhonarc(_dot_)org with the
message text UNSUBSCRIBE MHONARC-DEV
