perl-unicode

Re: MD5 digest of UTF-8 string in Perl 5.8

2002-10-24 05:30:05
Gisle Aas wrote on 2002-10-23 15:01 UTC:
    md5_hex(Encode::encode_utf8($string))

Thanks, that looks indeed like the proper solution.
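
For readers of the archive, here is a minimal sketch of the suggestion above, assuming only the standard Digest::MD5 and Encode modules. The variable names are illustrative and do not come from the thread. It encodes the character string to UTF-8 bytes before digesting and checks that the result equals the digest of the three bytes E2 82 AC written out explicitly.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# A character string holding U+20AC (EURO SIGN); it is one character long.
my $string = "\x{20ac}";

# Passing the character string directly fails, because MD5 is defined on
# bytes and this string contains a code point above 255.
my $direct = eval { md5_hex($string) };
my $error  = $@;

# Encoding to UTF-8 first yields the bytes E2 82 AC, which can be digested.
my $digest = md5_hex(encode_utf8($string));

# Hashing those three bytes written out explicitly gives the same digest.
my $from_bytes = md5_hex("\xe2\x82\xac");

print "direct call failed: $error" if $error;
print "digest: $digest\n";
print "matches explicit bytes: ", ($digest eq $from_bytes ? "yes" : "no"), "\n";
```

The eval around the direct call is only there so the script can continue and report the error; in real code the encode_utf8 step would simply always be applied before hashing.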

Juha-Mikko Ahonen wrote on 2002-10-23 14:42 UTC:
  $ perl -e 'use Digest::MD5 qw(md5_hex); print md5_hex("\x{20ac}");'
  Wide character in subroutine entry at -e line 1.

The problem is in \x{20ac}. If you place the character in UTF-8 encoding 
in place of the escape, it works perfectly. If you have real UTF-8 
data, not perl escapes, then there is no problem.

I'm afraid this didn't make sense to me. The internal representation of
the input value of the MD5 function should not depend on whether I used
the UTF-8 character or the hex escape notation in the source code. The
Perl compiler should eliminate this difference already in the scanner.
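
As background to the disagreement, here is a hedged sketch of why the two spellings behave differently when the "use utf8" pragma is not in effect. It assumes, as is the documented default, that a UTF-8 encoded literal in the source is then read as a string of individual bytes, which is written below with explicit byte escapes so the example does not depend on the file's encoding. The variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# What a literal UTF-8 euro sign in the source yields without "use utf8":
# three separate bytes.
my $from_bytes = "\xe2\x82\xac";

# What the \x{...} escape yields: a single character, U+20AC.
my $from_escape = "\x{20ac}";

printf "bytes:  length %d\n", length $from_bytes;    # 3
printf "escape: length %d\n", length $from_escape;   # 1

# The two forms become equal only after an explicit decode or encode.
my $same_after_decode = decode_utf8($from_bytes) eq $from_escape;
my $same_after_encode = encode_utf8($from_escape) eq $from_bytes;

print "equal after decode: ", ($same_after_decode ? "yes" : "no"), "\n";
print "equal after encode: ", ($same_after_encode ? "yes" : "no"), "\n";
```

So the byte form is already acceptable input to a byte-oriented function such as md5_hex, while the escape form has to be encoded first.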

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>
