perl-unicode

Re: beginniner's 5.6.1 latin1<->utf8 question

2003-01-10 12:30:05
On Fri, Jan 10, 2003 at 07:28:00PM +0100, Merijn van den Kroonenberg wrote:
You might be looking for these:


    # ISO 8859-1 to UTF-8
    s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;

    # UTF-8 to ISO 8859-1
    s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;

I think that will work (they are not mine, so don't blame me if not ;-)
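
A quick way to check the two substitutions above is a round trip on a sample string. This is only a sketch: the sample string and the variable names are made up for illustration, and it assumes the input really is a byte string in ISO 8859-1. Note that the UTF-8 to ISO 8859-1 direction only touches two-byte sequences starting with 0xC2 or 0xC3 (U+0080 to U+00FF); anything else passes through unchanged.

```perl
use strict;
use warnings;

# Sample input (an assumption for this sketch): "cafe" with e-acute,
# as ISO 8859-1 bytes.
my $latin1 = "caf\xE9";

# ISO 8859-1 to UTF-8: each high byte becomes a lead byte (0xC2 or 0xC3)
# carrying the top two bits, plus a continuation byte with the low six bits.
(my $utf8 = $latin1) =~ s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;

# UTF-8 to ISO 8859-1: recombine the two bytes into a single byte.
(my $back = $utf8) =~ s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;

# Show a string as space-separated hex bytes.
sub hex_bytes { join ' ', map { sprintf '%02X', ord } split //, $_[0] }

print hex_bytes($latin1), "\n";   # 63 61 66 E9
print hex_bytes($utf8),   "\n";   # 63 61 66 C3 A9
print hex_bytes($back),   "\n";   # 63 61 66 E9
```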

They are mine :-) so I feel free to say that they don't do &#NNN;
conversion... but they certainly could be changed to do so.
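
As a rough sketch of how such a change might look (not a tested drop-in; the two sub names are hypothetical), the replacement side can emit a decimal character reference instead of bytes. The first sub assumes ISO 8859-1 input, the second assumes UTF-8 input and, like the original regex, only handles the two-byte sequences for U+0080 to U+00FF.

```perl
use strict;
use warnings;

# Hypothetical helper: ISO 8859-1 bytes 0x80-0xFF become &#NNN; references.
sub latin1_to_entities {
    my ($s) = @_;
    $s =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/eg;
    return $s;
}

# Hypothetical helper: two-byte UTF-8 sequences (lead byte 0xC2 or 0xC3)
# become &#NNN; references, using the same bit arithmetic as the
# UTF-8 to ISO 8859-1 substitution.
sub utf8_to_entities {
    my ($s) = @_;
    $s =~ s/([\xC2\xC3])([\x80-\xBF])/'&#' . ((ord($1) << 6 & 0xC0) | (ord($2) & 0x3F)) . ';'/eg;
    return $s;
}

print latin1_to_entities("f\xEAte"), "\n";      # f&#234;te
print utf8_to_entities("f\xC3\xAAte"), "\n";    # f&#234;te
```

Plain ASCII is left alone by both, so markup characters such as & and < would still need their own escaping if the text goes into HTML.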

Greetings, Merijn

----- Original Message -----
From: "Narins, Josh" <josh(_dot_)narins(_at_)lehman(_dot_)com>
To: <perl-unicode(_at_)perl(_dot_)org>
Sent: Friday, January 10, 2003 6:54 PM
Subject: beginniner's 5.6.1 latin1<->utf8 question



At one point I had a regex which perfectly converted the string A below
into a series of &#234; strings.
This is nice for me, because I just sling them out on the web, and as
entities, they always seem to work.

I've lost the regex, can't seem to find it. I know it had chr or ord in it.

I've been reading the perl-unicode archives, and googling, but I just
don't see it.

This is for perl5.6.1 with Sun's (reputedly?) sick iconv.

If someone could tap me in the right direction...

Thx in advance

-- 
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> http://www.iki.fi/jhi/
"There is this special biologist word we use for 'stable'.  It is 'dead'."
-- Jack Cohen
