perl-unicode

RE: beginniner's 5.6.1 latin1<->utf8 question

2003-01-16 10:30:09
Turns out that all that was unnecessary, and the program would have saved
and restored the UTF-8 fine if I just hadn't tried to blindly untaint the data
with...

sub untaint_blind {
   $_[0] =~ /^(.*)$/;
   my $ret = $1;
   $ret;
}

This is perl5.6.1
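
For comparison, here is a slightly more defensive variant of the same idea, as a sketch only. The name untaint_checked is made up for illustration and is not from the original post. It is still a blind untaint, since the pattern accepts anything, and I have not verified how it treats UTF-8 data on 5.6.1. It only addresses two things the version above glosses over: without /s, "." does not match a newline, and after a failed match $1 still holds whatever an earlier successful match left there.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the original post.
sub untaint_checked {
    my ($value) = @_;
    return unless defined $value;
    # \A and \z anchor the whole string; /s lets "." match newlines too.
    # Capturing via list assignment avoids reading a stale $1.
    my ($clean) = $value =~ /\A(.*)\z/s;
    return $clean;
}

my $clean = untaint_checked("line one\nline two");
print defined $clean ? "ok\n" : "no match\n";
```

A real untaint would replace (.*) with a pattern that only accepts the characters the program actually expects.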



-----Original Message-----
From: Jarkko Hietaniemi [mailto:jhi(_at_)iki(_dot_)fi] 
Sent: Friday, January 10, 2003 1:39 PM
To: Merijn van den Kroonenberg
Cc: Narins, Josh; perl-unicode(_at_)perl(_dot_)org
Subject: Re: beginniner's 5.6.1 latin1<->utf8 question


On Fri, Jan 10, 2003 at 07:28:00PM +0100, Merijn van den Kroonenberg wrote:
You might be looking for these:


    # ISO 8859-1 to UTF-8
    s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;

    # UTF-8 to ISO 8859-1
    s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;

I think that will work (they are not mine, so don't blame me if not 
;-)

They are mine :-) so I feel free to say that they don't do &#NNN; conversion...
but they could certainly be changed to do so.
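
One way the substitutions above might be adapted to emit &#NNN; entities, as a rough sketch rather than anything posted in the thread. The helper names latin1_to_entities and utf8_to_entities are invented here, and both assume the input is a plain byte string (not a decoded character string). For ISO 8859-1 the byte value is already the code point; for two-byte UTF-8 the code point is rebuilt from the low two bits of the lead byte and the low six bits of the continuation byte.

```perl
use strict;
use warnings;

# Hypothetical helper names; both expect byte strings.

# ISO 8859-1 bytes to numeric entities: the byte value is the code point.
sub latin1_to_entities {
    my ($s) = @_;
    $s =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/eg;
    return $s;
}

# Two-byte UTF-8 sequences (U+0080..U+00FF) to numeric entities.
sub utf8_to_entities {
    my ($s) = @_;
    $s =~ s/([\xC2\xC3])([\x80-\xBF])/'&#' . ((ord($1) & 0x03) << 6 | ord($2) & 0x3F) . ';'/eg;
    return $s;
}

print latin1_to_entities("f\xEAte"), "\n";     # f&#234;te
print utf8_to_entities("f\xC3\xAAte"), "\n";   # f&#234;te
```

Like the originals, the UTF-8 variant only covers code points up to U+00FF; longer sequences pass through untouched.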

Greetings, Merijn

----- Original Message -----
From: "Narins, Josh" <josh(_dot_)narins(_at_)lehman(_dot_)com>
To: <perl-unicode(_at_)perl(_dot_)org>
Sent: Friday, January 10, 2003 6:54 PM
Subject: beginniner's 5.6.1 latin1<->utf8 question



At one point I had a regex which perfectly converts the string A below
into a series of &#234; strings.
This is nice for me, because I just sling them out on the web, and
as entities, they always seem to work.

I've lost the regex, can't seem to find it. I know it had chr or ord
in it.

I've been reading the perl-unicode archives, and googling, but I
just don't see it.

This is for perl5.6.1 with Sun's (reputedly?) sick iconv.

If someone could tap me in the right direction...

Thx in advance

--------------------------------------------------------------------------
This message is intended only for the personal and confidential use of the
designated recipient(s) named above.  If you are not the intended 
recipient of this message you are hereby notified that any review, 
dissemination, distribution or copying of this message is strictly 
prohibited.  This communication is for information purposes only and 
should not be regarded as an offer to sell or as a solicitation of an 
offer to buy any financial product, an official confirmation of any 
transaction, or as an official statement of Lehman Brothers.  Email 
transmission cannot be guaranteed to be secure or error-free.  
Therefore, we do not represent that this information is complete or 
accurate and it should not be relied upon as such.  All information is 
subject to change without notice.



-- 
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> http://www.iki.fi/jhi/
"There is this special biologist word we use for 'stable'.  It is 'dead'." -- Jack Cohen
