perl-unicode

Re: Encode, take three

2000-09-13 06:31:43
=head2 bytes

    bytes_to_utf8(STRING)

The bytes in STRING are encoded in-place into UTF-8.  Returns the new
size of STRING, or undef if there's a failure.  [INTERNAL] Also the
UTF-8 flag is turned on.

Is this a C or a Perl API?

Perl.

If it is a Perl API, then converting to UTF-8 means that substr() is going
to give me a sequence of bytes which encode the string. As such, they
must have the internal UTF-8 flag turned off.
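
The point about substr() can be illustrated with a small sketch. It does not use the API proposed in this thread; it assumes a later perl in which the core functions utf8::encode and utf8::is_utf8 are available, and the variable names are made up for the example.

```perl
use strict;
use warnings;

# Two characters: U+00E9 (e with acute) and U+263A (white smiling face).
my $chars = "\x{e9}\x{263a}";

# Work on a copy, because utf8::encode converts its argument in place
# from characters to UTF-8 octets.
my $octets = $chars;
utf8::encode($octets);

# After encoding, the string is a plain byte string: the internal UTF-8
# flag is off, and length()/substr() count octets, not characters.
my $flag_after  = utf8::is_utf8($octets) ? 1 : 0;
my $char_count  = length $chars;
my $octet_count = length $octets;
my $first_octet = ord substr($octets, 0, 1);

printf "chars=%d octets=%d flag=%d first=0x%02X\n",
    $char_count, $octet_count, $flag_after, $first_octet;
# prints: chars=2 octets=5 flag=0 first=0xC3
```

Here substr() on the encoded copy returns the first octet of the UTF-8 sequence rather than the first character, which is the behaviour argued for above.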


=head2 chars

    chars_to_utf8(STRING)

The chars in STRING are encoded in-place into UTF-8.  The chars are
assumed to be encoded in ISO 8859-1 (Latin 1) or US-ASCII.

You took my name and used it exactly the opposite way to what I intended.
Maybe my name was not as clear as I thought.

Please take a look at take five:

:         chars_to_utf8(STRING, ENCODING[, CHECK])
: 
: The chars in STRING encoded in ENCODING are recoded in-place into
: UTF-8.  Returns the new size of STRING, or C<undef> if there's a failure.
: 
: No assumptions are made on the encoding of the chars.  If you want to
: assume that the chars are Unicode and to trap illegal Unicode
: characters, you must use C<from_to('Unicode', ...)>.
: 
: [INTERNAL] Also the UTF-8 flag of STRING is turned on.
: 
:         utf8_to_chars(STRING)
: 
: The UTF-8 in STRING is decoded in-place into chars.  Returns the new
: size of STRING, or C<undef> if there's a failure. 
: 
: If the UTF-8 in STRING is malformed C<undef> is returned, and also an
: optional lexical warning (category utf8) is given.
: 
: [INTERNAL] The UTF-8 flag of STRING is not checked.

My intent was that STRING is _ANY_ string in perl's internal representation.
The returned string is a sequence of bytes (0..255) which are the 
encoding of that string.
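
A minimal sketch of that intent, under stated assumptions: the helper names string_to_octets and octets_to_string are hypothetical and are not the functions proposed in this thread; they only wrap the core utf8::encode and utf8::decode functions of later perls. The decoding helper returns undef on malformed input, in the spirit of the quoted draft.

```perl
use strict;
use warnings;

# Hypothetical helper: take any Perl string and return the octets
# (each in 0..255) that encode it as UTF-8.
sub string_to_octets {
    my ($string) = @_;
    my $octets = $string;        # copy, utf8::encode works in place
    utf8::encode($octets);
    return $octets;
}

# Hypothetical helper: decode UTF-8 octets back into characters,
# returning undef if the octets are not well-formed UTF-8.
sub octets_to_string {
    my ($octets) = @_;
    my $string = $octets;        # copy, utf8::decode works in place
    return utf8::decode($string) ? $string : undef;
}

my $original   = "na\x{ef}ve \x{2603}";   # contains U+00EF and U+2603
my $octets     = string_to_octets($original);
my $all_bytes  = !grep { $_ > 255 } map { ord } split //, $octets;
my $round_trip = octets_to_string($octets);
my $bad        = octets_to_string("\xC3"); # truncated two-byte sequence

printf "octets=%d all_bytes=%d round_trip_ok=%d bad_defined=%d\n",
    length($octets), $all_bytes ? 1 : 0,
    ($round_trip eq $original) ? 1 : 0, defined($bad) ? 1 : 0;
# prints: octets=10 all_bytes=1 round_trip_ok=1 bad_defined=0
```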

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
