perl-unicode

Re: Encode, take four

2000-09-13 20:58:54
On Tue, Sep 12, 2000 at 03:59:15PM -0500, Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> wrote:
>         bytes_to_utf8(STRING)
>
> The bytes in STRING are encoded in-place into UTF-8.  Returns the new
> size of STRING, or undef if there's a failure.  [INTERNAL] Also the
> UTF-8 flag is turned on.

Just to give another example in case my previous mail wasn't clear: Marking
the setting of the UTF-8 flag as [INTERNAL] (== might change) makes this
function 100% useless, since it is the same as saying:

"might return something represented internally as utf-8, but how perl
treats this scalar is undefined".

(Also: what is the "size" of a string? Bytes? Characters?)

The above can be applied to all other functions as well: It is crucial
to know how perl treats your character data. I know that, at the moment,
perl sometimes treats scalars as bytes and sometimes as utf-8, with wild
re-encodings going on, but I really _hope_ that all of these are regarded
as bugs.
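
The difference between a string's value and its internal representation can be sketched as follows. This is only an illustration and assumes the utf8::upgrade and utf8::is_utf8 functions found in later Perl releases, which are not the functions proposed in this thread. It also shows why "size" is ambiguous: the character count stays the same while the internal byte count changes.

```perl
use strict;
use warnings;

# Assumption: utf8::upgrade and utf8::is_utf8 as found in later Perl
# releases; they are not part of the draft API under discussion.
my $s = "caf\xe9";                    # four characters, all below 0x100

my $before_flag = utf8::is_utf8($s) ? 1 : 0;
my $before_len  = length $s;          # counted in characters

utf8::upgrade($s);                    # switch the internal representation to UTF-8

my $after_flag = utf8::is_utf8($s) ? 1 : 0;
my $after_len  = length $s;           # still counted in characters

my $bytes_len;
{
    use bytes;                        # lexically scoped: length now counts internal bytes
    $bytes_len = length $s;
}

print "flag: $before_flag -> $after_flag\n";
print "chars: $before_len -> $after_len, internal bytes: $bytes_len\n";
```

With well-defined semantics the string still compares equal to "caf\xe9" after the upgrade; only the storage differs.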

>         is_utf8(STRING [, STRICT])
>
> [INTERNAL]

Oh, a function for gurus only....

> =head2 Toggling UTF-8-ness
>
>         on_utf8(STRING)

BTW, I would appreciate utf8_on instead of on_utf8, because:

- it's the same as the C API. They should be the same.
- Convert::Scalar uses utf8_on (a weak reason)
- on_utf8 is a question, utf8_on is a command.
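
As a rough illustration of the command/question distinction, here is a sketch using Encode::_utf8_on and Encode::is_utf8 as they exist in later releases of the Encode module. That is an assumption about the eventual API, not the draft quoted above. Turning the flag on leaves the bytes untouched but changes how Perl interprets them.

```perl
use strict;
use warnings;
use Encode ();

# Assumption: Encode::_utf8_on and Encode::is_utf8 as shipped in later
# Encode releases; the draft in this thread calls the toggle on_utf8.
my $s = "\xc3\xa9";                   # two bytes: the UTF-8 encoding of U+00E9

my $len_off = length $s;              # flag off: two characters

Encode::_utf8_on($s);                 # command: turn the flag on, bytes untouched

my $is_on  = Encode::is_utf8($s) ? 1 : 0;   # question: is the flag on?
my $len_on = length $s;               # flag on: one character
my $cp     = sprintf "U+%04X", ord $s;

print "length: $len_off -> $len_on, flag: $is_on, code point: $cp\n";
```

Because the toggle performs no validation, it only makes sense on data already known to be well-formed UTF-8.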

-- 
      -----==-                                             |
      ----==-- _                                           |
      ---==---(_)__  __ ____  __       Marc Lehmann      +--
      --==---/ / _ \/ // /\ \/ /       pcg(_at_)opengroup(_dot_)org |e|
      -=====/_/_//_/\_,_/ /_/\_\       XX11-RIPE         --+
    The choice of a GNU generation                       |
                                                         |
