perl-unicode

Re: Encode, take two

2000-09-13 07:28:11

    Jarkko> Assume I have a string, a bunch of bytes that makes sense in
    Jarkko> Shift-JIS, as Shift-JIS characters.  Now, how am I going to get it
    Jarkko> to Unicode?  chars_to_blah() won't help since they are not yet in
    Jarkko> Unicode chars.  So yes, I think you are right, we need the
    Jarkko> bytes_to_utf8(), and bytes_to_chars() is then a natural
    Jarkko> convenience wrapper.

If we take the view that "bytes are bytes" or "bytes are text in some
encoding", then bytes_to_utf8() and utf8_to_bytes() should take an encoding
parameter.  chars_to_utf8() and utf8_to_chars() don't need one, because they
simply convert between Unicode characters and UTF-8.
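
To make the split concrete, here is a rough sketch of the four functions plus
the bytes_to_chars() wrapper.  It is written in Python rather than Perl purely
for illustration: the function names come from this thread, but the signatures
and the use of Python's str type to stand in for "Unicode characters" are my
assumptions, not a proposed Encode API.

```python
# Hypothetical sketch: only the byte-oriented functions take an encoding.

def bytes_to_chars(data: bytes, encoding: str) -> str:
    """Interpret raw bytes as text in the named encoding."""
    return data.decode(encoding)

def chars_to_utf8(chars: str) -> bytes:
    """Unicode characters to UTF-8; no encoding parameter is needed."""
    return chars.encode("utf-8")

def utf8_to_chars(data: bytes) -> str:
    """UTF-8 to Unicode characters; no encoding parameter is needed."""
    return data.decode("utf-8")

def bytes_to_utf8(data: bytes, encoding: str) -> bytes:
    """Bytes in some encoding to UTF-8, going through characters."""
    return chars_to_utf8(bytes_to_chars(data, encoding))

def utf8_to_bytes(data: bytes, encoding: str) -> bytes:
    """UTF-8 to bytes in the named target encoding."""
    return utf8_to_chars(data).encode(encoding)

# Two kanji stored as Shift-JIS bytes (Jarkko's starting point).
sjis = b"\x93\xfa\x96\x7b"
utf8 = bytes_to_utf8(sjis, "shift_jis")
round_trip = utf8_to_bytes(utf8, "shift_jis")
```

The point of the sketch is only that the encoding name appears exactly where
bytes of unknown interpretation enter or leave.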

Or is there some other factor I've missed in all the confusion?
-----------------------------------------------------------------------------
Mark Leisher
Computing Research Lab            Cinema, radio, television, magazines are a
New Mexico State University       school of inattention: people look without
Box 30001, Dept. 3CRL             seeing, listen without hearing.
Las Cruces, NM  88003                            -- Robert Bresson
