perl-unicode

Re: C<use utf8> dynamic scope?

1999-06-01 18:11:35
According to Andreas J. Koenig:
On Tue, 1 Jun 1999 15:54:09 -0400, Chip Salzenberg 
<chip(_at_)perlsupport(_dot_)com> said:

On Thu, 27 May 1999 15:55:39 -0400, Chip Salzenberg 
<chip(_at_)perlsupport(_dot_)com> said:
I don't see a use for anything other than UTF-8.  UTF-8 allows the
encoding of huge character codes (up to 40-some bits), so unless you
know of a need for more-than-40-some-bits per character, UTF-8 is
plenty.

FWIW, UTF-8 is Latin-centric and not much liked in Eastern countries
because it is byte-bloat compared to native encodings.

 > Well, that depends on whether you assume Unicode or not.  If you don't
 > assume Unicode, you can C<use locale> and go to town.

I can't quite follow you, Chip. Do you want to say UTF-8 doesn't imply
Unicode?

That's not what I meant, exactly; I meant that you can just skip the
whole Unicode/UTF-8 issue and use locales and eight-bit characters --
i.e. the status quo ante -- if you like.

However:  *Yes*, UTF-8 is independent of Unicode.

Well, this is the headline of RFC 2279:
      UTF-8, a transformation format of ISO 10646

But if you look at what UTF-8 really *is*, it's nothing more than a
way of encoding integers larger than eight bits using sequences of
eight-bit bytes with their high bits set.

Just because the inventors of UTF-8 use it only for Unicode doesn't
mean that we are also so constrained.
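
To make the "just a way of encoding integers" point concrete, here is a
minimal sketch in Python of the byte layout described in RFC 2279. It is an
illustration only, not Perl's internal implementation. The function name
encode_utf8_style is hypothetical, and the sketch assumes the six-byte,
31-bit limit of that RFC; anything larger would need an extended scheme
that is not shown here.

```python
def encode_utf8_style(n: int) -> bytes:
    """Encode a non-negative integer below 2**31 using the RFC 2279 layout.

    Hypothetical helper for illustration; nothing here checks whether
    n is an assigned Unicode code point.
    """
    if n < 0 or n >= 1 << 31:
        raise ValueError("value out of range for the six-byte layout")
    if n < 0x80:
        return bytes([n])  # seven-bit values are stored unchanged

    # (sequence length, exclusive upper bound, lead-byte marker)
    for length, limit, marker in (
        (2, 1 << 11, 0xC0),
        (3, 1 << 16, 0xE0),
        (4, 1 << 21, 0xF0),
        (5, 1 << 26, 0xF8),
        (6, 1 << 31, 0xFC),
    ):
        if n < limit:
            break

    # Peel off six bits at a time for the continuation bytes (10xxxxxx),
    # lowest bits first, then put the remaining bits in the lead byte.
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (n & 0x3F))
        n >>= 6
    return bytes([marker | n] + tail[::-1])


if __name__ == "__main__":
    for value in (0x41, 0xE9, 0x20AC, 0x7FFFFFFF):
        print(hex(value), encode_utf8_style(value).hex())
```

The last value, 0x7FFFFFFF, lies far outside the Unicode range, yet the
layout encodes it without complaint. Every byte of a multi-byte sequence
has its high bit set, which is the property described above.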

So what does C<use locale> buy those who would be willing to use UCS-2
but not use UTF-8?

Nothing.  Separate issue.
-- 
Chip Salzenberg      - a.k.a. -      <chip(_at_)perlsupport(_dot_)com>
      "When do you work?"   "Whenever I'm not busy."
