perl-unicode

Re: C<use utf8> dynamic scope?

1999-06-01 15:04:46
On Tue, 1 Jun 1999 15:54:09 -0400, Chip Salzenberg 
<chip(_at_)perlsupport(_dot_)com> said:

On Thu, 27 May 1999 15:55:39 -0400, Chip Salzenberg 
<chip(_at_)perlsupport(_dot_)com> said:
I don't see a use for anything other than UTF-8.  UTF-8 allows the
encoding of huge character codes (up to 40-some bits), so unless you
know of a need for more-than-40-some-bits per character, UTF-8 is
plenty.

FWIW, UTF-8 is Latin-centric and not so much liked in Eastern countries
because it is byte-bloat compared to native encodings.

Well, that depends on whether you assume Unicode or not.  If you don't
assume Unicode, you can C<use locale> and go to town.

I can't quite follow you, Chip. Do you want to say UTF-8 doesn't imply
Unicode?

Well, this is the headline of RFC 2279:

              UTF-8, a transformation format of ISO 10646

The intro says

   ISO/IEC 10646-1 [ISO-10646] defines a multi-octet character set
   called the Universal Character Set (UCS), which encompasses most of
   the world's writing systems.

And one paragraph later

   It is noteworthy that the same set of characters is defined by the
   Unicode standard.

I'd say, UTF-8 looks pretty tied to Unicode.

Or do you want to say, those that don't like UTF-8 shall use their
locale to go to town? So what does C<use locale> buy those who would be
willing to use UCS-2 but not use UTF-8?

-- 
andreas
