perl-unicode

Re: Unicode aware module

1999-06-16 14:00:51
On Wed, 16 Jun 1999 14:51:08 EDT, Ilya Zakharevich wrote:
> On Wed, Jun 16, 1999 at 11:34:52AM -0700, Gurusamy Sarathy wrote:
> > If I understand you correctly, you are suggesting that Perl should
> > use utf8 (not bytes) as its internal representation for all data.
>
> Absolutely not.  I suggested that Perl *can* use utf8 (not bytes) as
> its internal representation for *any* data.  Then 'use utf8' will switch
> OPs to ones which are able to distinguish whether a given SV is utf8
> or byte-encoded (most OPs do not care).

So then, what happens when a utf8-encoded SV is passed to an OP
that doesn't want it?  How does it "see" the real data?  It either
has to convert it (in order to, say, print it to a file, call
some system API, etc.) or the utf8-encoded SV has to have a cached
copy of the "byte-encoded" data.  I don't think you mean the latter.
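The "convert it" option described above can be sketched in C roughly as follows. This is an illustration only, not Perl's internals: struct sketch_sv, its is_utf8 flag, and sketch_sv_to_bytes() are hypothetical names invented for this example, and the sketch assumes "byte-encoded" means one byte per character with values 0x00..0xFF.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SV: a buffer plus an encoding flag. */
struct sketch_sv {
    const unsigned char *buf;
    size_t len;       /* length in bytes */
    int is_utf8;      /* nonzero if buf holds UTF-8 */
};

/* Write the value to out as one byte per character.
 * Returns the number of bytes written, or (size_t)-1 if a character
 * does not fit in one byte, the input is malformed, or out is too small. */
size_t sketch_sv_to_bytes(const struct sketch_sv *sv,
                          unsigned char *out, size_t outcap)
{
    size_t i = 0, n = 0;

    assert(sv != NULL && out != NULL);
    while (i < sv->len) {
        unsigned char c = sv->buf[i];
        unsigned char byte;

        if (!sv->is_utf8 || c < 0x80) {
            /* Already a single byte: copy it through unchanged. */
            byte = c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < sv->len
                   && (sv->buf[i + 1] & 0xC0) == 0x80) {
            /* Two-byte sequence encoding U+0080..U+00FF. */
            byte = (unsigned char)(((c & 0x03) << 6)
                                   | (sv->buf[i + 1] & 0x3F));
            i += 2;
        } else {
            /* Character above 0xFF, or malformed input. */
            return (size_t)-1;
        }
        if (n >= outcap)
            return (size_t)-1;
        out[n++] = byte;
    }
    return n;
}
```

The failure return is the interesting case for this thread: an OP that only understands bytes has no lossless answer for a character above 0xFF, so the caller must decide what to do with it.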

> > Skipping characters will not be a simple C<string++>; instead,
> > it will need to be done with C<string += UTF8SKIP(string)>.
>
> ???  *When* do you skip chars?  The opcodes which need it do already
> do it this way.
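For readers unfamiliar with the idiom under discussion, here is a minimal C sketch of stepping by character rather than by byte. It is not Perl's UTF8SKIP macro: utf8_skip() and utf8_char_count() are hypothetical helpers written for this example, and they assume standard UTF-8 sequences of one to four bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte length of the UTF-8 sequence whose
 * lead byte is c.  Assumes sequences of at most four bytes. */
size_t utf8_skip(unsigned char c)
{
    if (c < 0x80)           return 1;  /* 0xxxxxxx: ASCII   */
    if ((c & 0xE0) == 0xC0) return 2;  /* 110xxxxx          */
    if ((c & 0xF0) == 0xE0) return 3;  /* 1110xxxx          */
    if ((c & 0xF8) == 0xF0) return 4;  /* 11110xxx          */
    return 1;  /* stray continuation or invalid lead byte: step one byte */
}

/* Count characters (not bytes) in a NUL-terminated UTF-8 string. */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (*s) {
        size_t step = utf8_skip((unsigned char)*s);

        /* Advance by the sequence length, but never past the NUL
         * if the final sequence is truncated. */
        while (step-- > 0 && *s)
            s++;
        n++;
    }
    return n;
}
```

With byte-encoded data the loop body would collapse to a plain s++, which is the difference the quoted text is pointing at.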

Not as far as I can see.  Unless you restrict what type of SVs
can be passed to what OPs, all OPs have to deal with utf8-encoded
SVs (just as all OPs have to deal with magic for magic to work
correctly).


Sarathy
gsar(_at_)activestate(_dot_)com
