On Wed, Jun 16, 1999 at 11:34:52AM -0700, Gurusamy Sarathy wrote:
> If I understand you correctly, you are suggesting that Perl should
> use utf8 (not bytes) as its internal representation for all data.
Absolutely not. I suggested that Perl *can* use utf8 (not bytes) as
its internal representation for *any* data. Then 'use utf8' will switch
OPs to ones which are able to distinguish whether a given SV is utf8
or byte-encoded (most OPs do not care).
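
A minimal sketch of what "distinguish per value" could look like, under loud assumptions: tagged_str is a hypothetical stand-in for an SV, and is_utf8 and tagged_length are made-up names, not Perl's actual structures or API. It only illustrates that byte-encoded data keeps its fast path and that the extra work happens only for values marked utf8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SV: a byte buffer plus a per-value flag. */
typedef struct {
    const unsigned char *buf;
    size_t               len;      /* length in bytes */
    int                  is_utf8;  /* nonzero if buf holds UTF-8 */
} tagged_str;

/* Length in characters. Byte-encoded values return immediately;
 * only values marked utf8 pay for a scan. */
static size_t tagged_length(const tagged_str *s)
{
    size_t i, chars = 0;

    assert(s != NULL);
    if (!s->is_utf8)
        return s->len;              /* one byte == one character */

    /* Count every byte that is not a UTF-8 continuation byte (10xxxxxx). */
    for (i = 0; i < s->len; i++)
        if ((s->buf[i] & 0xC0) != 0x80)
            chars++;
    return chars;
}
```

The same four bytes 'a', 0xC3, 0xA9, 'b' would count as 4 characters when the flag is off and 3 when it is on; nothing is copied or converted in either case.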
> This sounds like a good thing *in theory*, but in practice, I do
> not see how it can be implemented without slowing things down
> considerably. I/O operations will need to copy things around (or
> loop through every byte read) to convert things between bytes and
> utf8.
Nope, they will just *NOT* mark the data as utf8.
> Skipping characters will not be a simple C<string++>; instead,
> it will need to be done with C<string += UTF8SKIP(string)>.
??? *When* do you skip chars? The opcodes which need it already
do it this way.
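
For reference, the skip step under discussion can be sketched roughly as below. This is not Perl's UTF8SKIP itself: utf8_skip_len and count_chars are hypothetical names, and the sketch assumes standard 1 to 4 byte UTF-8 sequences with malformed input handled by advancing a single byte.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes in the UTF-8 sequence that starts with `lead`
 * (hypothetical helper, standing in for a UTF8SKIP-style lookup). */
static size_t utf8_skip_len(unsigned char lead)
{
    if (lead < 0x80)           return 1;   /* 0xxxxxxx: ASCII       */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx              */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx              */
    if ((lead & 0xF8) == 0xF0) return 4;   /* 11110xxx              */
    return 1;  /* stray continuation or invalid byte: still advance */
}

/* Walk [s, end) one character at a time and count the characters. */
static size_t count_chars(const unsigned char *s, const unsigned char *end)
{
    size_t n = 0;

    assert(s != NULL && end != NULL && s <= end);
    while (s < end) {
        size_t step = utf8_skip_len(*s);
        if (step > (size_t)(end - s))       /* truncated final sequence */
            step = (size_t)(end - s);
        s += step;                          /* instead of s++ */
        n++;
    }
    return n;
}
```

Only code that walks a string character by character needs the s += step form; code that just moves or compares the bytes is unaffected.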
Ilya