On Sun, 13 Jun 1999 15:11:49 EDT, Ilya Zakharevich wrote:
Now please reread what I wrote. Perl strings are not sequences of
bytes any more (as they were in 5.005).
That may be what you wish, but it is not fact. SvCUR() (or sv_len()
if you prefer) still counts just bytes, not characters.
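
To make the bytes-versus-characters distinction concrete, here is a minimal sketch in plain C. It does not use the Perl API at all: utf8_char_count() is a hypothetical helper written only for illustration, and it assumes the buffer holds valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of the Perl API.
 * Counts UTF-8 characters in a buffer by counting every byte that is
 * not a continuation byte (continuation bytes look like 10xxxxxx).
 * Assumes the input is valid UTF-8. */
size_t utf8_char_count(const char *s, size_t nbytes)
{
    size_t chars = 0;
    for (size_t i = 0; i < nbytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For the six-byte buffer "na\xC3\xAFve", strlen() reports 6 while utf8_char_count() reports 5. A length that only ever returns the first number is a byte count, which is the point being made about SvCUR().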
With the current implementation of UTF-8 it is misleading, since
"wideness" is attached to the code, not the data. My argument is that
by attaching wideness to the data instead of the code we can make
wideness *transparent* without sacrificing performance for the
operations which do not *require* wideness.
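
The idea of attaching wideness to the data could be sketched roughly as follows. This is only an illustration of the argument, not the existing implementation and not a proposed patch: the str_t type, its is_wide flag, and str_length() are all hypothetical names, and the wide path assumes valid UTF-8.

```c
#include <stddef.h>

/* Hypothetical string type: the flag travels with the data, so the
 * calling code does not need to know which kind of string it holds. */
typedef struct {
    const char *buf;     /* raw bytes */
    size_t      nbytes;  /* length of buf in bytes */
    int         is_wide; /* hypothetical per-string flag: nonzero = UTF-8 */
} str_t;

/* Length in characters. Narrow strings take the fast path and pay
 * nothing extra; only wide strings are scanned. */
size_t str_length(const str_t *s)
{
    if (!s->is_wide)
        return s->nbytes;            /* one byte per character */

    size_t chars = 0;
    for (size_t i = 0; i < s->nbytes; i++) {
        /* skip UTF-8 continuation bytes (10xxxxxx) */
        if (((unsigned char)s->buf[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

With this layout the same five bytes "\xC3\xA9t\xC3\xA9" have length 5 when the flag is clear and length 3 when it is set, and strings that never need wideness never pay for the scan.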
I'm beginning to see where you are not "getting" it. You are
arguing from the POV of some mythical implementation that doesn't
exist.
Then the absence of a global utf8 switch becomes an optimization only
- all programs work exactly the same (or better ;-) with the
addition of a global-utf8 switch.
Providing a patch may help us understand this mythical implementation
of yours where SVs are characters, not bytes.
Sarathy
gsar(_at_)activestate(_dot_)com