perl-unicode

Re: Unicode aware module

1999-06-12 15:07:44
On Sat, 12 Jun 1999 17:50:14 EDT, Ilya Zakharevich wrote:
The only addition: being locale-sensitive and UTF-8 encoded is a
property of *data*, not of a Perl script.  An attempt to handle them by
marking sections of code may be noble, but looks like a lost cause.

The C<use byte> I'm talking about has little to do with data.
Perl *operations* behave *differently* depending on the mode in
effect, so C<use byte> would simply be a way to mark sections
where string operations should affect bytes rather than
characters.

How is this related to what I wrote?  You cannot make

   to utf8 or not to utf8

decision based on some properties of a *program/subroutine*.  It is
the property of arguments to a subroutine, not of a subroutine.

Nope, "does this code operate on bytes or characters?" is a property
of the subroutine.  It has nothing to do with the data that may be
given to it.  IOW, C<use byte> is unrelated to utf8--it denotes
a property of the code.
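
To make the distinction concrete, here is a minimal sketch.  It assumes
a later Perl in which the pragma is spelled C<use bytes> (C<use byte>
above is only the name under discussion), and the variable names are
purely illustrative.  The same scalar goes through the same C<length>
operation twice, once under character semantics and once inside a
lexical scope marked for byte semantics:

```perl
use strict;
use warnings;

# One string: U+263A (three bytes in UTF-8) followed by "abc".
my $string = "\x{263A}abc";

# Default mode: length counts characters.
my $char_len = length $string;

# Inside the marked block, the same operation counts bytes.
my $byte_len = do { use bytes; length $string };

print "characters: $char_len, bytes: $byte_len\n";
# prints "characters: 4, bytes: 6"
```

The data is identical in both calls; only the mode of the enclosing
code differs, which is the sense in which the marker denotes a property
of the code.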

But finding a way to easily "mark" data (by marking its source?)
has potential, because then we can detect mismatch between data
and operations.  We may even be able to piggyback on taint magic
for propagating the type.

I do not follow you: if we can detect a mismatch, why not do "the right
thing" instead of complaining?  (Of course, for performance reasons
it should be possible to switch this off.)

If you did the "right thing" automatically there would be no way
to tell if you got utf8 data when you were strictly just expecting
utf16 data.  So, no, you can't do the "right thing" automatically.
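
A rough sketch of complaining on a mismatch instead of converting
silently.  It assumes a Perl recent enough to provide
C<utf8::is_utf8>; the per-scalar flag that function reports is an
implementation detail and only approximates the "marked data" idea
above.  C<checksum_bytes> is a hypothetical routine that is declared to
work on bytes only:

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: a routine that only makes sense on bytes.
sub checksum_bytes {
    my ($data) = @_;
    # Refuse data that is marked as characters rather than converting it.
    croak "expected bytes, got character data" if utf8::is_utf8($data);
    my $sum = 0;
    $sum += ord for split //, $data;
    return $sum % 256;
}

my $bytes = "abc";            # plain byte string
my $chars = "\x{263A}abc";    # contains a character above 255

my $ok  = checksum_bytes($bytes);
my $err = eval { checksum_bytes($chars); 1 } ? '' : $@;

print "checksum: $ok\n";
print "error: $err" if $err;
```

Because the routine dies rather than quietly re-encoding its argument,
the caller finds out that the wrong kind of data arrived.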


Sarathy
gsar(_at_)activestate(_dot_)com
