perl-unicode

Re: Unicode aware module

1999-06-17 14:31:57
On Thu, Jun 17, 1999 at 12:17:34PM -0700, Larry Wall wrote:
> I'd say that if Tk has blithely changed its interface to take utf-8
> instead of latin-1 then it has *broken* its contract with the user.

"Taking" has nothing to do with the internal representation.  Tcl
changed the internal representation.  This change made it
*exceptionally easy* to work with heterogeneous data: you tell Tcl
the encoding of each I/O channel, and proceed.  Since the
default is what you call "Latin-1", this change is transparent to
old scripts, which did not change encodings.
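
To make the channel idea concrete, here is a minimal sketch in C of the
decoding step that happens at an input boundary when a channel is declared
to be Latin-1 and the internal representation is UTF-8.  This is not Tcl's
actual code; the function name latin1_to_utf8 and its signature are
assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical helper (not from Tcl or Perl): decode `len` Latin-1 bytes
 * from `src` into UTF-8 in `dst`.  The caller must supply a `dst` buffer
 * of at least 2 * len bytes.  Returns the number of bytes written. */
size_t latin1_to_utf8(const unsigned char *src, size_t len,
                      unsigned char *dst)
{
    size_t i, out = 0;

    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII is identical in both encodings. */
            dst[out++] = c;
        } else {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence. */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Every Latin-1 byte has a valid UTF-8 form, so this direction cannot fail;
that is why a script that never touches encodings sees no difference.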

> All this discussion misses the point that we're dealing with a bunch of
> existing interfaces.  The old interfaces *specify* narrow characters.
> There is no way to wave a magic wand over these interfaces and expect
> the modules behind those interfaces to suddenly start behaving both
> differently and coherently.

However, one can provide *tools* which can make *modules* behave
coherently.  And we can make the *kernel* behave coherently.

> I'm not against schemes for autogenerating utf-8 aware modules from
> non-utf-8 aware modules where that's practical.  But we must be aware
> that it's a different module with a different interface, and that the
> correctness of automatic translation is about as decidable as the
> halting problem.

Providing macros SvPV (which is as efficient as now, but ignores
encoding), SvPVnarrow (with an additional argument to indicate failure
of the translation), and SvPVwide is all that is needed for a
quick-and-dirty correction of modules.
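
As a rough sketch of what a "narrow" accessor with a failure indicator
would have to do, the C function below converts UTF-8 to Latin-1 and
reports through an extra argument whether any character did not fit.
SvPVnarrow is only a proposal in this message; the helper utf8_to_narrow,
its signature, and the '?' substitution are assumptions made for
illustration, not part of the Perl API.

```c
#include <stddef.h>

/* Hypothetical helper: convert `len` bytes of UTF-8 in `src` to Latin-1
 * in `dst` (capacity at least `len` bytes).  Returns the number of bytes
 * written.  Sets *failed to 1 if any character is above U+00FF or the
 * input is malformed, and to 0 otherwise.  Untranslatable characters are
 * replaced by '?'. */
size_t utf8_to_narrow(const unsigned char *src, size_t len,
                      unsigned char *dst, int *failed)
{
    size_t i = 0, out = 0;

    *failed = 0;
    while (i < len) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII passes through unchanged. */
            dst[out++] = c;
            i++;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < len
                   && (src[i + 1] & 0xC0) == 0x80) {
            /* Two-byte sequences with lead 0xC2/0xC3 cover U+0080..U+00FF. */
            dst[out++] = (unsigned char)(((c & 0x03) << 6)
                                         | (src[i + 1] & 0x3F));
            i += 2;
        } else {
            /* Wide or malformed: flag it and skip the whole sequence. */
            *failed = 1;
            dst[out++] = '?';
            i++;
            while (i < len && (src[i] & 0xC0) == 0x80)
                i++;
        }
    }
    return out;
}
```

A module written against a narrow interface would call such an accessor
and check the flag, rather than silently receiving bytes in an encoding
it does not expect.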

Ilya
