Gurusamy Sarathy writes:
Say, sv_catsv() will need to sv_2utf8() one of the arguments if the
flags on the arguments mismatch (or do something similar).
And so will every single extension out there that ever accesses
SvPVX() directly. If there's no caching, SvPV() in an extension
will have to test for and convert every such utf8-encoded string
every time, making a "byte-encoded" copy, and keeping it somewhere
until whatever ran SvPV() is finished with the result.
Why would an extension want to do this? Most will not care about
encodings, since they have no way to use this info.
Those few which *do* care will have the possibility to treat utf8 and
"narrow" SvPVs differently. Note that the situation is not one bit
worse than under the current nonsensical scheme, where the extension
is given an SV and has no way to tell whether it is utf8 or narrow.
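
To make the flag-mismatch case concrete, here is a minimal sketch of the
idea in standalone C. It is not perl source: toy_sv, toy_new, toy_2utf8 and
toy_catsv are hypothetical names invented for this illustration, and the
"narrow" encoding is assumed to be Latin-1. It only shows a per-string flag
plus an upgrade that happens when the two flags differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an SV: a byte buffer plus a flag saying how to read it. */
typedef struct {
    unsigned char *buf;
    size_t len;
    int is_utf8;   /* 0 = narrow (one byte per character, Latin-1 assumed) */
} toy_sv;

/* Copy len bytes into a freshly allocated, NUL-terminated toy_sv. */
static toy_sv toy_new(const char *bytes, size_t len, int is_utf8) {
    toy_sv s;
    s.buf = malloc(len + 1);
    assert(s.buf != NULL);
    memcpy(s.buf, bytes, len);
    s.buf[len] = '\0';
    s.len = len;
    s.is_utf8 = is_utf8;
    return s;
}

/* Re-encode a narrow buffer as UTF-8; a no-op if already flagged utf8. */
static void toy_2utf8(toy_sv *s) {
    unsigned char *out;
    size_t i, n = 0;
    if (s->is_utf8) return;
    out = malloc(2 * s->len + 1);   /* each byte grows to at most 2 bytes */
    assert(out != NULL);
    for (i = 0; i < s->len; i++) {
        unsigned char c = s->buf[i];
        if (c < 0x80) {
            out[n++] = c;
        } else {
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    free(s->buf);
    s->buf = out;
    s->len = n;
    s->is_utf8 = 1;
}

/* Append src to dst. If the flags differ, the narrow side is upgraded
 * first; src itself is never modified, only a temporary copy of it. */
static void toy_catsv(toy_sv *dst, const toy_sv *src) {
    toy_sv tmp = toy_new((const char *)src->buf, src->len, src->is_utf8);
    if (dst->is_utf8 != tmp.is_utf8) {
        toy_2utf8(dst);
        toy_2utf8(&tmp);
    }
    dst->buf = realloc(dst->buf, dst->len + tmp.len + 1);
    assert(dst->buf != NULL);
    memcpy(dst->buf + dst->len, tmp.buf, tmp.len + 1);
    dst->len += tmp.len;
    free(tmp.buf);
}
```

When both operands carry the same flag, toy_catsv() is a plain byte append
and nothing is converted; the upgrade cost is paid only on a mismatch. Code
that does not care about encodings never has to look at is_utf8 at all.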
a) printing: all I/O goes via conversion [...]
Then it is as I suspected. :-(
Why the sorrow? The default is going to be the same as with the
current scheme. The only difference is that there is *a possibility*
of sane behaviour.
??? I repeat: give an example.
Suddenly, I don't see the need to.
An implementation based on magic might be acceptable, but not
what you've sketched out. As always, feel free to convince me
otherwise with an implementation.
Looks like time to pack my belongings...
Ilya