Maybe I am simply confused, but why are we talking about
lumbering the Perl kernel with ISO 2022? If Markus wants to
write applications which conform to such standards, in Perl,
then he can. There is nothing to stop him, and thanks to the
early design decisions of the Unicode committee he will also
get round-trip compatibility for many of his encodings.
As far as application-specific data is concerned, is it not the
duty of the application to address the encoding issues rather
than requiring Perl to do all the work for you?
There is only one file format that Perl needs to be concerned
with internally, and that is source code. If we need to
handle source code in a variety of encodings, then perhaps
the solution would be to have a short Perl program which
decides which filter to use on the source files it is asked to
open. I'm not sure whether such a mechanism is currently in 5.6.
This would then allow people to write any filter they want for
any encoding of a source program.
The problem then becomes what encoding to write the filter
identifier in. But I would think the default Perl encoding
(ASCII/UTF-8) is sufficient for this. In addition there is the
problem of getting the filter identifier program to be used
when the Perl code is being read. If the program could be
expressed as a module, or via a new command-line option (say
-f), then Perl can be told which program to filter code through
on startup. In fact, in many cases where the encoding is
ASCII-conformant (for lower ASCII, that is), the #! line can do
the work for you.
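
To make the idea concrete, here is a rough sketch, written in Python
purely for illustration (a real version would of course live in Perl).
It assumes the -f option suggested above is hypothetical and appears on
an ASCII-readable #! line naming a filter. The FILTERS registry, the
filter names, and both function names are invented for this example.

```python
import re

# Hypothetical registry: filter name -> function turning raw source bytes into text.
FILTERS = {
    "utf8": lambda raw: raw.decode("utf-8"),
    "latin1": lambda raw: raw.decode("latin-1"),
    "koi8r": lambda raw: raw.decode("koi8_r"),
}

# Assumed default, matching the ASCII/UTF-8 default discussed above.
DEFAULT_FILTER = "utf8"


def filter_name_from_shebang(raw: bytes) -> str:
    """Return the filter named by a hypothetical '-f NAME' on the #! line."""
    first_line = raw.split(b"\n", 1)[0]
    if not first_line.startswith(b"#!"):
        return DEFAULT_FILTER
    match = re.search(rb"\s-f\s*([A-Za-z0-9_-]+)", first_line)
    return match.group(1).decode("ascii") if match else DEFAULT_FILTER


def read_source(raw: bytes) -> str:
    """Decode raw source bytes with whichever filter the #! line selects."""
    name = filter_name_from_shebang(raw)
    try:
        decode = FILTERS[name]
    except KeyError:
        raise ValueError(f"no filter registered for {name!r}")
    return decode(raw)
```

This only works because the #! line itself is readable as ASCII; an
encoding that is not ASCII-conformant needs some other way in.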
For all the criticisms made against it, I think an approach
similar to the one XML uses to identify a document's encoding,
applied here to searching for the #!, would serve very well.
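
As a rough illustration of that XML-style detection, the sketch below
(again in Python, purely for illustration) guesses the encoding family
of a script from how its leading "#!" is encoded, checking for a
byte-order mark first. The candidate list and the function name are
assumptions made for this example, not anything Perl provides.

```python
import codecs

# Byte-order marks. UTF-32 comes first so UTF-32LE is not mistaken for UTF-16LE,
# whose BOM is a prefix of the UTF-32LE one.
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Encoding families to try when there is no BOM (an assumed, incomplete list).
# "ascii" stands for any ASCII-conformant encoding; "cp037" is one EBCDIC variant.
CANDIDATES = ["utf-32-be", "utf-32-le", "utf-16-be", "utf-16-le", "ascii", "cp037"]


def sniff_shebang_encoding(raw: bytes):
    """Guess the encoding family from the bytes that encode '#!'.

    Returns (codec_name, byte_offset_of_shebang), or None if nothing matches.
    """
    for bom, codec in BOMS:
        if raw.startswith(bom):
            # A BOM fixes the family; the '#!' must follow it immediately.
            if raw.startswith("#!".encode(codec), len(bom)):
                return codec, len(bom)
            return None
    for codec in CANDIDATES:
        if raw.startswith("#!".encode(codec)):
            return codec, 0
    return None
```

A result of "ascii" only narrows things down to the ASCII-conformant
family; as in XML, the rest of the line would then have to name the
exact encoding.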
I don't know whether this is too high a price to pay, but I
think it may solve the issue of multiple encodings for source
code.
Martin Hosken