perl-unicode

[Encode] encoding vs. Charset

2002-03-30 06:31:14
On Saturday, March 30, 2002, at 08:00 , Nick Ing-Simmons wrote:
Autrijus Tang <autrijus(_at_)autrijus(_dot_)org> writes:

And then you'll have to disambiguate between that and encoding.pm...
Why aren't we extending encoding.pm instead?

That was my thought as well - that there is overlap with Jarkko's
work on "use encoding".

Well, I can make encoding.pm a part of the Encode project upon jhi's approval. I wanted to work on its pod at least. But I am not sure that merging Charset.pm into it is a good idea. Let me list the differences....

= Implementation
        * encoding.pm           
                By setting the ${^ENCODING} global variable
                (undocumented in perlvar, though $ENV{PERL_ENCODING} is documented)
        * Charset.pm
                By applying a source filter to the caller

= Behaviors
        * encoding.pm
                * Source untouched.  Literals stay as is.
                * No scoping, since it does nothing but set a global variable
                  (which is actually a reference to an Encode object)
        * Charset.pm
                * Source rewritten on the fly, including literals in "".
                * Crude scoping via Filter::Simple
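
To make the "literals" point concrete, here is a minimal sketch. It is only an assumption-laden illustration: it uses neither encoding.pm nor Charset.pm, and the explicit decode() call merely stands in for whatever conversion a module would arrange for you. The variable names and the choice of UTF-8 are mine.

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 bytes for U+3042 (HIRAGANA LETTER A), as they might sit in a source file.
my $literal = "\xe3\x81\x82";

# "Source untouched": the literal stays a plain 3-byte string.
my $untouched = $literal;

# "Literals converted": done by hand here as a stand-in for the module's work.
my $converted = decode('UTF-8', $literal);

printf "untouched: %d, converted: %d\n",
    length($untouched), length($converted);
```

Code that calls length(), substr() or a regex on such a literal behaves differently in the two cases, which is what is at stake when literals are or are not converted.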

= Pros & Cons
        * encoding.pm
                + Faster.  Much faster when the caller source is convoluted.
                + Safer.  No unexpected warnings at unexpected line numbers.
                - No scoping.
                - No "no encoding;" so far.  (Trivial to implement, however.)
        * Charset.pm
                + More comprehensive.  With even literals converted, existing
                  code has a better chance of working.
                + Supports IO as well.  But this feature is not that hard
                  to integrate into encoding.pm.
                - (Much) slower.
                - Scoping too crude.

Because source filtering is the more dangerous approach, I think Charset.pm should stay separate from encoding.pm. I consider Charset.pm not much more than a technology demonstrator, whereas encoding can (and must) be as robust as Encode and PerlIO. At the same time I would like to work on encoding.pm to polish its pod and (perhaps) add IO support.
So what do you say, jhi?

Dan the encoding Man