[Encode] Charset-0.01 released

Encode hackers,

With Encode-1.00 released, I felt like some more casual programming, so I came up with Charset-0.01. I'll just post a pod2text-rendered doc to show you what it is like. My favorite is the one-liner from the shell.


Dan the Yet Another Perl Hacker -- Now in Various Languages
====
NAME
    Charset - write Perl code in any encoding you like

SYNOPSIS
      use Charset "euc-jp"; # Jperl!
      #...
      sub tricky_part{
         no Charset;
         #...
      }
      use Charset "euc-jp"; # restore the state; Filter::Simple bug.
      # Handy for EUC-JP => UTF-8 converter
      # when your text editor only supports Shift_JIS !
      use Charset "shiftjis", IN => "euc-jp", OUT => "utf8";
      # If your shell supports EUC-JP, you can even do this!
      perl -MCharset=euc-jp -e 'print "Nihongo\n" x 4'

ABSTRACT
    This module allows you to write your Perl code not only in ASCII (or
    EBCDIC where your environment allows) or UTF-8 but in any character
    encoding that the Encode module supports.

USAGE
    The first argument to the "use" line must be the name of the encoding
    that matches your script. It croaks if none is specified or if the one
    specified is unsupported by the Encode module.
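
    The check described above could be approximated as follows. This is a
    minimal sketch, not Charset's actual code: "check_encoding" is a
    hypothetical helper name, and it assumes that a lookup with Encode's
    "find_encoding" is a reasonable stand-in for whatever test the module
    really performs.

```perl
use strict;
use warnings;
use Carp qw(croak);
use Encode ();

# Hypothetical helper (not part of Charset): croak unless Encode knows
# the given encoding name, otherwise return Encode's canonical name.
sub check_encoding {
    my ($name) = @_;
    croak "No encoding specified"
        unless defined $name and length $name;
    # find_encoding returns an encoding object, or undef if unknown
    my $enc = Encode::find_encoding($name)
        or croak "Unknown encoding '$name'";
    return $enc->name;
}

for my $name ('euc-jp', 'shiftjis', 'no-such-encoding') {
    my $canonical = eval { check_encoding($name) };
    print defined $canonical
        ? "$name: supported as $canonical\n"
        : "$name: rejected\n";
}
```

    Wrapping the call in "eval" here only keeps the demonstration loop
    running; a "use" line would normally let the croak abort compilation.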

    You can optionally pass further arguments as a hash. The following
    options are supported.

    STDIN => *enc_name*
        Sets the discipline of STDIN to ":encoding(*enc_name*)". By
        default, the same encoding as the caller script is used.

    STDOUT => *enc_name*
        Sets the discipline of STDOUT to ":encoding(*enc_name*)". By
        default, the same encoding as the caller script is used.

    IN => *enc_name*
        Internally does "use open IN => ":encoding(*enc_name*)"". No
        default is set. See open.

    OUT => *enc_name*
        Internally does "use open OUT => ":encoding(*enc_name*)"". No
        default is set. See open.

    IO => *enc_name*
        Internally does "use open IO => ":encoding(*enc_name*)"". No
        default is set. IN or OUT overrides this setting.
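
    All of the options above boil down to attaching an
    ":encoding(*enc_name*)" layer to a filehandle. As a rough illustration
    of what such a layer does, independent of Charset itself, the sketch
    below writes one character through the layer into an in-memory buffer
    and dumps the resulting bytes. "write_encoded" and "hex_dump" are
    hypothetical helper names introduced only for this example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: print $text through an :encoding() layer into an
# in-memory buffer and return the raw bytes that come out.
sub write_encoded {
    my ($enc_name, $text) = @_;
    my $buffer = '';
    open(my $fh, ">:encoding($enc_name)", \$buffer)
        or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh or die "Cannot close in-memory handle: $!";
    return $buffer;
}

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

my $text = "\x{4EBA}";    # the U+4EBA character used in the DESCRIPTION

for my $enc_name ('utf8', 'euc-jp', 'shiftjis') {
    my $bytes = write_encoded($enc_name, $text);
    # Decoding the bytes again should give back the original character.
    my $ok = decode($enc_name, $bytes) eq $text ? 'yes' : 'no';
    printf "%-8s => %s (round trip: %s)\n", $enc_name, hex_dump($bytes), $ok;
}
```

    The same character comes out as a different byte sequence for each
    encoding name, which is why the script encoding and the I/O encodings
    can be chosen separately.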

DESCRIPTION
    This is a technology demonstrator for Perl 5.8.0. It uses Encode and
    Filter::Util::Call, both of which will be included in the perl
    distribution.

    Before perl 5.6.0, a character meant a byte. Though it was possible to
    include literals in multibyte characters in certain encodings (such as
    EUC-JP), you needed to handle them with care. Some encodings (such as
    Shift_JIS) didn't even allow this, and you needed things like Jperl to
    do that. If your multibyte encoding was not Japanese, you were out of
    luck.


    As of Perl 5.6.0, you could use UTF-8 strings internally, so you could
    apply everything you wanted to do to multilingual strings, including
    regexes. You could even use UTF-8 strings for identifiers, so you could
    go like

      my $Ren++; #   "Ren" is really a U+4EBA

    to make a child :) But there was one precondition: your source file
    must be in UTF-8. Since decent text editors and environments that can
    handle UTF-8 were rare (and still are to some extent), you still needed
    character encoding converters like Jcode.pm.

    With perl 5.8.0 and this module, this will all change. Your old script
    in your regional character encoding suddenly starts working just by
    adding

      use Charset qw(your-encoding);

BUGS
    This module uses Filter::Simple, so it is subject to the limitations of
    Filter::Simple. Filter::Simple and Text::Balanced, which Filter::Simple
    uses, do a pretty good job of block detection.

SEE ALSO
    Encode, Filter::Simple, open, PerlIO

AUTHOR
    Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp>

COPYRIGHT AND LICENSE
    Copyright 2002 by Dan Kogai, all rights reserved.

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.