POD I18N

Porters,

Would it be possible to add another directive to specify the encoding?

Though pod2man still has some problems handling non-ISO-8859-1 text, HTML has no such problem, and there is no reason you cannot read international POD documents via http://search.cpan.org/ so long as charset= is set correctly. (So far search.cpan.org uses ISO-8859-1 no matter what. Pages with UTF-8 text are handled by character entities, as with the name of AUTRIJUS.)


To prove my point, see this page.

http://search.cpan.org/author/DANKOGAI/Text-Kakasi-2.00/Kakasi/JP.pod

You see line noise?  But try the following URL to convert it.

http://www.dan.co.jp/~dankogai/cgi/chareset/chareset.cgi?i=eucjp&o=utf8&r=1&u=http://search.cpan.org/author/DANKOGAI/Text-Kakasi-2.00/Kakasi/JP.pod

You may not understand what it says, but it should now look good on UTF-8-savvy browsers.


If you are interested in the source of my charset.cgi, try

http://www.dan.co.jp/~dankogai/cgi/chareset/chareset.cgi/src
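
For readers who just want the gist of what such a conversion does, here is a minimal sketch in Python. It is not the source of the CGI above; the function name transcode and its defaults are made up for illustration, and it assumes the input really is valid EUC-JP.

```python
def transcode(data: bytes, src: str = "euc_jp", dst: str = "utf-8") -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target one."""
    return data.decode(src).encode(dst)


# Made-up sample: three Japanese characters stored as EUC-JP bytes.
sample = "日本語".encode("euc_jp")
converted = transcode(sample)
print(converted.decode("utf-8"))
```

The point of the proposal is that none of this would be needed if the page simply declared its own encoding.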

So all you need is the ability to specify the encoding. No other tricks such as transcoding are necessary. (Besides, for search.cpan.org in particular, it is a BAD idea to use character entities other than those for '<>&"', exactly because of that.)


Therefore I suggest something like

=encoding UTF-8

so that a POD parser can see it and generate something like

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

and make this directive effective until the next occurrence of =encoding, so you can write


=head1 AUTHOR

=encoding ISO-8859-1

Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp>

=encoding UTF-8

小飼　弾 <dankogai(_at_)dan(_dot_)co(_dot_)jp>

=cut

in the same document.
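
To make the idea concrete, here is a rough sketch in Python of how a parser might honor the directive. Nothing here is an existing POD tool: decode_pod and meta_tag are hypothetical names, the ISO-8859-1 default is an assumption, and decoding every region to Unicode so that the page can carry a single charset is just one possible way to handle a document that switches encodings.

```python
import html
from typing import List, Tuple


def decode_pod(raw: bytes, default: str = "iso-8859-1") -> Tuple[str, List[str]]:
    """Decode POD bytes line by line, honoring each =encoding directive.

    A directive stays in effect until the next =encoding line.
    Returns the decoded text and the list of encodings that were declared.
    """
    encoding = default          # assumed default when no directive has been seen
    seen: List[str] = []
    decoded: List[str] = []
    for line in raw.split(b"\n"):
        fields = line.split()
        if fields and fields[0] == b"=encoding":
            if len(fields) >= 2:
                encoding = fields[1].decode("ascii")
                seen.append(encoding)
            continue            # the directive itself is not part of the output
        decoded.append(line.decode(encoding))
    return "\n".join(decoded), seen


def meta_tag(charset: str = "UTF-8") -> str:
    """Build the Content-Type meta element for the generated HTML page."""
    return ('<meta http-equiv="Content-Type" '
            f'content="text/html; charset={html.escape(charset)}">')


# Made-up sample document that switches encodings midway.
pod = (
    b"=head1 AUTHOR\n\n"
    b"=encoding ISO-8859-1\n\n"
    + "Caf\u00e9\n\n".encode("iso-8859-1")
    + b"=encoding UTF-8\n\n"
    + "小飼　弾\n\n".encode("utf-8")
    + b"=cut\n"
)
text, seen = decode_pod(pod)
print(meta_tag())
print(text)
```

A document with a single =encoding line would not even need the decoding step; the parser could copy the declared name straight into the charset= of the generated page.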

Dan the Man with Too Many Languages to Work with