Porters,
Would it be possible to add another POD directive to specify the encoding?
Though pod2man still has some problems handling non-ISO-8859-1 text,
HTML has none, and there is no reason you cannot read
international POD documents via http://search.cpan.org/ so long as
charset= is set correctly. (So far search.cpan.org uses ISO-8859-1
no matter what. Pages with UTF-8 text are handled with
character entities, as with the name of AUTRIJUS.)
To prove my point, see this page.
http://search.cpan.org/author/DANKOGAI/Text-Kakasi-2.00/Kakasi/JP.pod
Do you see line noise? Now try the following URL, which converts it.
http://www.dan.co.jp/~dankogai/cgi/chareset/chareset.cgi?i=eucjp&o=utf8&r=1&u=http://search.cpan.org/author/DANKOGAI/Text-Kakasi-2.00/Kakasi/JP.pod
You may not understand what it says, but it should now look good in
UTF-8-savvy browsers.
If you are interested in the source of my chareset.cgi, try
http://www.dan.co.jp/~dankogai/cgi/chareset/chareset.cgi/src
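For those who would rather not read the CGI source, the general idea can be sketched in a few lines. This is only an illustration in Python, not the actual script: the function name transcode and the sample string are made up here, and the only assumption is that the input really is EUC-JP.

```python
# A minimal sketch of an EUC-JP to UTF-8 conversion step.
# `transcode` and SAMPLE are hypothetical names used only for illustration.

def transcode(raw: bytes, src: str, dst: str) -> bytes:
    """Decode `raw` using its real encoding `src`, then re-encode as `dst`."""
    return raw.decode(src).encode(dst)

# Three kanji written as escapes so this file stays pure ASCII.
SAMPLE = "\u65e5\u672c\u8a9e"

# The bytes as they would be stored in an EUC-JP encoded POD file.
raw = SAMPLE.encode("euc_jp")

# The same text as UTF-8 bytes, ready to be served with charset=UTF-8.
utf8_bytes = transcode(raw, "euc_jp", "utf-8")
```

Decoding the same bytes as ISO-8859-1 instead, e.g. raw.decode("iso-8859-1"), succeeds but yields six unrelated Latin-1 characters, which is the kind of line noise seen on the first page.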
So all you need is the ability to specify the encoding. No other tricks
like transcoding are necessary.
(Besides, for search.cpan.org in particular, it is a BAD idea to use
character entities other than those for '<>&"', for exactly that reason.)
Therefore I suggest something like
=encoding UTF-8
so that a POD parser can see it and generate something like
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
and make this directive effective until the next occurrence of =encoding,
so you can write
=head1 AUTHOR
=encoding ISO-8859-1
Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp>
=encoding UTF-8
小飼 弾 <dankogai(_at_)dan(_dot_)co(_dot_)jp>
=cut
in the same document.
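As a rough sketch of what a parser could do with such a directive (in Python for brevity): the names split_by_encoding and meta_tag are made up for illustration, the ISO-8859-1 default is an assumption, and this is not how any existing POD tool works.

```python
import re

# Matches a line of the proposed form "=encoding NAME".
ENCODING_RE = re.compile(rb"^=encoding[ \t]+(\S+)[ \t]*$")


def split_by_encoding(pod: bytes, default: str = "ISO-8859-1"):
    """Return a list of (encoding, decoded_text) segments.

    Each =encoding line switches the encoding used for the lines that
    follow it, until the next =encoding line. The default used before
    the first directive is an assumption of this sketch.
    """
    segments = []
    current = default
    buf = []
    for line in pod.splitlines(keepends=True):
        m = ENCODING_RE.match(line.rstrip(b"\r\n"))
        if m:
            # Flush what was collected under the previous encoding.
            if buf:
                segments.append((current, b"".join(buf).decode(current)))
                buf = []
            current = m.group(1).decode("ascii")
        else:
            buf.append(line)
    if buf:
        segments.append((current, b"".join(buf).decode(current)))
    return segments


def meta_tag(encoding: str) -> str:
    """Build the Content-Type meta tag for a single-encoding page."""
    return ('<meta http-equiv="Content-Type" '
            f'content="text/html; charset={encoding}">')


# The AUTHOR example from this message, addresses omitted.
sample = (
    b"=head1 AUTHOR\n"
    b"=encoding ISO-8859-1\n"
    b"Dan Kogai\n"
    b"=encoding UTF-8\n"
    + "\u5c0f\u98fc \u5f3e\n".encode("utf-8")
    + b"=cut\n"
)
segments = split_by_encoding(sample)
```

For a document with a single =encoding, emitting meta_tag() for that encoding is all that is needed. For a mixed document like the one above, an HTML page can declare only one charset, so a converter would presumably decode each segment as shown and write the page out in one encoding.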
Dan the Man with Too Many Languages to Work with