perl-unicode

POD I18N

2003-05-25 12:30:09
Porters,

Would it be possible to add another directive to specify the encoding?

Though pod2man still has some problems handling non-ISO-8859-1 text, HTML has none, and there is no reason you cannot read international POD documents via http://search.cpan.org/ so long as charset= is set correctly. (So far search.cpan.org uses ISO-8859-1 no matter what. Pages containing UTF-8 text, such as the name of AUTRIJUS, are handled with character entities.)

To prove my point, see this page.

http://search.cpan.org/author/DANKOGAI/Text-Kakasi-2.00/Kakasi/JP.pod

Do you see line noise?  Now try the following URL, which converts it.

http://www.dan.co.jp/~dankogai/cgi/chareset/chareset.cgi?i=eucjp&o=utf8&r=1&u=http://search.cpan.org/author/DANKOGAI/Text-Kakasi-2.00/Kakasi/JP.pod

You may not understand what it says, but it should now look good on UTF-8-savvy browsers.

If you are interested in the source of my charset.cgi, try

http://www.dan.co.jp/~dankogai/cgi/chareset/chareset.cgi/src

So all you need is the ability to specify the encoding. No other tricks like transcoding are necessary. (Besides, for search.cpan.org in particular, it is a BAD idea to use character entities other than those for '<>&"', exactly because of that.)
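As a rough illustration of why the label alone matters, here is a minimal Python sketch. The sample string is an assumption chosen for the demo, not text taken from JP.pod. The same bytes are read once under the wrong charset and once under the right one, and nothing is transcoded.

```python
# The same bytes, interpreted under two different charset labels.
text = "日本語"                        # sample text (an assumption, not from JP.pod)
raw = text.encode("euc_jp")            # the bytes as stored in the file

as_latin1 = raw.decode("iso-8859-1")   # wrong label: each byte becomes one Latin-1 character
as_eucjp = raw.decode("euc_jp")        # right label: the original text comes back

print(repr(as_latin1))                 # line noise
print(as_eucjp)                        # readable text
```

The bytes never change; only the declared encoding does.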

Therefore I suggest something like

=encoding UTF-8

so that a POD parser can see it and generate something like

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
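A converter could derive that tag along the following lines. This is only a sketch in Python: the function name `meta_for_pod` is hypothetical, and falling back to ISO-8859-1 when no directive is present is an assumption based on what search.cpan.org does today.

```python
import re

def meta_for_pod(pod_source, default="ISO-8859-1"):
    """Build a Content-Type <meta> tag from the first =encoding directive.

    Hypothetical helper; the ISO-8859-1 default is an assumption.
    """
    match = re.search(r"^=encoding[ \t]+(\S+)", pod_source, flags=re.MULTILINE)
    charset = match.group(1) if match else default
    return '<meta http-equiv="Content-Type" content="text/html; charset=%s">' % charset

print(meta_for_pod("=encoding UTF-8\n\n=head1 NAME\n"))
```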

The directive would stay in effect until the next occurrence of =encoding, so you can write

=head1 AUTHOR

=encoding ISO-8859-1

Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp>

=encoding UTF-8

小飼 弾 <dankogai(_at_)dan(_dot_)co(_dot_)jp>

=cut

in the same document.
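One way a parser might honor such a scoped directive is sketched below in Python. The function name `decode_pod` is hypothetical. The sketch assumes paragraphs are separated by blank lines and that every encoding used is ASCII-compatible, so the directives themselves can be found in the raw bytes.

```python
import re

def decode_pod(raw, default="iso-8859-1"):
    """Decode POD bytes paragraph by paragraph, switching at each =encoding.

    Hypothetical sketch; assumes ASCII-compatible encodings throughout.
    """
    current = default
    paragraphs = []
    for para in re.split(rb"\n\s*\n", raw):
        directive = re.match(rb"=encoding[ \t]+(\S+)", para)
        if directive:
            # Switch encodings; the directive itself produces no output.
            current = directive.group(1).decode("ascii")
            continue
        decoded = para.decode(current).strip()
        if decoded:
            paragraphs.append(decoded)
    return paragraphs

raw = (
    b"=head1 AUTHOR\n\n"
    b"=encoding ISO-8859-1\n\n"
    + "Dan Kogai".encode("iso-8859-1") + b"\n\n"
    b"=encoding UTF-8\n\n"
    + "小飼 弾".encode("utf-8") + b"\n\n"
    b"=cut\n"
)
print(decode_pod(raw))
```

Each paragraph is decoded with whichever encoding was declared most recently, so both names come out intact from a single file.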

Dan the Man with Too Many Languages to Work with
