Hi,
I've made a tiny PerlFixupHandler that guesses a file's encoding and
automatically adds a charset attribute to Content-Type according to the
guess, using Encode::Guess and I18N::Charset.
This module can be a powerful replacement or supplement for Apache's
Add*Charset directives. It can be downloaded from
http://bulknews.net/lib/archives/Apache-GuessCharset-0.01.tar.gz
This module uses I18N::Charset internally to translate encoding
names into IANA-registered names (and its own table to convert those to
the preferred MIME names). How about porting this module into Encode?
It might be useful there.
Thanks,
--
Tatsuhiko Miyagawa <miyagawa(_at_)edge(_dot_)co(_dot_)jp>
NAME
Apache::GuessCharset - adds HTTP charset by guessing file's encoding
SYNOPSIS
PerlModule Apache::GuessCharset
SetHandler perl-script
PerlFixupHandler Apache::GuessCharset
# how many bytes to read for guessing (default 512)
PerlSetVar GuessCharsetBufferSize 1024
# list of encoding suspects
PerlSetVar GuessCharsetSuspects euc-jp
PerlAddVar GuessCharsetSuspects shiftjis
PerlAddVar GuessCharsetSuspects 7bit-jis
DESCRIPTION
Apache::GuessCharset is an Apache handler that adds an HTTP charset
attribute by automatically guessing files' encodings via Encode::Guess.
CONFIGURATION
This module uses the following configuration variables.
GuessCharsetSuspects
    A list of encodings for "Encode::Guess" to check. See the
    Encode::Guess manpage for details.
GuessCharsetBufferSize
    Specifies how many bytes this module reads from the source file
    in order to guess its encoding. The default is 512.
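The behaviour described above (read the first GuessCharsetBufferSize bytes of a file, then ask Encode::Guess to choose among the suspects) can be sketched outside Apache roughly as follows. This is only an illustration, not the module's actual code: the helper names guess_file_encoding and content_type_with_charset are hypothetical, and the sketch assumes Encode's own mime_name method for the MIME name in place of the I18N::Charset lookup the module uses.

```perl
use strict;
use warnings;
use Encode::Guess;    # exports guess_encoding()

# Hypothetical helper (not the module's code): guess the encoding of
# $path from its first $bufsize bytes, trying the given suspects.
sub guess_file_encoding {
    my ($path, $bufsize, @suspects) = @_;
    open my $fh, '<:raw', $path or return;
    my $buf = '';
    read $fh, $buf, $bufsize;
    close $fh;

    # guess_encoding() returns an encoding object on success and a
    # plain error string when the guess fails or is ambiguous.
    my $enc = guess_encoding($buf, @suspects);
    return ref $enc ? $enc : undef;
}

# Hypothetical helper: append a charset attribute when a guess succeeds,
# otherwise return the content type unchanged.
sub content_type_with_charset {
    my ($type, $path, $bufsize, @suspects) = @_;
    my $enc = guess_file_encoding($path, $bufsize, @suspects)
        or return $type;
    # Assumption: use Encode's mime_name, falling back to the Encode name.
    my $charset = $enc->mime_name || $enc->name;
    return "$type; charset=$charset";
}
```

Calling content_type_with_charset('text/html', $file, 1024, qw(euc-jp shiftjis 7bit-jis)) would mirror the settings in the SYNOPSIS. Note that a short sample that is valid in more than one suspect encoding yields no guess, in which case the sketch leaves the content type untouched.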
AUTHOR
Tatsuhiko Miyagawa <miyagawa(_at_)bulknews(_dot_)net>
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
SEE ALSO
the Encode::Guess manpage, the Apache::File manpage