Re: archiving UTF-8 encoded messages

2000-12-23 14:26:48
On December 19, 2000 at 10:04, Eric Marsden wrote:

I see that this subject has been touched on in the list archives. It
would be nice if MHonArc were able to correctly archive UTF-8 encoded
messages. Currently a false http-equiv "charset=ISO-8859-1" is used.

MHonArc does not set http-equiv by default.  If the meta tag
is showing up, then it is in whatever resource file you are using.

Assuming that common browsers can handle this, would it be difficult
(and sufficient?) to use a "charset=UTF-8" http-equiv instead? Perhaps
by augmenting %readmail::MIMECharSetConverters?

Unicode support has not been integrated into MHonArc.  Charset
converters need to be created, and then some mechanism is needed for
a filter to specify what the charset is for the entire message page
written.  Since there is nothing prohibiting a raw message from having
parts with different charsets, a routine is needed that converts
from various charsets into Unicode/UTF-8.  I have not checked
in a while whether there are any pure Perl implementations of character
set translation modules.  I also have not checked what level of
UTF-8 support there is in Perl 5.6.
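
To make the idea of a per-part conversion routine concrete, here is a
rough sketch.  It is written in Python rather than Perl purely for
brevity, and it is not MHonArc code: the function name parts_to_utf8
and the fallback to Latin-1 for unrecognized charset names are
assumptions made for illustration only.

```python
from email import message_from_bytes


def parts_to_utf8(raw_message: bytes) -> list:
    """Return every text part of a MIME message re-encoded as UTF-8.

    Hypothetical helper: each part is decoded with the charset it
    declares, so parts with different charsets end up in one encoding.
    """
    converted = []
    msg = message_from_bytes(raw_message)
    for part in msg.walk():
        # Skip multipart containers and non-text parts (images, etc.).
        if part.get_content_maintype() != "text":
            continue
        # Undo the transfer encoding (base64 / quoted-printable).
        payload = part.get_payload(decode=True)
        # Charset declared on this part; plain ASCII if none is given.
        charset = part.get_content_charset() or "us-ascii"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name: assumed fallback, never fails.
            text = payload.decode("latin-1")
        converted.append(text.encode("utf-8"))
    return converted
```

Once every part is in UTF-8, the page as a whole could declare a single
"charset=UTF-8", which is the piece a filter cannot currently
communicate back.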

Any contributions are welcome,

