Jeff Breidenbach <jeff(_at_)jab(_dot_)org> wrote:
This is the problem: HTML does not support mixed character sets.
Also, the charset affects the entire HTML document. Therefore, your
resource settings would have to conform with the charset, and this
can be a big problem if messages existing in the archive have different
specified charsets. It would be hard to guarantee that all messages
will use the same charset.
I think I understand ... is this right?
If a single email contains two different character sets,
you're screwed, I understand that.
A potential solution would be to put the different message parts into
different files in the archive, and use the remainder of the message as
a container for URLs to those files, mimicking the MIME message
structure in HTML. I haven't looked to see how/if one can do that in
MHonArc, but this seems like a problem similar to archiving a multi-part
message with multiple graphic inclusions.
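
To make the idea concrete, here is a rough sketch in Python (not MHonArc code, and not a claim about how MHonArc works). It writes each text part to its own page, declared in that part's own charset, and builds a container list of links. The names split_text_parts and usable_charset and the partN.html file naming are made up for illustration.

```python
import codecs
from email import message_from_bytes
from html import escape


def usable_charset(name):
    """Return name if Python knows the codec, else fall back to us-ascii."""
    try:
        codecs.lookup(name)
        return name
    except LookupError:
        return "us-ascii"


def split_text_parts(raw):
    """Return (container_html, {file_name: page_bytes}) for one raw message."""
    msg = message_from_bytes(raw)
    pages = {}
    links = []
    for index, part in enumerate(msg.walk()):
        # Skip multipart containers and non-text parts (images etc.).
        if part.get_content_maintype() != "text":
            continue
        charset = usable_charset(part.get_content_charset() or "us-ascii")
        payload = part.get_payload(decode=True) or b""
        body = payload.decode(charset, errors="replace")
        name = f"part{index}.html"  # hypothetical naming scheme
        page = (
            '<html><head><meta http-equiv="Content-Type" '
            f'content="text/html; charset={charset}"></head>'
            f"<body><pre>{escape(body)}</pre></body></html>"
        )
        # Each page is stored in the charset it declares.
        pages[name] = page.encode(charset, errors="xmlcharrefreplace")
        label = escape(f"{part.get_content_type()} ({charset})")
        links.append(f'<li><a href="{name}">{label}</a></li>')
    container = "<ul>\n" + "\n".join(links) + "\n</ul>"
    return container, pages
```

The container itself is plain ASCII, so it can sit in a page of any charset without conflict.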
If two emails are received, each with a different character set
1) you are screwed on index pages, which will have a bunch
of subject lines from different character sets
Perhaps translate those to Unicode, as I think you're suggesting.
OTOH, char-sets in headers may be encoded in the RFC 2047 style,
so any translation may break out as a different problem anyway.
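
For what it's worth, the RFC 2047 decoding step on its own is not much code. A minimal Python sketch, using the standard email.header module; the function name subject_to_unicode is hypothetical, and falling back to the raw header on an unknown or broken charset is just one possible policy:

```python
from email.header import decode_header, make_header


def subject_to_unicode(raw_subject):
    """Decode RFC 2047 encoded-words in a header value to a Unicode string.

    Falls back to the undecoded header if the charset is unknown or the
    bytes do not decode (an assumption, not the only reasonable choice).
    """
    try:
        return str(make_header(decode_header(raw_subject)))
    except (LookupError, UnicodeDecodeError):
        return raw_subject


if __name__ == "__main__":
    for raw in ("=?iso-8859-1?q?caf=E9?=", "=?koi8-r?B?0NLJ18XU?="):
        # Both subjects can now share one index page, e.g. served as UTF-8.
        print(subject_to_unicode(raw))
```

Once every subject is a Unicode string, the index page can be written in a single encoding regardless of what each message used.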
2) you are screwed on message pages, because navigational aids
like the word "follow-ups" will be in a different character set
from the messages.
Rely on a graphical interface instead of text?
BTW, while I applaud the desire to display localized headers I hope that
any reply/follow-up interface is sending the canonical RFC 822 & later
headers and keywords "on the wire", and not helping create messages like:
Subject: Re: Sv: Re: Ab: Re:
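
A reply interface could avoid that pile-up by collapsing whatever prefixes are already there before adding its own. A small Python sketch; the prefix list (re, sv, ab, aw) is only illustrative, taken from the example above plus German "AW", and canonical_reply_subject is a made-up name:

```python
import re

# Illustrative list of reply prefixes; extend as needed for a given archive.
_REPLY_PREFIXES = re.compile(r"^(?:\s*(?:re|sv|ab|aw)\s*:\s*)+", re.IGNORECASE)


def canonical_reply_subject(subject):
    """Strip any run of leading reply prefixes and emit a single "Re: "."""
    base = _REPLY_PREFIXES.sub("", subject).strip()
    return f"Re: {base}" if base else "Re:"


if __name__ == "__main__":
    print(canonical_reply_subject("Re: Sv: Re: Ab: Re: charset question"))
```

The localized label can still be shown to the user in the interface; only the header that goes out on the wire is normalized.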