mhonarc-users

Re: Subjects,threads and MINE

2002-08-08 13:04:36
On August 8, 2002 at 16:18, "Birger Mortensen" wrote:

I am using MHonArc v2.5.7

You may want to upgrade to v2.5.11.

Linking of threads works great, BUT not if the subject is MIME encoded like
this:
Subject: =?iso-8859-1?Q?Mellem=F8sten_og_Moses_Hansen?=
And the next one, which is a reply to the first one:
Subject: =?iso-8859-1?Q?RE=3A_=5BDebat=5D_Mellem=F8sten_og_Moses_Hansen?=
In that case it doesn't work.

Non-ASCII encoded text is known to cause some problems with certain
functional aspects of MHonArc.  It's an ugly problem when trying to
support charset soup.

I have removed the "=5BDebat=5D" from the subject (by using <SUBJECTSTRIPCODE>).
The string is the prefix for the list (in clear text, [Debat]).
But it looks like MHonArc doesn't use the same resource for linking subjects.
I think that I need to tell MHonArc that "=3A" is ":" and that "_" is " "
(space).

Has anyone got a <SUBJECTREPLYRXP> that can do the trick? I am using:

<SUBJECTREPLYRXP>
\s*(re|sv|fwd|fw|Re|Sv|Fwd|Fws|RE|SV)[\[\]\d]*[:>-]+\s*
</SUBJECTREPLYRXP>

Yes, there is no starting "^".
Is this the way to go, or is there a way to get MHonArc to use the decoded
subject after SUBJECTSTRIPCODE, just as $SUBJECTNA$ looks?

A possible solution:

If you know that all the mail going into your archives uses the same
character set (in your case iso-8859-1), then you could try:

  <DecodeHeads>
  <CharsetConverters>
  iso-8859-1; -decode-
  </CharsetConverters>

With these settings, the iso-8859-1 encoded data will be decoded when
the mail is first read into MHonArc, and the header data will be stored
in decoded form.
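
To see why decoding first helps with threading, here is a minimal Python sketch. It is not MHonArc code and makes no claim about MHonArc's internals; `decode_subject`, `thread_key`, and both patterns are hypothetical names made up for illustration. It only shows that a reply pattern cannot match the raw encoded-word, but does match once the RFC 2047 encoding is undone.

```python
import re
from email.header import decode_header, make_header

# Hypothetical patterns for illustration only (not MHonArc resources).
REPLY_RX = re.compile(r"^\s*(re|sv|fwd|fw)[\[\]\d]*[:>-]+\s*", re.IGNORECASE)
LIST_PREFIX_RX = re.compile(r"\[Debat\]\s*")


def decode_subject(raw: str) -> str:
    """Undo RFC 2047 encoded-words, e.g. '=3A' -> ':' and '_' -> ' '."""
    return str(make_header(decode_header(raw)))


def thread_key(raw: str) -> str:
    """Decode, drop the list prefix, then strip reply prefixes repeatedly."""
    subject = LIST_PREFIX_RX.sub("", decode_subject(raw))
    previous = None
    while previous != subject:
        previous = subject
        subject = REPLY_RX.sub("", subject)
    return subject.strip().lower()


first = "=?iso-8859-1?Q?Mellem=F8sten_og_Moses_Hansen?="
reply = "=?iso-8859-1?Q?RE=3A_=5BDebat=5D_Mellem=F8sten_og_Moses_Hansen?="

# The raw header starts with '=?', so the reply pattern has nothing to match.
raw_match = REPLY_RX.match(reply)

# After decoding, both messages reduce to the same key.
same_thread = thread_key(first) == thread_key(reply)
```

With the subjects stored in decoded form, a plain-text reply pattern sees "RE:" instead of "RE=3A", which is the effect the settings above are meant to achieve.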

You can also do the following (from the Example in DECODEHEADS):

  <DecodeHeads>
  <CharsetConverters override>
  plain; mhonarc::htmlize;
  default; -decode-
  </CharsetConverters>

This forces all data to be mapped into the locale you are using.
It is not playing nice, but if your archives only see mail from
one locale, then it does simplify things.  And even if there are
slight variants (e.g., iso-8859-1, windows-1252), the anomalies may
be tolerable.
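
As a rough illustration of what such anomalies look like (plain Python, nothing MHonArc-specific; `decode_as` and the sample byte strings are made up for this sketch): iso-8859-1 and windows-1252 agree on letters such as 0xF8, but differ in the 0x80-0x9F range, where windows-1252 puts printable characters like curly quotes.

```python
def decode_as(data: bytes, charset: str) -> str:
    """Interpret raw bytes under the given charset."""
    return data.decode(charset)


danish = b"Mellem\xf8sten"   # 0xF8 means the same letter in both charsets
quoted = b"\x93Debat\x94"    # 0x93/0x94 are curly quotes only in windows-1252

# Identical result either way: the common case is unaffected.
same_letters = decode_as(danish, "iso-8859-1") == decode_as(danish, "windows-1252")

# Differs: iso-8859-1 yields C1 control characters instead of quotes.
as_latin1 = decode_as(quoted, "iso-8859-1")
as_cp1252 = decode_as(quoted, "windows-1252")
```

So treating windows-1252 mail as iso-8859-1 leaves ordinary text intact and only garbles the handful of characters in that range.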

If you want to try to deal with all the variant charsets properly,
you could try converting all mail into Unicode.  There is an example
utf-8.mrc resource file in v2.5.11 that can get you started (make
sure to review the Notes listed at the beginning of the resource
file).
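
The general idea of normalizing to Unicode can be sketched in Python as follows. This is unrelated to the contents of utf-8.mrc, and `header_to_utf8` is a hypothetical helper; the point is only that the same byte means different characters under different charsets, and converting each piece by its declared charset gives one unambiguous representation.

```python
from email.header import decode_header, make_header


def header_to_utf8(raw: str) -> bytes:
    """Decode RFC 2047 encoded-words by their declared charset, emit UTF-8."""
    text = str(make_header(decode_header(raw)))
    return text.encode("utf-8")


# The same byte 0xB3, declared under two different charsets.
latin1 = header_to_utf8("=?iso-8859-1?Q?=B3?=")   # superscript three
latin2 = header_to_utf8("=?iso-8859-2?Q?=B3?=")   # l with stroke
```

Once everything is UTF-8, the two headers are distinct byte sequences and no longer depend on knowing the original charset.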

However, the example utf-8.mrc does not deal with the non-ASCII encoded
data issue you are experiencing, and a solution is more complicated
and requires some Perl programming.  If there is interest, I could
come up with some code and explain how to hook it in.

--ewh

