Re: Archiving sent mail; Attachments with non-ascii names; Preserving charset of message

2002-12-16 18:11:28
On Mon, 16 Dec 2002, Earl Hood wrote:

I needed to archive sent mail with MHonArc and I needed to put the
contents of the To: header into the message index. It was not possible
with MHonArc-2.5.13, so I wrote a small patch that added rc-variable $TO$.

The preferable method is to allow for arbitrary message header
variables instead of just To:.  Otherwise, you end up replicating
code when people want 'cc' or other fields.

Of course. But to do it right you have to have a lot of time and
know the language you are working with rather well. Lacking both, I
chose the ugly way.

I just hope somebody will pick this up and do it the right way. I
just think this is better than nothing.

2. Attachments with non-ascii names

I had problems accessing attachments extracted with MHonArc from
Windows if their names contained non-ascii characters or characters
forbidden in file names: \/:*?"<>| (when using m2h_external::filter;

It would probably be more efficient to just exclude whitespace and
non-ascii characters in one tr// operation:

  $fname =~ tr/\0-\40\t\n\r\177-\377/_/;

I think the characters mentioned earlier, which are not allowed in a
Windows environment, should also not be used in filenames. I don't know
if MHonArc allows '/' or '\' in filenames, but at least these could make
directory traversal vulnerabilities possible.
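
To make the suggestion concrete, here is a minimal sketch of a single
tr// that also covers the Windows-forbidden characters. The sub name
sanitize_filename is hypothetical and not part of MHonArc, and the
sketch assumes the name is a byte string (as the \177-\377 range in the
one-liner above already does).

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MHonArc: replace every byte that is
# a control character, whitespace, non-ascii, or forbidden in Windows
# file names (\ / : * ? " < > |) with an underscore.
sub sanitize_filename {
    my ($fname) = @_;
    # \0-\40 already includes tab, newline and carriage return.
    $fname =~ tr{\0-\40\177-\377\\/:*?"<>|}{_};
    return $fname;
}

print sanitize_filename("r\351sum\351: draft?.txt"), "\n";
```

Since '/' and '\' are replaced, a name can no longer contain a path
separator. A name consisting only of dots (such as "..") passes through
unchanged, so that case would still need a separate check.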

This feature is insufficient.  It assumes that messages only
contain a single text entity part, which of course is wrong when
dealing with MIME messages.

Of course. But it is sufficient for over 99% of mails - again, an
imperfect solution is better than no solution.

As for the un-grepable utf-8, I think people will eventually have
to deal with it if they want to have archives that are

I think one iso character set chosen by configuration, plus numeric
unicode entities for characters that cannot be displayed in it, would
be perfect

Unfortunately, HTML does not allow mixed character encodings in
the same document, making things problematic when trying to
convert MIME mail into HTML.

Again - numeric unicode entities. Also, some character sets are
convertible to others (for example, us-ascii to just about any).
UTF-8 would be the more proper solution; a chosen character set would
be more practical (for now).
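
A rough sketch of the "chosen charset plus numeric entities" idea,
assuming a Perl that ships the Encode module (5.8 or later) and text
that has already been decoded to Perl characters. The sub name
to_archive_charset is hypothetical; nothing here is MHonArc code.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper, not MHonArc code: encode decoded text into the
# archive's configured charset, writing any character that charset
# cannot represent as a decimal numeric entity (&#NNNN;).
sub to_archive_charset {
    my ($text, $charset) = @_;
    # $text is a private copy of the caller's string, so it does not
    # matter whether encode() modifies its argument.
    return encode($charset, $text, Encode::FB_HTMLCREF());
}

# e-acute exists in iso-8859-1; the euro sign (U+20AC) does not.
my $bytes = to_archive_charset("caf\x{e9} \x{20ac}", "iso-8859-1");
print $bytes, "\n";
```

The sketch does not escape a literal '&' or '<' in the text, so it
would have to run after the usual HTML escaping, not instead of it.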

Best wishes
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
                                                      Winnie the Pooh
