Steven wrote:
1) Replacing par does indeed fix one of the three failed tests.
Progress!
...so clearly I need to replace elinks in my html_to_text script, and doing
that will solve the problem that prompted this discussion, leaving the
following questions:
1) What's the best replacement for elinks?
mhn.defaults.sh looks for text/html helpers in this order:
1. w3m
2. lynx
3. elinks
I don't know if one is necessarily "better" than another.
If you have suggestions on how to improve the arguments that mhn.defaults.sh
uses for elinks, please let us know.
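
The lookup order above amounts to "use the first of these found on PATH". A minimal sketch of that idea follows; the function name first_available is hypothetical, and this does not reproduce what mhn.defaults.sh actually does or the arguments it passes to each program:

```shell
# Print the name of the first listed program that exists on PATH.
# Returns non-zero if none of them is found.
# (first_available is a made-up name for illustration only.)
first_available() {
    for prog in "$@"; do
        if command -v "$prog" >/dev/null 2>&1; then
            printf '%s\n' "$prog"
            return 0
        fi
    done
    return 1
}

# Same order as the list above.
helper=$(first_available w3m lynx elinks) || helper=
echo "text/html helper: ${helper:-none found}"
```

Running it on your system shows which helper would be picked first if elinks were no longer the only one installed.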
2) Should I replace my 1.7.1 installation by the version I just built?
Basically I'm asking what benefits the current snapshot has over
1.7.1,
See docs/pending-release-notes.
and how far away the next numbered release might be.
Unknown. Ken appears to be busy. One of us here could push it out. It's
been almost 4 years so I think that would be a good idea. Perhaps after
things here settle down a bit.
3) How can I guarantee that messages will be saved with quoted-printable
or base64 parts decoded, without patching mhfixmsg to deal with
messages in which the decoded text would be more than 998 characters
long?
I don't know your reason for patching mhfixmsg. IIRC, you were using
-decodetext 8bit; binary instead of 8bit might help. The mhfixmsg man
page might provide some insight.
That raises some further questions:
- Why wasn't the text/html part converted to utf-8?
mhfixmsg only converts the character set of text/plain. That was a
design decision. Other subtypes can be extracted with mhstore and run
through iconv. If there's a use for converting them in place in
mhfixmsg, it wouldn't be difficult but I'm not sure how useful it
would be.
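
The extract-then-convert approach could look roughly like the sketch below. The helper name to_utf8 is hypothetical, as are the message number, part number, source charset, and output filename in the usage comment; it also assumes that mhstore's -outfile - writes the part to standard output, so check mhstore(1) for your version:

```shell
# Convert text on stdin from the charset named in $1 to UTF-8.
# (to_utf8 is a made-up helper name; iconv does the actual work.)
to_utf8() {
    iconv -f "$1" -t UTF-8
}

# Hypothetical usage; 42, 2, ISO-8859-1 and the filename are placeholders:
#   mhstore 42 -part 2 -outfile - | to_utf8 ISO-8859-1 > part2.utf8.html
```

This leaves the stored message untouched and only converts the extracted copy.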
- Regardless of the answer to the previous question, after a
message has been refiled (and assuming I'm not planning to
resend it to anyone), is there a practical difference between
binary and 8bit encoding?
From the mhfixmsg man page: "Note that -decodetext binary can produce
messages that are not compliant with RFC 5322, §2.1.1." That section
limits each line to 998 characters, excluding the CRLF.
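
To see whether a particular decoded message actually runs into that limit, something like the following sketch can help. The function name long_lines is hypothetical, and it counts bytes rather than characters (LC_ALL=C), which is an assumption about how you want to measure:

```shell
# List lines longer than 998 bytes, the line length limit in
# RFC 5322 section 2.1.1 (the limit excludes the terminating CRLF).
# Exit status 0 means no over-long lines were found.
# (long_lines is a made-up name for illustration only.)
long_lines() {
    LC_ALL=C awk '
        { sub(/\r$/, "") }              # ignore a trailing CR, if any
        length($0) > 998 {
            printf "line %d: %d bytes\n", NR, length($0)
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Hypothetical usage:
#   long_lines /path/to/message || echo "exceeds the 998 limit"
```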
- Why are the headers of the decoded message identical to those
of the input, despite the use of -decodeheaderfieldbodies?
(...and yes, the unmodified version of the message does contain
some encoded headers that my decode_headers program found and
decoded; mhfixmsg appears not to have done so).
Is it a proper MIME message (does mhfixmsg return with a non-zero exit
status)? If so, can you send it to me off-line?
The test suite has a case, boiled down a bit here:
$ cat test1
To: recipient@example.com
From: sender@example.com
Date: Wed, 28 Sep 2016 11:24:28 -0400
Subject: =?utf-8?B?dGhpcyBTdWJqZWN0IHdhcyBVVEYtOCBlbmNvZGVk?=
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=001a114dd3e8fe9c56053d92f414
Content-Transfer-Encoding: 8bit

--001a114dd3e8fe9c56053d92f414
Content-Type: text/plain; charset=UTF-8

This is a test.
--001a114dd3e8fe9c56053d92f414--
$ mhfixmsg -file test1 -out - -decodeheader utf-8 | diff - test1
4c4
< Subject: this Subject was UTF-8 encoded
---
> Subject: =?utf-8?B?dGhpcyBTdWJqZWN0IHdhcyBVVEYtOCBlbmNvZGVk?=
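
The Subject in this test is an RFC 2047 "B" (base64) encoded-word of the form =?charset?B?payload?=, so the payload can be checked by hand. A small sketch, assuming a base64 utility that accepts -d (GNU coreutils does); decode_b_word is a hypothetical helper name, and it does no charset conversion:

```shell
# Decode the payload of a single RFC 2047 "B" encoded-word.
# The result is left in whatever charset the encoded-word names.
# (decode_b_word is a made-up name; base64 -d is assumed to exist.)
decode_b_word() {
    word=$1
    payload=${word#"=?"*"?"[Bb]"?"}   # strip the leading =?charset?B?
    payload=${payload%"?="}           # strip the trailing ?=
    printf '%s' "$payload" | base64 -d
}

decode_b_word '=?utf-8?B?dGhpcyBTdWJqZWN0IHdhcyBVVEYtOCBlbmNvZGVk?='
echo
# prints: this Subject was UTF-8 encoded
```

If a header in your message decodes this way by hand but mhfixmsg leaves it alone, that would be worth including when you send the message.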
David