Re: [nmh-workers] Formatting HTML to Text: netrik.

2019-06-27 19:48:57
 - replfilter depends on 'par', which users may not have installed, but
   the failure mode in its absence is suboptimal: the quoted text is
   absent, and the error shown on the command line is something like

     Pipe reader process exited with 72057594037927935

   .. could well be enough just to list the required support tools in
   the instructions at the start of replfilter.

Fair enough; it's down a bit farther, but putting a bit at the top saying
you might want to look at that (and the HTML converter) would be helpful.
You could also use "fmt", but I've found par tends to work better.
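For illustration only, here is a minimal sketch of the kind of up-front
check that would turn the confusing pipe error into a readable message.
It is not code from replfilter; the function name missing_tools() and
the tool list are assumptions made up for this example.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    # shutil.which() returns None when no executable of that name is found.
    return [tool for tool in tools if shutil.which(tool) is None]


# "par" is the formatter discussed above; "fmt" is the possible fallback.
missing = missing_tools(["par", "fmt"])
if missing:
    print("missing support tools: " + ", ".join(missing))
```

A filter script could run a check like this before opening any pipes
and exit with a clear message instead of dropping the quoted text.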

 - Reading the manpage I see that .mh_profile comments are supposed to
   start '#:', but lines starting '#' give every practical appearance
   of working as comments too — I've <ahem> wrongly used that for years
   — except that if you have two of them in a row, nmh commands fail
   with "no blank lines are permitted in .mh_profile".  The error at
   least seems confused.  Is the colon in '#:' just for easy parsing?
   Would it break things for # to introduce a comment?

Weeeellll.... yes.

You may notice on closer inspection that mh-profile actually looks
a lot like a message header, in that no blank lines are permitted and
it consists of a header field name ('name:') and header field text.
This is not a coincidence.  The same routine used for parsing message
headers is also used to parse the profile (and context files, and
message sequence files ... sigh).

So we'd have to either introduce some special-case code during profile
parsing, or change the email parser code (ugh).  Both of these are hard;
the function used during message parsing (m_getfld()) really takes over
the input stream and does a fair amount of caching for efficiency, so
you can't easily look at the input stream and say, 'Oh, this starts with
a #, skip this line'.  And for changing m_getfld() ... well, take a look
at it sometime and tell me if YOU want to mess with it.  Welcome to
how the sausage is made :-/

What you're doing when you put in '#:' is creating a special profile
entry called '#' which is not used for anything, but it looks like a
header field just enough that m_getfld() is happy.  I forget who
pointed this out on the mailing list many years ago, but the easiest
solution was just to document the current behavior.
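To make that concrete, here is a toy sketch in Python of a header-style
parser applied to a profile.  It is NOT m_getfld() and does not try to
reproduce its caching, continuation-line handling, or its exact
behavior on bare '#' lines; parse_profile() and the sample entries are
made up for this example.  It only shows why '#:' gets through: to a
'name: text' parser it is simply a field whose name is '#'.

```python
def parse_profile(text):
    """Parse 'name: text' lines the way a simple header parser might."""
    entries = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # Header-style input has no room for blank lines.
            raise ValueError(f"line {lineno}: no blank lines are permitted")
        name, sep, value = line.partition(":")
        if not sep:
            # No colon means no field name, so this toy parser gives up.
            raise ValueError(f"line {lineno}: not a 'name: text' field")
        entries.setdefault(name, []).append(value.strip())
    return entries


sample = "Path: Mail\n#: this is a comment\nEditor: vi\n"
profile = parse_profile(sample)
# The comment is stored as an ordinary entry named '#'; nothing reads it.
print(profile["#"])
```

In this sketch a line such as '# plain comment' is rejected outright
because it has no colon; real nmh is more forgiving in some cases, as
described in the question above.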

I have been playing around with a flex-based email header parser; if
that gets working it would be very easy to create a slight modification
that handled '#' based comments for things like the profile.  That
would be part of my "full MIME parsing" work and wouldn't be done for
a while.

Finally, I'd almost be inclined to have nmh-without-replfilter display a
message about replfilter, for example maybe in whatnowproc, so after
grinding your teeth about the undecoded base64 you at least see a
message suggesting a remedy for this after exiting the editor.  I
realise though that accurate detection of circumstances where it would
be helpful to display such a message might not be easy, but it would save a
certain amount of repetition.

I am sympathetic to that idea ... the problem is that it requires making
mhl be smarter about what is and isn't a MIME body.  Right now mhl knows
nothing of MIME; it just sees the message body as one big text blob.
Making it do MIME is a lot of work.  In a perfect "full MIME parsing"
world it would just DTRT and replyfilter could vanish.  There are sadly
no wonderful solutions.
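
As a rough illustration of the difference between "one big text blob"
and a MIME-aware view, here is a sketch using Python's standard email
package.  It has nothing to do with how mhl is implemented; the helper
names and the sample message are assumptions made up for this example.

```python
import base64
from email import message_from_string

# Build a tiny single-part message whose body is base64 encoded.
encoded = base64.b64encode("Hello, world!".encode("utf-8")).decode("ascii")
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + encoded + "\n"
)


def body_as_blob(message_text):
    """Return the body untouched, as a MIME-unaware tool would see it."""
    return message_from_string(message_text).get_payload()


def body_decoded(message_text):
    """Return the body after undoing the Content-Transfer-Encoding."""
    msg = message_from_string(message_text)
    charset = msg.get_content_charset() or "us-ascii"
    return msg.get_payload(decode=True).decode(charset)


print(body_as_blob(raw).strip())   # the undecoded base64 text
print(body_decoded(raw))           # the readable text
```

The first call is roughly what lands in a reply draft when nothing
decodes the body; the second is what a MIME-aware formatter would quote.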


