Re: [Nmh-workers] mhfixmsg on a pathological mail

Hi Ken,

> I would agree this is easy to miss, and it is confusing at the top of
> the man page where you could definitely read it as saying the whole
> file is a mhbuild composition file, rather than just the body.  Maybe
> you'd be willing to add some man page changes for 1.7?


I think 1.7 should be pushed out the door as soon as it's decided we're
happy with its new features, i.e. after -prefer's reversal as that looks
like Paul's favourite, and there are no trivially triggered faults that
leave users worse off than with 1.6.  Plenty of problems will remain,
but since they've been there a while, e.g. mhbuild(1)'s confusion,
typically without receiving complaint, they can wait a bit longer.

> Hm.  You know ... I can't actually get that to happen if #< is the
> FIRST line.  The second line, yes.  Looking at m_getfld() it seems
> that if we get something that isn't a header, we simply punt over it
> (and silently eat it) and go on the assumption everything after that
> point is the body.  So that seems right.


I agree that's what's happening;  the SEGV is only after a blank line.
Silently ignoring it seems a bug IMO.  How much better it would be if
there were an error about `#<foo/bar' being an invalid email header.  :-)
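
As a rough sketch of that stricter behaviour, the check below rejects a
line that cannot start a header field instead of skipping it.  It is
written in Python rather than nmh's C, it is not what m_getfld() does,
and check_header_line() is a hypothetical name.  The field-name pattern
follows RFC 5322: printable ASCII other than a colon, then a colon.

```python
import re

# RFC 5322 field name: one or more printable ASCII characters
# (codes 33-126) other than ':', immediately followed by ':'.
FIELD_NAME = re.compile(r"^[!-9;-~]+:")


def check_header_line(line, lineno):
    """Raise ValueError if `line` cannot start a header field.

    Hypothetical helper for illustration; folded continuation
    lines are deliberately not handled here.
    """
    if not FIELD_NAME.match(line):
        raise ValueError(
            "line %d: %r is not a valid email header" % (lineno, line))


check_header_line("Subject: pathological mail", 1)   # accepted quietly
try:
    check_header_line("#<foo/bar", 1)
except ValueError as err:
    print(err)
```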

> > If I add a blank line before `#<' then we're in business; I get
> > foo/bar.  No CTE.  Is that OK because the default is 8bit?


> Ummm .... no.  It's because it has no idea what foo/bar is and is
> probably falling through some switch statement somewhere.


scan_content() perhaps.

> Seriously, Ralph, what are you trying to accomplish here, other than
> delay the release of 1.7? :-)


I'm trying to point out these are not 1.7 stoppers.  :-)
Nor's this.

    $ printf '%s\n' '' '#<foo/bar/xyzzy' 'Wot no wizard.' |
    > uip/mhbuild -
    MIME-Version: 1.0
    Content-Type: foo/bar
    Content-ID: <24855.1504354753.1@orac.inputplus.co.uk>
    Content-MD5: +kx1Z6EwfyHjstJbq7OEHQ==

    Wot no wizard.
    $

Nor is it that multiple encoding requests, which violate the grammar,
are silently accepted, with the first one winning.

    $ printf '%s\n' '' '#<text/plain *qp *b64' a£d |
    > uip/mhbuild - |
    > grep -i content-transfer-encoding
    Content-Transfer-Encoding: quoted-printable
    $
    $ printf '%s\n' '' '#<text/plain *b64 *qp' a£d |
    > uip/mhbuild - |
    > grep -i content-transfer-encoding
    Content-Transfer-Encoding: base64
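
A stricter parser could refuse the directive rather than let the first
request win.  The sketch below is Python, not mhbuild's code:
pick_encoding() is a hypothetical name, the whitespace tokenising is
deliberately naive (it would misread `*qp' inside a [description]), and
the `*8bit' keyword is assumed from mhbuild(1) rather than shown above.

```python
# Encoding keywords; "*qp" and "*b64" appear in the examples above,
# "*8bit" is an assumption taken from the mhbuild(1) grammar.
ENCODINGS = {"*8bit", "*qp", "*b64"}


def pick_encoding(directive):
    """Return the one requested encoding, or None if there is none.

    Raises ValueError when more than one encoding is requested,
    instead of quietly keeping the first.
    """
    requested = [tok for tok in directive.split() if tok in ENCODINGS]
    if len(requested) > 1:
        raise ValueError(
            "multiple encodings requested: " + " ".join(requested))
    return requested[0] if requested else None


print(pick_encoding("#<text/plain *qp"))
try:
    pick_encoding("#<text/plain *qp *b64")
except ValueError as err:
    print(err)
```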

Nor that multiple comments are allowed, unlike the grammar, though only
the one positioned according to the grammar makes it through.

    $ printf '%s\n' '' '#<text/plain (foo) <id> (bar) [desc] (xyzzy) {disp}' a£d |
    > uip/mhbuild - |
    > egrep 'foo|bar|xyzzy'
    Content-Type: text/plain; charset="UTF-8" (foo)
    $
    $ printf '%s\n' '' '#<text/plain <id> (bar) [desc] (xyzzy) {disp}' a£d |
    > uip/mhbuild - |
    > egrep 'foo|bar|xyzzy'
    $

> > Report the problem to the user and either stop, or skip to the
> > "next" item, e.g. email to retrieve with POP3.  If those reports
> > turn out to be common, e.g. a behemoth like Gmail does it wrong,
> > then add code to violate the RFC.  If only oddball users need it,
> > then put it behind an option for their ~/.mh_profile.


> Well, my beef there is actual users complain when they get those
> warnings.


Good, else we'd never know they were triggered.  They are our guinea
pigs.  Hopefully, it won't be our fault too often, and when we can point
out it's a violation by something else then that may pacify them,
especially if we think it's likely to recur often enough to work around.  But
that's different from being slack in the first place, letting much pass
silently, allowing errors to compound, and discarding bits we don't
quite recognise in the hope that nobody wanted them.
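
The report-then-stop-or-skip policy could look roughly like this.  It is
a Python sketch only: process_all(), its stop_on_error flag, and the
(name, item) pairs are all hypothetical and stand in for whatever is
being handled, e.g. messages fetched one at a time.

```python
import sys


def process_all(items, handle, stop_on_error=False):
    """Handle each (name, item) pair and report every failure.

    A failure is never dropped silently: it goes to stderr, and
    then we either stop or skip to the next item.  Returns the
    names of the items that failed.
    """
    failed = []
    for name, item in items:
        try:
            handle(item)
        except ValueError as err:
            print("%s: %s" % (name, err), file=sys.stderr)
            failed.append(name)
            if stop_on_error:
                break
    return failed
```

Whether to stop or skip is then the caller's choice, and a profile
option could set the flag for the users who need leniency.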

> My problem is that it's not clear how the calling code can "know" if
> it wants \r\n or just \n.  Consider the case that got us here: user
> wanted to run mhfixmsg(1) in their MTA in a step where \r\n appeared.
> How is mhfixmsg supposed to know for that one case it needs to know
> about \r\n?


If mhfixmsg(1) wanted to attempt to cope with that unusual case then it
could attempt to lex multiple times?  I don't think a program that's
trying to fix data that violates a grammar whilst using a parser for the
grammar is a good example.  :-)
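
For what lexing multiple times might mean, here is a small Python
sketch under stated assumptions: a strict lexer is tried with LF and
then with CRLF, and the first line ending that lexes cleanly wins.
lex_lines() and lex_either() are hypothetical names, and nothing here
reflects how mhfixmsg(1) or m_getfld() work.

```python
def lex_lines(data, eol):
    """Split bytes on `eol`, rejecting a stray CR or LF inside a line."""
    lines = data.split(eol)
    if lines and lines[-1] == b"":
        lines.pop()                 # drop the piece after the final eol
    for n, line in enumerate(lines, 1):
        if b"\r" in line or b"\n" in line:
            raise ValueError("line %d: unexpected line-ending byte" % n)
    return lines


def lex_either(data):
    """Try LF first, then CRLF; return (eol, lines) for the first fit."""
    for eol in (b"\n", b"\r\n"):
        try:
            return eol, lex_lines(data, eol)
        except ValueError:
            continue
    raise ValueError("neither LF nor CRLF line endings lex cleanly")


print(lex_either(b"Subject: x\r\n\r\nbody\r\n"))
```

Input that mixes the two endings fails both attempts and is reported
rather than guessed at.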

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
