nmh-workers

Re: [Nmh-workers] mhfixmsg on a pathological mail

2017-09-01 12:38:06
The first mention mhbuild(1) makes of the input format is

   An mhbuild “composition file” is just a file containing plain
   text that is interspersed with various mhbuild directives.

In the next paragraph it says "Basically, the body contains one or more
contents", and that's the first suggestion there might be something
other than the body.  The grammar at the end starts

   The following is the formal syntax of a mhbuild “composition
   file”.

       body ::= 1*(content | EOL)

suggesting it's nothing but a body as that's the start symbol.

Ha!  Fair enough.  I can't defend the man page organization (it's
all cobbled together from the old "mhn" program), but down below under
"Invoking mhbuild" it does say:

       Typically, mhbuild is invoked by the whatnow program.  This command
       will expect the body of the draft to be formatted as an mhbuild
       composition file.

I would agree this is easy to miss, and it is confusing at the top of
the man page where you could definitely read it as saying the whole file
is an mhbuild composition file, rather than just the body.  Maybe you'd
be willing to add some man page changes for 1.7?

Which suggests it did read the `#<...' line as a directive rather than
just some line of text to skip, or an arbitrary header to skip before the
blank line separating the headers and body.

Hm.  You know ... I can't actually get that to happen if #< is the FIRST
line.  The second line, yes.  Looking at m_getfld() it seems that if we
get something that isn't a header, we simply punt over it (and silently
eat it) and go on the assumption everything after that point is the
body.  So that seems right.
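
To make that description concrete, here is a rough Python model of the behaviour as described in this thread.  It is only a sketch: split_draft() and HEADER_RE are hypothetical names, and this is not a port of m_getfld(), which is C and handles far more cases.

```python
import re

# Hypothetical model of the behaviour described above; not nmh code.
# A header field name is printable ASCII other than ':', followed by ':'.
HEADER_RE = re.compile(r"^[!-9;-~]+:")

def split_draft(text):
    """Return (headers, body) for a draft given as one string."""
    lines = text.split("\n")
    headers = []
    for i, line in enumerate(lines):
        if line == "":
            # Blank line: the normal header/body separator.
            return headers, "\n".join(lines[i + 1:])
        if HEADER_RE.match(line):
            headers.append(line)
        elif line[0] in " \t" and headers:
            # Continuation of the previous header field.
            headers[-1] += "\n" + line
        else:
            # Not a header: drop this line silently and treat
            # everything after it as the body.
            return headers, "\n".join(lines[i + 1:])
    return headers, ""
```

Under this model, a draft that starts with `#<foo/bar` loses that line and the rest becomes plain body text, while a draft that starts with a blank line keeps `#<foo/bar` as the first line of the body.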

This pipeline behaves as you suggest.  A directive as the first line
produces text/plain regardless, but it must have known what the `#<' was
because it doesn't appear as part of the text/plain's content.

Again, I think that's because m_getfld() is silently eating that line.

If I add a blank line before `#<' then we're in business; I get
foo/bar.  No CTE.  Is that OK because the default is 8bit?

Ummm .... no.  It's because it has no idea what foo/bar is and is
probably falling through some switch statement somewhere.  I guess we
should treat that as application/octet-stream.  The default CTE when one
is not supplied is 7bit.  Seriously, Ralph, what are you trying to
accomplish here, other than delay the release of 1.7? :-)

First ... when we get invalid input, how should we react?  It's a fair
question.

Yes.  Report the problem to the user and either stop, or skip to the
"next" item, e.g. email to retrieve with POP3.  If those reports turn
out to be common, e.g. a behemoth like Gmail does it wrong, then add
code to violate the RFC.  If only oddball users need it, then put it
behind an option for their ~/.mh_profile.

Well, my beef there is actual users complain when they get those warnings.

We need to cope with errors anyway, e.g. an I/O problem on the fsync(2)
means POP3's DELE shouldn't be issued.
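
The ordering being argued for can be sketched in Python as follows.  The names store_then_delete() and the `delete` callback are assumptions made up for illustration; `delete` stands in for whatever actually issues DELE, and none of this is nmh code.

```python
import os
import sys

def store_then_delete(data, path, delete):
    """Write data to path, fsync it, and only then call delete().

    `delete` is a hypothetical callback standing in for the code that
    issues POP3 DELE.  Returns True if the message was stored and deleted.
    """
    try:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # may fail with an I/O error
    except OSError as err:
        # Report the problem and leave the message on the server.
        print(f"cannot store message in {path}: {err}", file=sys.stderr)
        return False
    delete()
    return True
```

The point is only the ordering: the delete step is unreachable unless the write and the fsync both succeeded.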

Sure, but .... that seems to happen rarely in practice.  Getting email
that doesn't conform to RFCs happens a lot more :-/

If it were to allow /\r?\n/ then I think it should insist on
consistency for all the lines based on the first.  But really, the
lexer should be told which one of the two is valid at the start.

Another switch to add to all programs?  Ugh.

No, I mean that the code calling the lexer knows at the start what line
ending is acceptable and should tell the lexer, e.g. mbox lexing wants
/\n/ and any /\r/ seen is part of the line, not the line's terminator.
Nothing required from the user.
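
A minimal Python sketch of that idea, assuming a hypothetical split_lines() that the caller parameterises with the one terminator it accepts.  Rejecting a bare LF in CRLF mode is a design choice of this sketch, not something nmh does today.

```python
def split_lines(data, terminator):
    # Split bytes into lines using exactly one terminator, chosen by
    # the caller: either LF or CRLF.  Hypothetical helper, not nmh code.
    if terminator not in (b"\n", b"\r\n"):
        raise ValueError("terminator must be LF or CRLF")
    lines = data.split(terminator)
    if lines and lines[-1] == b"":
        lines.pop()  # the data ended with a terminator
    if terminator == b"\r\n":
        # In CRLF mode a bare LF is treated as invalid input.
        for n, line in enumerate(lines, 1):
            if b"\n" in line:
                raise ValueError(f"line {n}: bare LF where CRLF was required")
    return lines
```

An mbox reader would call split_lines(data, b"\n"), so any CR stays in the line's content, while code handling data taken straight off the wire would pass b"\r\n".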

My problem is that it's not clear how the calling code can "know" if
it wants \r\n or just \n.  Consider the case that got us here: user
wanted to run mhfixmsg(1) in their MTA in a step where \r\n appeared.
How is mhfixmsg supposed to know for that one case it needs to know
about \r\n?

--Ken

_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
