
Re: [Nmh-workers] Weird behavior with non-ascii code in headers

2013-06-26 21:06:54
Well, my thought is to present errors to the user for manual
intervention.  After all, if a person is smart enough to use nmh,
they're smart enough to figure out how to fix a header line, right?  :D

I know you were just kidding, but I am just trying to figure out if
that makes sense.  For one thing, what does that do for people who deal
with some of the nmh front-ends?

Off the cuff, I'd guess that there would be three cases, 1)
envelope-related headers such as From, To, Return-Path, Subject,
Message-ID and Date, 2) content-related headers like Content-Type and
Content-Transfer-Encoding, and 3) delivery/other information headers
like Received and X-*.  (I'm sure there are more appropriate terms for
these categories, but I can't think of them now.)  We'd need to validate
that From, To and Return-Path are addresses and that Subject and
Message-ID are appropriate strings.  I think that Date would probably be
the hardest envelope-related header to automatically correct.
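
To make the three-way split above concrete, here is a rough sketch in
Python (not nmh code).  The category names and the helpers classify()
and looks_like_address() are made up for illustration, and the address
check is deliberately loose: it only asks whether something shaped like
local@domain can be pulled out of the value.

```python
from email.utils import parseaddr

# Hypothetical grouping, taken straight from the three cases above.
ENVELOPE = {"from", "to", "return-path", "subject", "message-id", "date"}
CONTENT = {"content-type", "content-transfer-encoding"}


def classify(name: str) -> str:
    """Return 'envelope', 'content', or 'other' for a header field name."""
    key = name.strip().lower()
    if key in ENVELOPE:
        return "envelope"
    if key in CONTENT:
        return "content"
    return "other"  # Received, X-*, and anything else


def looks_like_address(value: str) -> bool:
    """Loose check: does the value contain something like local@domain?"""
    _, addr = parseaddr(value)
    local, sep, domain = addr.partition("@")
    return bool(sep and local and domain)
```

Something like classify("X-Spam-Status") lands in "other", and a From
value with no "@" anywhere fails looks_like_address().  A real validator
would need to follow the RFC 5322 address grammar, which this does not
attempt.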

So, it occurs to me there are already some smarts in the format engine
regarding these things.  E.g., an invalid Date: header generally gets
handled ok.  And in fact if there's an address that can't be parsed,
that's handled fine as well.  The problem here is really more of a bug in
the I18N handling _at the output stage_, and is really kinda obscure;
the issue in my mind is: how do we deal with that?  This also comes up
with other headers, like Subject; I expect the same thing would happen
there.  Part of me thinks that if we run into a non-ASCII character in
a header in the format engine we should simply replace it with a "?".
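
For what it's worth, the "replace it with a ?" idea is tiny.  This is a
Python sketch of the idea only, not the format engine; mask_non_ascii()
is a made-up name, and it assumes the header value arrives as raw bytes
with no charset information.

```python
def mask_non_ascii(raw: bytes) -> bytes:
    """Replace every byte outside the 7-bit ASCII range with b'?'."""
    return bytes(b if b < 0x80 else ord("?") for b in raw)
```

Since it works byte by byte, a multi-byte UTF-8 character turns into
several question marks, while a single Latin-1 byte turns into one.
Whether that is acceptable output is part of the question.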

Validating content-related headers would be interesting.  If we validate
the header lines themselves, shouldn't we make sure that they represent
the actual content?  (Cue MIME discussion.)  As for the other headers,
wouldn't we just have to ensure that they're appropriately encoded
strings?  Maybe prepend "X-Malformed-Header:" to a line we couldn't
automatically fix?  And of course, we should validate continuations for
all lines.  I'll look into it when I get a chance, but I find it
fascinating that scan couldn't figure out the proper date or subject
when it ran into invalid continuations.
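
Here is roughly what I mean by checking continuations and tagging what
can't be fixed, sketched in Python rather than against the nmh source.
The name mark_malformed() is hypothetical, the input is assumed to be
the header lines only (no trailing newlines, no body), and the field
name pattern is the printable-ASCII-except-colon rule from RFC 5322.

```python
import re

# A field name is one or more printable ASCII characters other than
# colon, immediately followed by a colon.
FIELD_START = re.compile(r"^[!-9;-~]+:")


def mark_malformed(lines):
    """Prefix lines that are neither a field nor a valid continuation."""
    out = []
    for line in lines:
        is_continuation = line[:1] in (" ", "\t")
        if is_continuation and out:
            # Continues whatever field came before it.
            out.append(line)
        elif not is_continuation and FIELD_START.match(line):
            out.append(line)
        else:
            # Orphaned continuation or a line with no field name.
            out.append("X-Malformed-Header: " + line.strip())
    return out
```

Note that this only tags lines; it makes no attempt to guess which field
a stray line was supposed to belong to, which is the hard part.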

Sigh.  You can read the discussion in the archives about auto-fixing of
malformed messages, but it seemed to me the rough consensus was that
it should NOT happen automatically.  Also, if you can figure out how to
do that on a reliable basis, you're smarter than I.

I'm looking at inc.c and I'm not seeing the code you mention; can you point
me to it?

Part of a commented-out function called cpymsg near line 980.  Not much
there, actually...  (Ah, wait, I was using the 1.5 release tarball
source.  I just cloned the git repository and that part is gone.)

Ok, I went and looked at that ... that was NOT about fixing a "From:"
header, it was about fixing a "From " header ... in other words, the
mailbox message separator.
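
The distinction is just colon versus space after the word.  A small
Python illustration (from_kind() is a made-up helper, not anything in
inc.c), assuming the mbox convention that the separator is spelled
exactly "From " at the start of a line while header field names are
case-insensitive:

```python
def from_kind(line: str) -> str:
    """Tell an mbox 'From ' separator apart from a 'From:' header."""
    if line.startswith("From "):
        return "separator"
    if line[:5].lower() == "from:":
        return "header"
    return "other"
```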

I have not yet seen a message/global MIME type in the wild.  When we start
seeing them I think we should care.  Are people seeing these messages?

I'm not actually seeing anything like this yet, but I'm always
interested in trying to take the future into account when planning
changes...

Well, I think we're still working on bringing nmh into the early 2000s :-)

--Ken

_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
