Re: Comments on Malformed Message BCP draft

I'm strongly opposed to MTAs "fixing" malformed messages (other than submission 
servers fixing a small number of known problems caused by broken mail clients).
If an MTA does anything at all when it thinks that a message is malformed, it 
should be to bounce it _exactly as it received it originally_.

MTAs trying to fix malformed messages, at best, mask problems further upstream 
that should be fixed. At worst, they exacerbate existing problems and make 
such problems harder to diagnose.

Keith

On Apr 14, 2011, at 3:07 PM, Murray S. Kucherawy wrote:

This is some work starting up in the APPS area.  Please comment on the 
apps-discuss list if you’re interested in participating.

From: apps-discuss-bounces(_at_)ietf(_dot_)org 
[mailto:apps-discuss-bounces(_at_)ietf(_dot_)org] On Behalf Of Simon Tyler
Sent: Thursday, April 14, 2011 2:59 AM
To: apps-discuss(_at_)ietf(_dot_)org
Subject: [apps-discuss] Comments on Malformed Message BCP draft

Hi,

Having read the Malformed Message BCP draft, I am interested in getting some 
discussion going on its content and in finding the best way forward.

For those who missed it, the draft is at:

https://www1.tools.ietf.org/html/draft-kucherawy-mta-malformed-00

I have a few comments on it.

One thing that concerns me is the proposal that processing should stop when 
certain malformations are identified.

For example it is proposed that should a badly wrapped header field be found 
(quite how we define this is left open, I presume a line that does not start 
with a valid header field name followed by a colon) then the processing agent 
should treat this as the end of the header. My feeling is that this opens up 
a greater potential hole than the one it closes, and one that can be exploited.

An example of the type of issue this could cause is that, should such a 
malformation occur before the MIME header fields in the header, the rest of 
the header and the message body would be treated as plain text. This could 
cause content analysis systems to fail, as they would not interpret the MIME 
content in the way that was presumably intended.

Given that these recommendations are unlikely to be followed by all clients 
and servers, I feel that this suggestion will allow content through an agent 
without suitable processing. My preference on this particular malformation 
would be to continue processing the header fields; this is based on the 
assumption that what follows the malformed header field is more likely to be 
further header fields and not body content. What we do with the malformed 
header field I am less certain about. We could just ignore it or we could 
treat it as part of the previous header field - both will be right as often 
as they are wrong. I would welcome some additional thoughts on this.
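
To make the options concrete, here is a minimal Python sketch of a header 
parser with three policies for a malformed line: stop (treat it as the end of 
the header, as I read the draft), skip (ignore it and continue) and append 
(fold it into the previous field). The names parse_header, FIELD_START and 
on_malformed are hypothetical, and "malformed" is assumed to mean a line that 
neither starts with whitespace nor begins with a field name followed by a 
colon; the draft leaves that definition open.

```python
import re

# Assumed definition of a well-formed field start: one or more printable
# ASCII characters other than colon, followed by a colon.
FIELD_START = re.compile(r'^[\x21-\x39\x3b-\x7e]+:')


def parse_header(lines, on_malformed="skip"):
    """Parse header lines up to the first empty line.

    on_malformed is one of:
      "stop"   - treat the malformed line as the end of the header
      "skip"   - ignore the malformed line and keep parsing
      "append" - treat the malformed line as part of the previous field

    Returns (fields, index of the first line not consumed as header).
    """
    fields = []
    for i, line in enumerate(lines):
        if line == "":
            # Normal header/body separator.
            return fields, i + 1
        if line[0] in " \t" and fields:
            # Ordinary folded continuation of the previous field.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        elif FIELD_START.match(line):
            name, _, value = line.partition(":")
            fields.append((name, value.strip()))
        elif on_malformed == "stop":
            # Everything from here on would be handled as body text.
            return fields, i
        elif on_malformed == "append" and fields:
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        # Otherwise ("skip", or "append" with no previous field): drop it.
    return fields, len(lines)


message = [
    "Subject: quarterly report",
    "attached for your review",      # badly wrapped: no leading whitespace
    "MIME-Version: 1.0",
    "Content-Type: text/html",
    "",
    "<p>hi</p>",
]

for policy in ("stop", "skip", "append"):
    fields, body_start = parse_header(message, policy)
    print(policy, [name for name, _ in fields], body_start)
```

With the "stop" policy the Content-Type field never reaches the header list, 
so a content scanner built on it would see the HTML part as plain text; the 
other two policies keep the MIME fields visible.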

I have similar feelings about some of the other suggestions including the 
lack of a MIME-Version header. We cannot ignore intended meaning just because 
a non-compliant application made a small error in the header fields. Everyone 
will be best served if we subject such messages to more analysis, not less.

On the whole I think a set of guidelines in this area is good but it will be 
hard to reach consensus without agreement on some basic underlying 
principles.  I would suggest that one such principle is to try to do what the 
intended recipient would most likely prefer, which is generally to fix and 
deliver non-spam messages.

I would also propose some additions to the draft. At Mimecast we see a lot of 
messages generated by all sorts of systems and amongst these we see a lot of 
different kinds of message malformations.

I'll suggest more as I think of them but for starters here are a few:

1. Excessively long lines in both headers and body. I commonly see lines that 
are several hundred kilobytes in length. These are often valid messages in the 
sense that the content is desired by the receiver and in all respects other 
than line length are well formed. Obviously a limit has to be enforced and I 
would like to find a consensus on what sort of limit is reasonable. Initially 
I felt 8K was a good limit - it is after all 8 times the limit in RFC 5321. 
But it appears that this is too small a limit in real situations. When the 
limit is exceeded, what is the best option: a rejection or a forced line 
wrap? I am open to both. 

2. Invalid characters in headers. I often see headers with un-encoded 8bit 
characters. These are often displayed correctly to the recipient only because 
the client happens, by chance, to pick the correct character set.

3. 8bit characters in MIME sections with a content-transfer-encoding of 7bit.
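
As a rough illustration, the three malformations above can be detected on the 
raw message octets along the following lines. This is a sketch only: the 
function names are hypothetical, MAX_LINE is the 8K figure mentioned in item 1 
and not an agreed value, the message is treated as a single part, and the 
forced wrap is a naive split that takes no account of header folding rules or 
of the content being wrapped.

```python
import re

MAX_LINE = 8192  # assumed limit in octets, excluding CRLF


def split_message(raw):
    """Split raw message octets into (header, body) at the first empty line."""
    header, _, body = raw.partition(b"\r\n\r\n")
    return header, body


def overlong_lines(raw, limit=MAX_LINE):
    """Return the 1-based numbers of lines longer than `limit` octets."""
    return [n for n, line in enumerate(raw.split(b"\r\n"), 1)
            if len(line) > limit]


def force_wrap(line, limit=MAX_LINE):
    """Naively cut one overlong line into chunks of at most `limit` octets."""
    return [line[i:i + limit] for i in range(0, len(line), limit)]


def has_8bit(data):
    """True if any octet has the high bit set."""
    return any(octet > 0x7F for octet in data)


def declared_7bit_but_8bit(header, body):
    """True if the body contains 8bit octets while the declared
    Content-Transfer-Encoding is 7bit (the default when the field is absent)."""
    match = re.search(rb"(?im)^content-transfer-encoding:[ \t]*([^\s;]+)",
                      header)
    encoding = match.group(1).lower() if match else b"7bit"
    return encoding == b"7bit" and has_8bit(body)


raw = (b"Subject: caf\xe9\r\n"
       b"Content-Transfer-Encoding: 7bit\r\n"
       b"\r\n"
       b"na\xefve\r\n" + b"x" * 9000 + b"\r\n")

header, body = split_message(raw)
print("overlong lines:", overlong_lines(raw))
print("8bit in header:", has_8bit(header))
print("8bit in 7bit part:", declared_7bit_but_8bit(header, body))
```

Detection is the easy part; whether the agent then rejects, wraps or passes 
the message through is the policy question that needs consensus.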

If you have read this far then I think you will agree with me that Murray has 
made a good start on a much needed set of guidelines. Now let's see if we can 
support him to expand on the work he has done and reach a consensus on the 
best approaches.

Simon