ietf-smtp

Re: ambiguities in RFC 2821 regarding end of mail data

2003-06-20 15:48:44

Keith Moore wrote:

The second ambiguity relates to the first, and it concerns the question: When I receive a period alone in the first line of mail data
should I consider that the end of the data?

you mean, you receive the sequence

44 41 54 41 0D 0A 2E 0D 0A
?

Yes, I suppose that would be the sequence which I am asking about. But the way we do our parsing of data, I do not see it all together that way, since the first 0D 0A is consumed by a routine which reads command lines (which are recognized as lines because they end with 0D 0A). When DATA is recognized in a command line, the InputStream is handed to another parsing routine, specialized for reading data, which starts reading where the command-line parser left off, after the first 0D 0A.
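To make the hand-off concrete, here is a minimal sketch in Python. It is not our actual code; the name read_command_line is invented for illustration, and io.BytesIO stands in for the connection's input stream. It only shows which bytes are left for the data-reading routine once the command-line reader has consumed the CRLF that ends DATA.

```python
import io

CRLF = b"\r\n"

def read_command_line(stream):
    """Read up to and including the next CRLF; return the line without the CRLF.

    Hypothetical helper: it stands in for the command-line parser described above.
    """
    buf = bytearray()
    while not buf.endswith(CRLF):
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended before CRLF")
        buf += byte
    return bytes(buf[:-2])

# The nine octets quoted above: 44 41 54 41 0D 0A 2E 0D 0A
stream = io.BytesIO(bytes.fromhex("444154410D0A2E0D0A"))
command = read_command_line(stream)  # consumes "DATA" and the first CRLF
remaining = stream.read()            # everything a data-reading routine would see
```

With this split, the data-reading routine never sees the first 0D 0A at all; its input begins with 2E 0D 0A.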

no it's not the end of mail data, since the CR LF after DATA
cannot be part of the end-of-data marker.  OTOH it's also an SMTP
protocol error,

This exchange has stimulated me to do the homework which I probably should have done before I started. I have just seen this text in RFC 2821, section 4.1.1.4, second paragraph:

"The mail data is terminated by a line containing only a period, that is, the character sequence "<CRLF>.<CRLF>" (see section 4.5.2). This is the end of mail data indication. Note that the first <CRLF> of this terminating sequence is also the <CRLF> that ends the final line of the data (message text) or, if there was no data, ends the DATA command itself."

I note two things from this. First, "if there was no data, ends the DATA command itself", allows the possibility that there may be no data. And it seems to answer my second question: a period alone in the first line of data ends the data.
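Read that way, a data-reading routine can work purely line by line. The following is a rough Python sketch under that reading, not a complete receiver: the names read_crlf_line and read_mail_data are made up for illustration, and the transparency procedure of section 4.5.2 is deliberately left out.

```python
import io

def read_crlf_line(stream):
    """Return the next line including its terminating CRLF (hypothetical helper)."""
    buf = b""
    while not buf.endswith(b"\r\n"):
        chunk = stream.readline()  # reads through the next LF, or b"" at end of stream
        if not chunk:
            raise EOFError("stream ended before CRLF")
        buf += chunk
    return buf

def read_mail_data(stream):
    """Collect data lines until a line containing only a period.

    Assumes the CRLF that ended the DATA command was already consumed.
    Dot-unstuffing (section 4.5.2) is not handled in this sketch.
    """
    lines = []
    while True:
        line = read_crlf_line(stream)
        if line == b".\r\n":
            return lines  # an empty list means there was no data
        lines.append(line)

empty = read_mail_data(io.BytesIO(b".\r\n"))
one_line = read_mail_data(io.BytesIO(b"hello\r\n.\r\n"))
```

Here a period alone on the first line simply yields an empty list of data lines.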

Second, turning back now to the first question which I asked, notice that the first sentence of that paragraph from RFC 2821 contradicts itself. The character sequence "<CRLF>.<CRLF>" is not "a line containing only a period". It is a CRLF -- put there to show that we have really reached the end of the prior line -- followed by a line containing only a period. So this sentence gives us two possible "end of mail data indication"s. One is "<CRLF>.<CRLF>" and the other is ".<CRLF>" alone in a line. They are not the same. But I think the latter is more consistent with other material in this RFC.
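The difference between the two readings shows up only when the period is on the first line of data. A small Python sketch of both tests follows; the function names are invented for illustration, and each takes the bytes as seen by a routine that starts reading after the CRLF of the DATA command.

```python
def ends_by_sequence(data_bytes):
    """Reading 1: look for the five-octet sequence <CRLF>.<CRLF>."""
    return b"\r\n.\r\n" in data_bytes

def ends_by_line(data_bytes):
    """Reading 2: look for a CRLF-terminated line containing only a period."""
    # The last element of the split is whatever follows the final CRLF,
    # so it is not a complete line and is dropped.
    complete_lines = data_bytes.split(b"\r\n")[:-1]
    return any(line == b"." for line in complete_lines)

first_line_period = b".\r\n"          # period alone on the first data line
after_text = b"hello\r\n.\r\n"        # period after one line of text

results = {
    "sequence, first line": ends_by_sequence(first_line_period),
    "line, first line": ends_by_line(first_line_period),
    "sequence, after text": ends_by_sequence(after_text),
    "line, after text": ends_by_line(after_text),
    # Putting the CRLF of the DATA command back makes the sequence test agree.
    "sequence, with DATA CRLF": ends_by_sequence(b"\r\n" + first_line_period),
}
```

The two tests disagree only for the first-line case, and they agree again once the CRLF that ended the DATA command is counted as part of the sequence, which is what the last sentence of the quoted paragraph seems to intend.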

Rich