ietf-smtp

ambiguities in RFC 2821 regarding end of mail data

2003-06-20 13:50:27

I am writing code to parse incoming SMTP bytes, and I have encountered two ways in which RFC 2821 seems ambiguous regarding handling of the end of mail data indicator. I hope that this list is the appropriate place for me to make this report.

The first ambiguity concerns this question: when I receive the sequence CRLF.CRLF, should the data that I capture include the first CRLF or not?

I believe that the first CRLF should be part of the data, according to
my reading of RFC 2821, sections 2.3.7, 3.3, and 4.1.1.4.  That CRLF
is part of the last line of data since, as I read 2.3.7, you don't
have a "line" of data unless it is terminated by CRLF.

But the RFC leaves room for confusion.  This phrase from section 3.3
can be read either way: "...the SMTP server ... considers all
succeeding lines up to but not including the end of mail data
indicator to be the message text."  What is "the end of mail data
indicator"?  Section 4.1.1.4 calls the whole CRLF.CRLF the "end of
mail data indication".  If the indicator is the whole sequence, the
first CRLF is excluded from the message text; if it is only the
final .CRLF, the first CRLF is included.

I noticed this ambiguity because the MTA with which I am familiar, james.apache.org, uses a parsing routine which presently does not return the first CRLF as part of the mail data. In conformity with one of the ways that the RFC can be understood, it checks for the whole end of mail data indicator (CRLF.CRLF) and signals EOF without returning any part of that indicator, including the first CRLF.
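
To make the two readings concrete, here is a minimal sketch in Python. It is not the James code; the function names are hypothetical, and it assumes the whole DATA payload is already in one byte string and ignores dot-stuffing (section 4.5.2).

```python
TERMINATOR = b"\r\n.\r\n"  # the full CRLF.CRLF sequence


def split_keeping_crlf(stream: bytes) -> bytes:
    """Reading A: the first CRLF ends the last data line, so it is data."""
    end = stream.find(TERMINATOR)
    if end < 0:
        raise ValueError("end of mail data indicator not found")
    # Keep the two bytes of the first CRLF; drop only the final ".CRLF".
    return stream[:end + 2]


def split_dropping_crlf(stream: bytes) -> bytes:
    """Reading B: the whole CRLF.CRLF is the indicator, so none of it is data."""
    end = stream.find(TERMINATOR)
    if end < 0:
        raise ValueError("end of mail data indicator not found")
    # Drop all five bytes of the indicator, including the first CRLF.
    return stream[:end]
```

Under reading B the captured message loses the CRLF that terminates its last line, which is the behavior I describe above.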

The second ambiguity relates to the first, and it concerns this question: when I receive a period alone on the first line of mail data, should I consider that the end of the data?

I believe that I should treat the sequence .CRLF as the end of the data if those are the first three bytes received by the data parsing routine. But again the RFC seems unclear to me. The parsing routine presently used by the MTA with which I am familiar does not treat a period alone on the first line as the end of data; it waits for the sequence CRLF.CRLF, in which the period can fall only on the second line or later.
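
The difference between the two behaviors can be sketched as follows. This is again a hypothetical Python illustration, not the James routine; the function name and the flag are my own, and the input is assumed to be the bytes that follow the DATA command line.

```python
def find_end_of_data(stream: bytes, dot_on_first_line_ends: bool) -> int:
    """Return the index just past the end of mail data indicator, or -1."""
    # Interpretation 1: a period alone on the very first line ends the data.
    if dot_on_first_line_ends and stream.startswith(b".\r\n"):
        return 3
    # Interpretation 2: only a full CRLF.CRLF inside the data ends it.
    end = stream.find(b"\r\n.\r\n")
    return -1 if end < 0 else end + 5
```

With the flag off, an input of just .CRLF returns -1, meaning the parser keeps waiting for more bytes; with the flag on, the same input ends the data immediately.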

Rich Hammer
Hillsborough, N.C.