On Mon, 08 Jan 2007 17:27:49 -0000, Eric Allman <eric+dkim(_at_)sendmail(_dot_)org>
wrote:
--On January 8, 2007 11:51:13 AM +0000 Charles Lindsey
<chl(_at_)clerew(_dot_)man(_dot_)ac(_dot_)uk> wrote:
Indeed there is no ambiguity in that, but that is because you have
only quoted half the text. The full text is:
The "simple" body canonicalization algorithm ignores all empty lines
at the end of the message body. An empty line is a line of zero
length after removal of the line terminator. If there is no trailing
CRLF on the message, a CRLF is added. It makes no other changes to
the message body. In more formal terms, the "simple" body
canonicalization algorithm converts "0*CRLF" at the end of the body
to a single "CRLF".
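For illustration only, the quoted wording can be read as the following Python sketch. The function name simple_body_canon and the choice to hold the body as a byte string are assumptions made here, not anything taken from the draft.

```python
def simple_body_canon(body: bytes) -> bytes:
    """Sketch of "simple" body canonicalization as worded in the quoted text."""
    # Ignore all empty lines at the end of the body: each trailing CRLF
    # beyond the one ending the last non-empty line is an empty line.
    while body.endswith(b"\r\n"):
        body = body[:-2]
    # Ensure exactly one trailing CRLF. This is the "0*CRLF" -> "CRLF"
    # rule, so a zero-length body also becomes a single CRLF.
    return body + b"\r\n"
```

Under this reading, a body of "abc" followed by three CRLFs canonicalizes to "abc" followed by one CRLF, and a zero-length body canonicalizes to a lone CRLF.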
Observe carefully that the text sometimes tells you to consider
the "message", and sometimes the "message body" (which I take to
mean exactly the <body>, if any, defined by RFC 2822).
OK, I'll change the single instance of "message" to "message body".
Well, that certainly fixes the case I gave, but then the question arises as
to whether it has fixed it in the "right" direction. I still cannot see
why one would want to canonicalize a genuinely empty <body> into "CRLF",
nor can some others on this list. I gather that some implementations do not
do so either, and neither did the original DK, and many on the list
thought they were standardizing the DK behaviour.
Moreover, there remains another case that is ambiguous. Consider:
Field: foobar<CRLF>
.<CRLF>
That is a valid RFC 2822 message with NO <body> at all (which is NOT the
same thing as an empty <body>). Let us apply your revised wording.
The "simple" body canonicalization algorithm ignores all empty lines
at the end of the message body.
There is no body, so no action is needed.
An empty line is a line of zero
length after removal of the line terminator.
Not needed.
If there is no trailing
CRLF on the message BODY, a CRLF is added.
That is your modified wording. There is no <body> so no action occurs.
(But it would have been the same with your original wording.)
It makes no other changes to
the message body.
There is no <body>, so there is nothing to not make any changes to. But we
are still finished.
So what do we pass to the canonicalization? It doesn't say, but the only
reasonable interpretation would be to pass <empty>. So it appears that an
absent body canonicalizes differently to an empty body.
Now let us compare that with the final sentence:
In more formal terms, the "simple" body
canonicalization algorithm converts "0*CRLF" at the end of the body
to a single "CRLF".
Since there is no <body>, there is nothing to do, so it indeed agrees with
the first 4 sentences in this case.
But we still have the bizarre situation that an absent body is treated
differently from an empty body. Can you please confirm that this was your
intention?
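To make the asymmetry concrete, here is a small Python sketch of the literal reading walked through above. It assumes an absent <body> is modelled as None and an empty <body> as a zero-length byte string; the function name and that modelling are hypothetical, and the None branch reflects my interpretation rather than anything the draft states.

```python
from typing import Optional


def canon_literal_reading(body: Optional[bytes]) -> bytes:
    """Literal reading of the "simple" wording, distinguishing absent from empty."""
    if body is None:
        # No <body> at all: none of the sentences has anything to act on,
        # so the only reasonable output is <empty>.
        return b""
    # A <body> exists (possibly of zero length): drop trailing empty
    # lines, then make sure it ends in exactly one CRLF.
    while body.endswith(b"\r\n"):
        body = body[:-2]
    return body + b"\r\n"


absent = canon_literal_reading(None)  # message with no <body>
empty = canon_literal_reading(b"")    # message with an empty <body>
```

Here absent is zero-length while empty is a single CRLF, so the two cases would hash differently.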
Consider the example, in DATA format:
Field: foobar<CRLF>.
<CRLF>
<CRLF>
.<CRLF>
As Hector points out, this is not a valid 2822 message to begin with, ...
Sure, there was a typo, as various people spotted.
--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131
Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clerew(_dot_)man(_dot_)ac(_dot_)uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5