On Fri, 15 Dec 2006 19:07:47 -0000, Eric Allman <eric+dkim(_at_)sendmail(_dot_)org>
wrote:
--On December 15, 2006 10:17:16 AM -0800 Mark Delany
<markd+dkim(_at_)yahoo-inc(_dot_)com> wrote:
FWIW, this problem was similarly discovered in DK. The early text
read:
-01
o All trailing empty lines are ignored. An empty line is a line of
zero length after removal of the local line terminator. The
empty line that separates the header from the body is a to be
included in this process.
and the later text read:
-06
o All trailing empty lines are ignored. An empty line is a line of
zero length after removal of the local line terminator.
If the body consists entirely of empty lines, then the
header/body line is similarly ignored.
The "simple" in DKIM, as I understand it, is merely re-codifying the
same function.
Nope!
I *think* there are three cases here:
(1) Some body with l=0. I think we're agreed that this should result in
an empty input into the hash algorithm.
(2) Header, CRLF, no body; that is, the input to a DATA would be:
Header: foobar<CRLF>
<CRLF>
<CRLF>.<CRLF>
The last line of the example is the separator, so the initial CRLF
doesn't count.
(3) Header, no CRLF, no body:
Header: foobar<CRLF>
<CRLF>.<CRLF>
It seems to me that all three of these should match after
canonicalization. I'm not sure the -07 draft is explicit enough to
ensure this.
Nope!
Let us be VERY careful here. Start from RFC 2822:
message = (fields / obs-fields)
[CRLF body]
body = *(*998text CRLF) *998text
So a <body> can be EMPTY, and its last line might not have a CRLF.
The CRLF following the header fields is NOT part of the <body>.
If the <body> is absent (indistinguishable from an empty <body>) that CRLF
after the header fields can be omitted.
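To make the grammar concrete, here is a minimal Python sketch of the split it implies. It assumes a message with at least one header field and CRLF line endings throughout; the name split_message is hypothetical. It returns None when the optional "[CRLF body]" part is missing and b"" when the separator CRLF is present but nothing follows it.

```python
CRLF = b"\r\n"

def split_message(raw: bytes):
    """Split raw message bytes into (fields, body) following
    message = fields [CRLF body]. body is None when absent."""
    # The first CRLF CRLF marks the end of the header fields: the first
    # CRLF terminates the last field, the second is the separator.
    idx = raw.find(CRLF + CRLF)
    if idx == -1:
        # No separator at all: everything is header fields, <body> absent.
        return raw, None
    fields = raw[: idx + len(CRLF)]
    # The separator CRLF itself is NOT part of the <body>.
    body = raw[idx + 2 * len(CRLF):]
    return fields, body
```

Note that the final line of the returned body need not end in CRLF, exactly as the "*998text" at the end of the <body> production allows.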
Now look at RFC 2821:
The mail data is terminated by a line containing only a period, that
is, the character sequence "<CRLF>.<CRLF>" (see section 4.5.2). This
is the end of mail data indication. Note that the first <CRLF> of
this terminating sequence is also the <CRLF> that ends the final line
of the data (message text) or, if there was no data, ends the DATA
command itself.
So, even if you have a body with no CRLF, as permitted by RFC 2822, you
can't actually transmit it by RFC 2821 (well, you might transmit it by
UUCP, and you might encapsulate it in a message/rfc822 within some
multipart).
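The point can be seen in a small Python sketch that recovers the message text from what is sent after the DATA command. It assumes the stream holds exactly one message and ends at the terminator; the name message_from_data is hypothetical, and the dot-unstuffing follows RFC 2821 section 4.5.2.

```python
CRLF = b"\r\n"
TERMINATOR = b"." + CRLF

def message_from_data(stream: bytes) -> bytes:
    """Return the message text carried by an SMTP DATA stream."""
    if stream == TERMINATOR:
        # No data at all: the CRLF before "." was the one ending the
        # DATA command itself.
        return b""
    if not stream.endswith(CRLF + TERMINATOR):
        raise ValueError("DATA stream does not end with <CRLF>.<CRLF>")
    # Drop only ".CRLF"; the preceding CRLF ends the final line of data.
    text = stream[: -len(TERMINATOR)]
    # Undo dot-stuffing: a leading "." on any line is removed.
    lines = text.split(CRLF)
    unstuffed = [line[1:] if line.startswith(b".") else line for line in lines]
    return CRLF.join(unstuffed)
```

Whatever goes in, the result is either empty or ends in CRLF, so a <body> whose last line lacks a CRLF cannot come out of this function.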
So we have the following cases. The dotted lines enclose what is, by RFC
2822 definition, the <body>, and is therefore what will get hashed or
canonicalized by dkim-base, as currently worded. The ".CRLF" is the RFC
2821 DATA terminator.
1) ordinary message with <body> of 1 non-empty line:
Last-Header: foobarCRLF
CRLF
---------------------
barbazCRLF
---------------------
.CRLF
2) <body> consisting of 2 empty lines
Last-Header: foobarCRLF
CRLF
---------------------
CRLF
CRLF
---------------------
.CRLF
3) <body> consisting of 1 empty line
Last-Header: foobarCRLF
CRLF
---------------------
CRLF
---------------------
.CRLF
4) <body> containing no lines
Last-Header: foobarCRLF
CRLF
---------------------
---------------------
.CRLF
5) message with absent <body>
Last-Header: foobarCRLF
.CRLF
Now apply simple canonicalization to all those cases, using:
"In more formal terms, the "simple" body canonicalization algorithm
converts "0*CRLF" at the end of the body to a single "CRLF"."
Making the entirely reasonable assumption that "body" means exactly what
RFC 2822 defines it to mean, then here is what gets hashed in all of those
cases:
1) ordinary message with <body> of 1 non-empty line:
---------------------
barbazCRLF
---------------------
2) <body> consisting of 2 empty lines
---------------------
CRLF
---------------------
3) <body> consisting of 1 empty line
---------------------
CRLF
---------------------
4) <body> containing no lines
---------------------
CRLF
---------------------
5) message with absent <body>
---------------------
---------------------
That is undoubtedly what the "formal terms" in dkim-base SAY.
It is NOT what the "informal" words in dkim-base say.
It is NOT what version -01 of DK says.
It is NOT what version -06 of DK says.
It is NOT what Eric's three examples claim.
It is entirely possible that it is NOT what dkim-base was INTENDED to say.
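The five results listed above can be reproduced with a short Python sketch of that literal reading. It is an illustration only, not text from dkim-base: the function name is hypothetical, and treating an absent <body> as None (nothing to canonicalize, hence empty hash input) is the assumption made in this message.

```python
CRLF = b"\r\n"

def simple_body_canon_literal(body):
    """Literal reading of: convert "0*CRLF" at the end of the body
    to a single "CRLF". body is the RFC 2822 <body>, or None if absent."""
    if body is None:
        # Assumption: with no <body> there is nothing to canonicalize.
        return b""
    # Strip every trailing CRLF, then put exactly one back.
    while body.endswith(CRLF):
        body = body[: -len(CRLF)]
    return body + CRLF

# The <body> of each of the five cases, as delimited by the dotted lines.
cases = {
    1: b"barbaz" + CRLF,  # one non-empty line
    2: CRLF + CRLF,       # two empty lines
    3: CRLF,              # one empty line
    4: b"",               # no lines
    5: None,              # absent <body>
}

for number, body in cases.items():
    print(number, simple_body_canon_literal(body))
```

Under this reading, cases 2, 3 and 4 all hash a single CRLF while case 5 hashes nothing, which is where it parts company with the other descriptions.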
--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131
Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clerew(_dot_)man(_dot_)ac(_dot_)uk Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9 Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5