ietf-dkim

Re: [Fwd: Re: [ietf-dkim] canonicalized null body and dkim]

2006-12-18 05:11:20
On Fri, 15 Dec 2006 19:07:47 -0000, Eric Allman <eric+dkim(_at_)sendmail(_dot_)org> wrote:

--On December 15, 2006 10:17:16 AM -0800 Mark Delany <markd+dkim(_at_)yahoo-inc(_dot_)com> wrote:

FWIW, this problem was similarly discovered in DK. The early text
read:

-01
    o All trailing empty lines are ignored. An empty line is a line of
       zero length after removal of the local line terminator. The
       empty line that separates the header from the body is a to be
       included in this process.


and the later text read:

-06
     o All trailing empty lines are ignored. An empty line is a line of
       zero length after removal of the local line terminator.

       If the body consists entirely of empty lines, then the
       header/body line is similarly ignored.

The "simple" in DKIM, as I understand it, is merely re-codifying the
same function.

Nope!


I *think* there are three cases here:

(1) Some body with l=0. I think we're agreed that this should result in an empty input into the hash algorithm.

(2) Header, CRLF, no body; that is, the input to a DATA would be:

        Header: foobar<CRLF>
        <CRLF>
        <CRLF>.<CRLF>

The last line of the example is the separator, so the initial CRLF doesn't count.

(3) Header, no CRLF, no body:

        Header: foobar<CRLF>
        <CRLF>.<CRLF>

It seems to me that all three of these should match after canonicalization. I'm not sure the -07 draft is explicit enough to ensure this.

Nope!

Let us be VERY careful here. Start from RFC 2822:

message         =       (fields / obs-fields)
                        [CRLF body]
body            =       *(*998text CRLF) *998text

So a <body> can be EMPTY, and its last line might not have a CRLF.

The CRLF following the header fields is NOT part of the <body>.

If the <body> is absent (indistinguishable from an empty <body>) that CRLF after the header fields can be omitted.
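A minimal sketch of that grammar, to make the empty/absent distinction concrete. The function name split_message and the use of None for an absent <body> are my own illustrative choices, not anything from RFC 2822 or dkim-base, and the sketch assumes well-formed input with at least one header field and CRLF line endings:

```python
from typing import Optional, Tuple

CRLF = b"\r\n"

def split_message(raw: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Split a message into (fields, body) following 'fields [CRLF body]'.

    Returns body=None when the separating CRLF is missing (absent body),
    and body=b"" when the separator is present but nothing follows it.
    """
    # The header block ends with the CRLF of its last field; a second
    # CRLF straight after it is the separator, which is not part of <body>.
    idx = raw.find(CRLF + CRLF)
    if idx == -1:
        return raw, None                      # no separator: <body> absent
    fields = raw[: idx + len(CRLF)]           # keep the last field's own CRLF
    body = raw[idx + 2 * len(CRLF):]          # everything after the separator
    return fields, body
```

With this, b"Header: foobar\r\n\r\n" yields an empty body (b""), while b"Header: foobar\r\n" yields None, an absent one.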

Now look at RFC 2821:

   The mail data is terminated by a line containing only a period, that
   is, the character sequence "<CRLF>.<CRLF>" (see section 4.5.2).  This
   is the end of mail data indication.  Note that the first <CRLF> of
   this terminating sequence is also the <CRLF> that ends the final line
   of the data (message text) or, if there was no data, ends the DATA
   command itself.

So, even if you have a body whose last line has no CRLF, as permitted by RFC 2822, you can't actually transmit it by RFC 2821 (well, you might transmit it by UUCP, or you might encapsulate it in a message/rfc822 within some multipart).
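To illustrate why, here is a rough sketch of how a client might frame message text for DATA. The helper name frame_for_data is hypothetical; the dot-stuffing follows the transparency rule in RFC 2821 section 4.5.2, and the sketch assumes the input already uses CRLF line endings:

```python
CRLF = b"\r\n"

def frame_for_data(message: bytes) -> bytes:
    """Return the bytes sent after the DATA command, terminator included."""
    # Transparency: a line starting with "." gets an extra "." prepended.
    stuffed = CRLF.join(
        b"." + line if line.startswith(b".") else line
        for line in message.split(CRLF)
    )
    # The first CRLF of "<CRLF>.<CRLF>" must end the final line of data,
    # so an unterminated final line necessarily gains a CRLF on the wire.
    if stuffed and not stuffed.endswith(CRLF):
        stuffed += CRLF
    # With no data at all, only ".<CRLF>" is sent; the CRLF that ended
    # the DATA command itself serves as the first CRLF of the terminator.
    return stuffed + b"." + CRLF
```

A message ending in "barbaz" and one ending in "barbazCRLF" produce identical bytes here, so the receiver cannot tell them apart.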

So we have the following cases. The dotted lines enclose what is, by RFC 2822 definition, the <body>, and is therefore what will get hashed or canonicalized by dkim-base, as currently worded. The ".CRLF" is the RFC 2821 DATA terminator.

1) ordinary message with <body> of 1 non-empty line:

Last-Header: foobarCRLF
CRLF
---------------------
barbazCRLF
---------------------
.CRLF

2) <body> consisting of 2 empty lines

Last-Header: foobarCRLF
CRLF
---------------------
CRLF
CRLF
---------------------
.CRLF

3) <body> consisting of 1 empty line

Last-Header: foobarCRLF
CRLF
---------------------
CRLF
---------------------
.CRLF

4) <body> containing no lines

Last-Header: foobarCRLF
CRLF
---------------------
---------------------
.CRLF

5) message with absent <body>

Last-Header: foobarCRLF
.CRLF

Now apply simple canonicalization to all those cases, using:

   "In more formal terms, the "simple" body canonicalization algorithm
    converts "0*CRLF" at the end of the body to a single "CRLF"."

Making the entirely reasonable assumption that "body" means exactly what RFC 2822 defines it to mean, here is what gets hashed in each of those cases:

1) ordinary message with <body> of 1 non-empty line:
---------------------
barbazCRLF
---------------------

2) <body> consisting of 2 empty lines
---------------------
CRLF
---------------------

3) <body> consisting of 1 empty line
---------------------
CRLF
---------------------

4) <body> containing no lines
---------------------
CRLF
---------------------

5) message with absent <body>
---------------------
---------------------

That is undoubtedly what the "formal terms" in dkim-base SAY.

It is NOT what the "informal" words in dkim-base say.
It is NOT what version -01 of DK says.
It is NOT what version -06 of DK says.
It is NOT what Eric's three examples claim.
It is entirely possible that it is NOT what dkim-base was INTENDED to say.
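
The literal reading applied to the five cases above can be sketched as follows. This is only an illustration of that reading, not the dkim-base algorithm as intended: the name simple_body_literal and the use of None to model an absent <body> (case 5) are my assumptions.

```python
from typing import Optional

CRLF = b"\r\n"

def simple_body_literal(body: Optional[bytes]) -> bytes:
    """Literal reading of: convert "0*CRLF" at the end of the body to one CRLF."""
    if body is None:
        # Case 5: there is no <body> at all, so there is nothing to
        # canonicalize and the hash input is empty.
        return b""
    # Remove every trailing CRLF (zero or more of them) ...
    while body.endswith(CRLF):
        body = body[: -len(CRLF)]
    # ... and put exactly one back, even if the body had no lines.
    return body + CRLF

cases = {
    1: b"barbaz\r\n",   # one non-empty line
    2: b"\r\n\r\n",     # two empty lines
    3: b"\r\n",         # one empty line
    4: b"",             # <body> containing no lines
    5: None,            # absent <body>
}
for number, body in cases.items():
    print(number, simple_body_literal(body))
```

Cases 2, 3 and 4 all come out as a single CRLF, while case 5 comes out empty, which is exactly the mismatch described above.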

--
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131     Web: http://www.cs.man.ac.uk/~chl
Email: chl(_at_)clerew(_dot_)man(_dot_)ac(_dot_)uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5