ietf-dkim

Re: [ietf-dkim] Possible problem with "simple" body canonicalization -- trailing CRLFs

2006-07-13 10:53:45
I'm sort of thinking out loud here.

One of the driving forces behind c=simple was to basically sign what was
there in its pristine form. In its simplest form, c=simple should have
just canonicalized the body exactly as it existed in the message.

Now there are a number of things that can occur to messages as they pass
through MSAs, MTAs, gateways, MDAs, mailing lists, etc. We essentially
prioritized those types of mungings and decided that we needed to
protect c=simple from at least one case: it is fairly common for a
message to pick up an extra blank line. (Perhaps by MTAs incorrectly
interpreting CRLF.CRLF? :-) ) So c=simple was changed to handle this
case by removing extra blank lines.

I can see us going two ways with this:
    *   Declare that if there are no extra lines, or not even a line
        ending, the input is treated as complete as it stands.
    *   Declare that if there is no ending CRLF, we should guarantee
        that there is one.

Note that the first is consistent with the original intent of simple
(and matches what my code currently does). However, the second is
consistent with handling extra lines potentially being added to the
message somewhere in transit. Since we can't tell *if* an extra
line is *going* to be added to a message in transit, and we need
to make certain that the canonicalization on the signing side and the
verifying side see the same input no matter what happened in between, we
should go with the second method.
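
To make the difference between the two options concrete, here is a minimal Python sketch. The function names are hypothetical and the code is not taken from any DKIM implementation or from the draft text; it only assumes the body is available as bytes with CRLF line endings.

```python
CRLF = b"\r\n"

def simple_keep_as_is(body: bytes) -> bytes:
    """Option 1 (hypothetical name): collapse extra trailing blank
    lines, but never add a CRLF that was not already there."""
    while body.endswith(CRLF + CRLF):
        body = body[:-2]
    return body

def simple_ensure_crlf(body: bytes) -> bytes:
    """Option 2 (hypothetical name): strip every trailing CRLF, then
    append exactly one, so the result always ends in a single CRLF."""
    while body.endswith(CRLF):
        body = body[:-2]
    return body + CRLF
```

If the signer sees b"thanks" and a later hop supplies the missing line ending so that the verifier sees b"thanks\r\n", option 1 canonicalizes the two inputs differently, while option 2 produces b"thanks\r\n" on both sides.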

Eric, you convinced me. (And I'll change my code accordingly. :-) )

Now, as far as binary messages are concerned, we already have statements
saying that a message SHOULD be rendered in a 7bit-compatible MIME
format, which means one of 7bit, quoted-printable or base64. All of
these require textual line-based messages.

        Tony Hansen
        tony(_at_)att(_dot_)com

Eric Allman wrote:
Great!  Thanks for the pointer.  So it's not a problem with DATA.

Do we want to worry about the BDAT case?  My suggestion would work for
that (but break your implementation).

It also occurs to me that there is an assumption in DKIM (and probably a
lot of other specs) that the mail body is text.  It doesn't have to be
if you use BODY=BINARYMIME [RFC3030].  Is this worth considering?  It
seems like it would either require redefining simple body
canonicalization if you have a BINARYMIME message, or defining a new
canonicalization ("binary"?) that allows absolutely no changes.  Of
course, a new canonicalization can be added later.

eric



--On July 13, 2006 10:12:29 AM -0400 Tony Hansen <tony(_at_)att(_dot_)com> wrote:

This is not a problem when using DATA. Check RFC 2821 section 4.1.1.4;
the ending CRLF.CRLF was clarified as being the trailing CRLF of the
last line of the message followed by the terminator sequence.

   Note that the first <CRLF> of this terminating sequence is also
   the <CRLF> that ends the final line of the data (message text)
   or, if there was no data, ends the DATA command itself.
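
A minimal Python sketch of that reading follows. The function name is hypothetical, and the sketch assumes the whole DATA payload has already been read into memory, that it contains at least one line of message text, and that dot-unstuffing is handled elsewhere.

```python
TERMINATOR = b"\r\n.\r\n"

def message_text_from_data(stream: bytes) -> bytes:
    """Return the message text from a DATA payload (hypothetical helper).

    The first CRLF of the terminating sequence is kept, because it also
    ends the final line of the message text; only ".CRLF" is dropped.
    """
    end = stream.find(TERMINATOR)
    if end == -1:
        raise ValueError("no end-of-data terminator found")
    return stream[: end + 2]
```

Under this reading, a payload of b"thanks\r\n.\r\n" yields the message text b"thanks\r\n", so the body does end in a CRLF.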

You are correct that the problem exists when using BDAT.

My implementation uses the last CRLF if it's there. If there is
no last CRLF, it does *not* add one.

    Tony Hansen
    tony(_at_)att(_dot_)com

Eric Allman wrote:
In doing an end-to-end read on -base this morning I came up with a
possible difficulty with simple body canonicalization.

The description says that simple reduces CRLF 0*CRLF at the end of
a message to a single CRLF.  But what if the message has no
trailing CRLF at all?

Remember that in the sequence CRLF . CRLF, both CRLFs are part of
the terminator, not part of the message.  Thus, a message reading:

       thanks <CRLF>
       . <CRLF>

in fact has no trailing CRLF on the message.  (This could also
happen with BDAT, but that's not widely implemented.)

An approach to this might be to say that simple reduces 0*CRLF to a
single CRLF (which is quite possibly what the current
implementations actually do).



_______________________________________________
NOTE WELL: This list operates according to 
http://mipassoc.org/dkim/ietf-list-rules.html