ietf-mailsig

Re: nowsp considered harmful

2005-07-20 15:53:11

I'm being swayed to the idea that nowsp might open up some non-trivial security holes. Some background:

We chose nowsp as a canonicalization algorithm that would survive arbitrary wrapping, in the belief that such re-wrapping is not particularly uncommon. In particular, many MUAs today seem to ignore the 1000 octet line-limit, and having their messages fail DKIM verification seems like a bad thing. Ignoring arbitrary white space allows arbitrary wrapping, assuming that no escape characters are added, but appears to create other problems.

I'm especially concerned that this may not be limited to some sort of interaction with l=. I have no obvious example, but the fact that internal MIME headers convey semantics that are based on line boundaries concerns me greatly.

Ignoring Goldilocks, Pollyanna, or even Dr. Pangloss, I think we can do better here, even if it means condemning some innocent messages to spam purgatory. In particular, I think I've been convinced that line wraps do, in many important circumstances, have semantic meaning. I'm still not convinced about spaces vs. tabs; I think we need more data to see if such changes actually occur in the wild in any significant way.

In short, I'm inclined to support Earl Hood's "minwsp" algorithm, perhaps modulo the tab vs. spaces problem.

eric




--On July 20, 2005 10:05:47 AM +0200 Thomas Roessler <tlr(_at_)w3(_dot_)org> wrote:


nowsp, when combined with the length parameter, can enable attackers
to completely replace the e-mail content displayed by mail user
agents, without invalidating the DKIM signature.


Consider a message that has a multipart/mixed as its top level body
part, with boundary parameter "foobar", and which is fully signed,
with a length parameter (l=) that ensures that the signature
includes the final delimiter line.

The body of this message could, for example, look like this:

        |--foobar
        |Content-Type: text/plain
        |
        |nowsp, when combined with the length parameter, ...
        |
        |--foobar--

Anything before the initial "--foobar" is ignored, as is anything
after the final "--foobar--".

nowsp means that we can freely move line breaks or space characters
without invalidating the signature.  Let's do that.

        |--foo
        |barContent-Type: text/plain
        |nowsp, when combined with the length parameter, ...
        |
        |--foo
        |bar--

This message is, for all intents and purposes, the same as the
original one -- at least, as far as DKIM is concerned.  In terms of
MIME, we have just removed any occurrence of multipart delimiters.
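
The equivalence of the two bodies can be checked mechanically. The following is a minimal sketch, assuming that nowsp simply drops every space, tab, CR and LF before hashing; the function name `nowsp` and the use of SHA-1 are illustrative assumptions, not taken from the draft.

```python
import hashlib

def nowsp(body: bytes) -> bytes:
    """Assumed reading of nowsp: remove all SP, HTAB, CR and LF octets."""
    return bytes(b for b in body if b not in b" \t\r\n")

# The body as signed, with intact multipart delimiter lines.
original = (
    b"--foobar\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"nowsp, when combined with the length parameter, ...\r\n"
    b"\r\n"
    b"--foobar--\r\n"
)

# The same octets with line breaks moved, so no delimiter line survives.
rewrapped = (
    b"--foo\r\n"
    b"barContent-Type: text/plain\r\n"
    b"nowsp, when combined with the length parameter, ...\r\n"
    b"\r\n"
    b"--foo\r\n"
    b"bar--\r\n"
)

same_digest = (
    hashlib.sha1(nowsp(original)).digest()
    == hashlib.sha1(nowsp(rewrapped)).digest()
)
print("bodies differ:", original != rewrapped)
print("digests match under nowsp:", same_digest)
```

Under this assumption both bodies reduce to the same octet string, so any signature computed over one also verifies over the other.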


Use of the length parameter means that the attacker can freely add
content.

Let's do that now:

        |--foo
        |barContent-Type: text/plain
        |nowsp, when combined with the length parameter, ...
        |
        |--foo
        |bar--
-->  +
        +--foobar
        +Content-Type: text/plain
        +
        +nowsp, when combined with the length parameter, solves
        +all e-mail security problems.
        +
        +--foobar--

The material before the "-->" is called the preamble of the
multipart, and MUST be ignored according to RFC 2046.  It's the
signed material.  Everything after the "-->" is what's really
displayed. It's what the attacker added.
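
The complete attack can be sketched end to end. This is an illustration only: it assumes that l= counts octets of the canonicalized body, that nowsp drops all whitespace, and it uses Python's standard `email` parser as a stand-in for a mail user agent. The helper names `nowsp` and `body_hash` are hypothetical.

```python
import hashlib
from email import message_from_bytes

def nowsp(body: bytes) -> bytes:
    """Assumed reading of nowsp: remove all SP, HTAB, CR and LF octets."""
    return bytes(b for b in body if b not in b" \t\r\n")

def body_hash(body: bytes, length: int) -> str:
    """Hash only the first `length` canonicalized octets (assumed l= semantics)."""
    return hashlib.sha1(nowsp(body)[:length]).hexdigest()

signed_body = (
    b"--foobar\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"nowsp, when combined with the length parameter, ...\r\n"
    b"\r\n"
    b"--foobar--\r\n"
)
# l= covers the whole signed body, including the final delimiter line.
l_param = len(nowsp(signed_body))

# Attacker: move line breaks so the signed delimiters vanish, then append.
forged_body = (
    b"--foo\r\n"
    b"barContent-Type: text/plain\r\n"
    b"nowsp, when combined with the length parameter, ...\r\n"
    b"\r\n"
    b"--foo\r\n"
    b"bar--\r\n"
    b"\r\n"
    b"--foobar\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"nowsp, when combined with the length parameter, solves\r\n"
    b"all e-mail security problems.\r\n"
    b"\r\n"
    b"--foobar--\r\n"
)

hash_unchanged = body_hash(signed_body, l_param) == body_hash(forged_body, l_param)

# What a MIME-aware reader sees: the signed text is now preamble.
headers = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="foobar"\r\n'
    b"\r\n"
)
msg = message_from_bytes(headers + forged_body)
parts = msg.get_payload()
shown = parts[0].get_payload()

print("body hash unchanged:", hash_unchanged)
print("displayed part:", shown)
```

The signed octets end up in the multipart preamble, which the parser sets aside, while the only body part it returns is the appended one.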


This problem should be solved on the level of the canonicalization
mechanism, not on the level of maybe displaying appended material
differently.

Note that, even without the length parameter, messages can be
corrupted heavily (to the extent of not being displayed at all) by
moving around whitespace, and still be displayed as signed.  That
could open up the way for what may be an attack against DKIM
deployment: What happens when people start receiving tons of
meaningless e-mails, all of which are DKIM-signed by their bank?
Also, will they continue to take the mechanism seriously when they
get tons of messages, signed by their bank, with all the content
colored "insecure"?


The lesson here is that it's not enough to think of the semantics of
an e-mail body in terms of a human being staring at garbled
text/plain: Rather, whatever canonicalization method is going to be
used by DKIM ought to protect semantics of full MIME parts,
including multipart delimiter lines and individual bodies' headers.

Everything else (including the "we don't care about bodies'
semantics, this is header signing" school of thought) is a recipe
for more issues like the one described above.
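
For contrast, a canonicalization that keeps line boundaries tells the two bodies apart. The sketch below uses a hypothetical rule chosen purely for illustration (collapse runs of spaces and tabs to one space, strip trailing whitespace on each line, keep every line break); it is not a definition of any algorithm proposed on this list.

```python
import re

def line_preserving(body: bytes) -> bytes:
    """Hypothetical rule: normalize whitespace within lines, keep line breaks."""
    lines = body.split(b"\r\n")
    cleaned = [re.sub(rb"[ \t]+", b" ", line).rstrip(b" ") for line in lines]
    return b"\r\n".join(cleaned)

original = (
    b"--foobar\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"nowsp, when combined with the length parameter, ...\r\n"
    b"\r\n"
    b"--foobar--\r\n"
)
rewrapped = (
    b"--foo\r\n"
    b"barContent-Type: text/plain\r\n"
    b"nowsp, when combined with the length parameter, ...\r\n"
    b"\r\n"
    b"--foo\r\n"
    b"bar--\r\n"
)

# Moving line breaks now changes the canonical form, so a signature would fail.
detects_rewrap = line_preserving(original) != line_preserving(rewrapped)

# Space/tab substitution and trailing whitespace are still tolerated.
tolerates_tabs = line_preserving(b"x\ty \r\n") == line_preserving(b"x y\r\n")

print("re-wrapping detected:", detects_rewrap)
print("tab vs. space tolerated:", tolerates_tabs)
```

The trade-off is the one discussed in this thread: such a rule protects delimiter lines and part headers, at the cost of failing messages that were legitimately re-wrapped in transit.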

Regards,


