I'm coming around to the idea that nowsp might open up some
non-trivial security holes. Some background:
We chose nowsp as a canonicalization algorithm that would survive
arbitrary rewrapping, in the belief that such rewrapping is not
particularly uncommon. In particular, many MUAs today seem to ignore
the 1000 octet line-limit, and having their messages fail DKIM
verification seems like a bad thing. Ignoring all white space
tolerates arbitrary wrapping, assuming that no escape characters are
added, but appears to create other problems.
I'm especially concerned that this may not be limited to some sort of
interaction with l=. I have no obvious example, but the fact that
internal MIME headers convey semantics that are based on line
boundaries concerns me greatly.
Ignoring Goldilocks, Pollyanna, or even Dr. Pangloss, I think we can
do better here, even if it means condemning some innocent messages
to spam purgatory. In particular, I think I've been convinced that
line wraps do, in many important circumstances, have semantic
meaning. I'm still not convinced about spaces vs. tabs --- I think
we need more data to see if such changes actually occur in the wild
in any significant way.
In short, I'm inclined to support Earl Hood's "minwsp" algorithm,
perhaps modulo the tab vs. spaces problem.
eric
--On July 20, 2005 10:05:47 AM +0200 Thomas Roessler <tlr(_at_)w3(_dot_)org>
wrote:
nowsp, when combined with the length parameter, can enable attackers
to completely replace the e-mail content displayed by mail user
agents, without invalidating the DKIM signature.
Consider a message that has a multipart/mixed as its top level body
part, with boundary parameter "foobar", and which is fully signed,
with a length parameter (l=) that ensures that the signature
includes the final delimiter line.
The body of this message could, for example, look like this:
|--foobar
|Content-Type: text/plain
|
|nowsp, when combined with the length parameter, ...
|
|--foobar--
Anything before the initial "--foobar" is ignored, as is anything
after the final "--foobar--".
nowsp means that we can freely move line breaks or space characters
without invalidating the signature. Let's do that.
|--foo
|barContent-Type: text/plain
|nowsp, when combined with the length parameter, ...
|
|--foo
|bar--
This message is, for all intents and purposes, the same as the
original one -- at least, as far as DKIM is concerned. In terms of
MIME, we have just removed any occurrence of multipart delimiters.
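
The equivalence can be checked with a short Python sketch. It assumes
that nowsp simply deletes every SP, HTAB, CR and LF from the body
before hashing; the function name nowsp() and the use of SHA-1 are
illustrative choices, not taken from the draft.

```python
import hashlib


def nowsp(body: str) -> bytes:
    # Assumed reading of nowsp: drop every SP, HTAB, CR and LF.
    return "".join(ch for ch in body if ch not in " \t\r\n").encode("ascii")


original = (
    "--foobar\r\n"
    "Content-Type: text/plain\r\n"
    "\r\n"
    "nowsp, when combined with the length parameter, ...\r\n"
    "\r\n"
    "--foobar--\r\n"
)

rewrapped = (
    "--foo\r\n"
    "barContent-Type: text/plain\r\n"
    "nowsp, when combined with the length parameter, ...\r\n"
    "\r\n"
    "--foo\r\n"
    "bar--\r\n"
)

# Both bodies hash identically once all white space is ignored.
same_hash = (
    hashlib.sha1(nowsp(original)).digest()
    == hashlib.sha1(nowsp(rewrapped)).digest()
)

# No line of the rewrapped body is a "--foobar" delimiter line any more.
delimiter_lines = [
    line for line in rewrapped.splitlines() if line.startswith("--foobar")
]

print(same_hash, delimiter_lines)
```

The hash function is irrelevant to the argument: the two bodies are
already byte-identical after canonicalization.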
Use of the length parameter means that the attacker can freely add
content.
Let's do that now:
|--foo
|barContent-Type: text/plain
|nowsp, when combined with the length parameter, ...
|
|--foo
|bar--
--> +
+--foobar
+Content-Type: text/plain
+
+nowsp, when combined with the length parameter, solves
+all e-mail security problems.
+
+--foobar--
The material before the "-->" is the signed material. It forms the
preamble of the multipart, which MUST be ignored according to
RFC 2046. Everything after the "-->" is what the attacker added, and
it is what actually gets displayed.
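
A rough way to see this is to feed the attacked body to a MIME
parser. The sketch below uses the parser in Python's standard email
package and makes two assumptions that are not from the draft: nowsp
deletes all SP, HTAB, CR and LF, and l= counts octets of the
canonicalized body. The variable names are hypothetical.

```python
import email


def nowsp(body: str) -> str:
    # Assumed reading of nowsp: drop every SP, HTAB, CR and LF.
    return "".join(ch for ch in body if ch not in " \t\r\n")


signed_body = (
    "--foobar\n"
    "Content-Type: text/plain\n"
    "\n"
    "nowsp, when combined with the length parameter, ...\n"
    "\n"
    "--foobar--\n"
)

# The signed material with its line breaks moved.
moved = (
    "--foo\n"
    "barContent-Type: text/plain\n"
    "nowsp, when combined with the length parameter, ...\n"
    "\n"
    "--foo\n"
    "bar--\n"
)

# What the attacker appends after the signed material.
appended = (
    "\n"
    "--foobar\n"
    "Content-Type: text/plain\n"
    "\n"
    "nowsp, when combined with the length parameter, solves\n"
    "all e-mail security problems.\n"
    "\n"
    "--foobar--\n"
)

attack_body = moved + appended

# Assumption: l= covers the canonicalized length of the signed body.
l_value = len(nowsp(signed_body))
covered_matches = nowsp(attack_body)[:l_value] == nowsp(signed_body)

headers = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="foobar"\n'
    "\n"
)
msg = email.message_from_string(headers + attack_body)
parts = msg.get_payload()        # list of body parts the parser found
shown = parts[0].get_payload()   # text of the only part a MUA would show

print(covered_matches, len(parts))
print(shown)
```

Under these assumptions the signed octets still match, the signed
material ends up in msg.preamble, and the only body part the parser
returns is the appended one.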
This problem should be solved on the level of the canonicalization
mechanism, not on the level of maybe displaying appended material
differently.
Note that, even without the length parameter, messages can be
corrupted heavily (to the extent of not being displayed at all) by
moving around whitespace, and still be displayed as signed. That
could open up the way for what may be an attack against DKIM
deployment: What happens when people start receiving tons of
meaningless e-mails, all of which are DKIM-signed by their bank?
Also, will they continue to take the mechanism seriously when they
get tons of messages, signed by their bank, with all the content
colored "insecure"?
The lesson here is that it's not enough to think of the semantics of
an e-mail body in terms of a human being staring at garbled
text/plain: Rather, whatever canonicalization method is going to be
used by DKIM ought to protect semantics of full MIME parts,
including multipart delimiter lines and individual bodies' headers.
Everything else (including the "we don't care about bodies'
semantics, this is header signing" school of thought) is a recipe
for more issues like the one described above.
Regards,