--- wayne <wayne(_at_)schlitt(_dot_)net> wrote:
I'm not much into trying to reinvent the wheel, so I'm not going to
suggest any of my own, but I am curious:
What about using the canonicalization methods of DCC and/or Razor?
They have the advantage that they are very well field-tested.
In the DKIM space, canonicalization has to serve two goals: one is
survivability across common in-transit mangling, the other is resistance to
abuse. The more flexibility the canonicalization algorithm allows, the more
possibilities you open up for re-formatting abuse.
Essentially all canonicalizations throw away some data; put another way, they
allow the insertion of some data without affecting the result of the
verification.
The question you have to ask is this: if you allow the insertion of some data
in a way that does not affect verification, can a bad guy take advantage of
that? More importantly, can you assure that a bad guy cannot take advantage of
it?
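As a concrete illustration of tolerated insertion, here's a sketch loosely
modeled on DKIM's "relaxed" body canonicalization (the exact rules in the
spec differ in detail): two bodies that differ only in whitespace canonicalize
to the same bytes, so a signature computed over one verifies against the other.

```python
import hashlib
import re

def relaxed_body(body: str) -> bytes:
    # Loosely modeled on DKIM's "relaxed" body canonicalization:
    # strip trailing whitespace on each line, collapse runs of
    # spaces/tabs to a single space, and drop trailing empty lines.
    lines = [re.sub(r"[ \t]+", " ", line.rstrip(" \t"))
             for line in body.split("\n")]
    while lines and lines[-1] == "":
        lines.pop()
    return ("\r\n".join(lines) + "\r\n").encode("ascii")

original = "Pay  Alice\t$10\n"
mangled = "Pay Alice $10   \n\n\n"  # whitespace inserted in transit

# Both bodies canonicalize identically, so they hash identically.
assert hashlib.sha256(relaxed_body(original)).digest() == \
       hashlib.sha256(relaxed_body(mangled)).digest()
```

That tolerance is exactly what lets a signature survive benign mangling; the
question is what else slips through the same hole.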
Here's an absurd case, but I hope it demonstrates the point. Old fixed-form
FORTRAN treats column position as significant: a character in column 6 marks
a continuation line (and a 'C' in column 1 a comment), so:
     X I = I + 1
with the X in column 6 is a continuation of the previous statement, whereas:
      X I = I + 1
with the X in column 7 begins a new statement.
So, if you have a canonicalization algorithm that ignores spaces, you could
reinject an email that had the X in the continuation column as one with the
X in a different column, thus completely changing the semantics of the
content, yet the signature still verifies.
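To make that concrete, here's a toy sketch (the space-discarding
canonicalization is a hypothetical strawman, not any real DKIM algorithm)
showing that a digest over space-stripped content cannot tell the two
FORTRAN lines apart:

```python
import hashlib

def ignore_spaces(body: str) -> bytes:
    # Hypothetical canonicalization that discards every space --
    # deliberately too permissive, to illustrate the abuse.
    return body.replace(" ", "").encode("ascii")

continuation = "     X I = I + 1\n"    # X in column 6: continuation line
new_statement = "      X I = I + 1\n"  # X in column 7: a new statement

digest_a = hashlib.sha256(ignore_spaces(continuation)).hexdigest()
digest_b = hashlib.sha256(ignore_spaces(new_statement)).hexdigest()

# Two semantically different lines canonicalize identically, so a
# signature computed over one verifies against the other.
assert digest_a == digest_b
```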
Is that a risk today? Maybe not. Is it a risk for new forms of content invented
a year from now? Who knows for sure? I, for one, would not sign my name on the
dotted line saying that any canonicalization that ignores some content is safe
from abuse.
Returning to your question: it may well be that a DCC/Razor canonicalization
is an excellent algorithm for survivability, but I doubt it was designed to be
safe from abuse.