ietf-mailsig

Re: Feedback on DKIM draft (long)

2005-07-15 04:52:03


On Thu, 14 Jul 2005, Earl Hood wrote:

General Comments:

* There is definitely some duplication of effort going on with this draft
 and with things like Meta-Signatures, <http://www.metasignatures.org/>.

META-Signatures is more than just an email signature spec. This should
be understood in two ways: it is more than just an "email signature" (because
it consists of two components, and the signature is just one of them), and it
may encompass more than "email" use (any data based on 822-like text
fields can use it).

 The problem with differing digest/signing specs is implementors
 have to deal with all these variants vs implementing a single
 algorithm that can be applied to multiple applications.

DKIM has taken the approach of one application only. As I understand it, that
is why it is mostly inflexible and would be difficult to add new features to
or reuse for other applications.

* Section 1.1

 You mention a trusted third party is not required.  However, it
 should be allowable, and I do not see anything in this draft that
 supports a trusted-third-party system.  For example, the support
 for X509 certificates of signing keys should be allowed.

Although I'm known as a supporter of the X.509 format, I do not believe
that such a link should be exclusive to S/MIME and X.509.

* Section 3.2:

 Note, quoted-printable tends to have a bad (mostly undeserved)
 reputation, so it may help to avoid using it, especially when its
 use has never been applied to header data (at least to my knowledge).

There is a variation of quoted-printable used primarily with the Subject
header field (mostly for internationalization support), but it's not the
same as what is introduced in DKIM.

As I said before, I believe all of this is unnecessary and overcomplicates
the signature header. If you want to copy some other header field's data,
then just go ahead and copy that field and include it in the email under a
new name that is a variation of the old one (I chose Saved-).

* Section 3.4:

 - IMHO, the "nowsp" algorithm is questionable, especially from
   a cryptographic perspective.  You mention to ignore all SWSP
   in the body.  I think this leaves too much unnecessary data
   variation to still permit a valid signature verification.
   I.e. "Hello World" is the same as "HelloWorld".  Or do
   I misunderstand the algorithm?

   Here is the canonicalization algorithm I recommend (based upon
   existing work -- OpenPGP and S/MIME -- and common misbehaviors
   of mail software):

   The canonicalization process will be different for header
   fields vs message bodies.  First, let's use the following base
   definitions:

        WSP = HTAB / SP
            ; white space character
        HTAB = %x09
            ; horizontal tab
        SP = %x20
            ; space
        LWSP = *(WSP / CRLF WSP)
            ; linear white space (past newline)

   which are extracted from relevant RFCs.

   For header fields, the following should suffice for canonicalization:

     1. Strip all WSP characters at the end of each line of a header field,
         before any unfolding is done.
     2. Unfold any fields that are folded.
     3. Convert multiple WSP characters into a single SP character.
     4. Convert all WSP characters into SP characters.
         (See Note below about step 4.)
     5. Convert field names to lowercase.
     6. Concatenate each field value together, inserting a CRLF sequence
         between each field value (according to RFC-2822, the
         CRLF is not considered part of the field).

   NOTE: Step (4) may be omitted, but I have encountered software that
   refolds and changes some whitespace (e.g. spaces to tabs).  Since
   whitespace modification is definitely abhorrent, you may not want
   to entertain trying to deal with it.

The above is either very close or identical to the algorithm described
as 'simple' in section 3.1 of the Content-Digest-Edigest draft (I believe
it is identical except for step 4 above). 'simple' is the default algorithm
for use with Content-Digest and EDigest, but I also specified 'bare'
(which is what DK calls simple) and 'nofws', and I believe both can be
good for some specific circumstances.
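
The six header steps quoted above can be sketched roughly as follows. This
is only an illustration, not code from either draft. The function names are
made up, and it assumes each raw field arrives as one string with CRLF line
endings and that the lowercased field name is kept together with its value
in the output (step 6 does not say so explicitly).

```python
import re

def canonicalize_header_field(raw_field):
    """Apply steps 1-5 to one raw header field (possibly folded)."""
    lines = raw_field.split("\r\n")
    # Step 1: strip trailing WSP from each physical line, before unfolding.
    lines = [line.rstrip(" \t") for line in lines]
    # Step 2: unfold by joining the lines; the leading WSP of each
    # continuation line is kept, as in RFC 2822 unfolding.
    unfolded = "".join(lines)
    # Steps 3 and 4: collapse every run of SP/HTAB into a single SP.
    unfolded = re.sub(r"[ \t]+", " ", unfolded)
    # Step 5: lowercase the field name only, not the value.
    name, sep, value = unfolded.partition(":")
    return name.lower() + sep + value

def canonicalize_header(raw_fields):
    # Step 6: concatenate the fields with a CRLF between them
    # (assumption: no CRLF after the last field).
    return "\r\n".join(canonicalize_header_field(f) for f in raw_fields)

fields = ["Subject:  Hello \t\r\n\tWorld  \r\n", "FROM: a@example.org\r\n"]
print(repr(canonicalize_header(fields)))
# 'subject: Hello World\r\nfrom: a@example.org'
```

Dropping step 4 would mean replacing the single regular expression with one
that collapses runs but leaves a lone tab as a tab.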

   For message bodies, I recommend the following canonicalization process:

     1. All EOL sequences MUST be converted to CRLF (the canonical EOL
         specified in RFC-2822).
     2. All trailing WSP characters of each line of text MUST be removed.
     3. All trailing LWSP characters at the end of the body MUST be removed.

This is very close to the 'text' body canonicalization algorithm described in
section 3.2 of the Content-Digest-Edigest draft. However, in Content-Digest
the use of this algorithm as the default is limited to text/???? content
media types.
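
A rough sketch of the three body steps quoted above, for illustration only.
The function name is made up. Step 3 is read here as "drop empty lines at
the end of the body, including the final line terminator", which is an
assumption: the LWSP definition given earlier does not strictly cover bare
CRLF sequences.

```python
import re

def canonicalize_body(body):
    # Step 1: treat CRLF, bare CR and bare LF all as line endings.
    text = re.sub(rb"\r\n|\r|\n", b"\n", body)
    # Step 2: strip trailing WSP (SP, HTAB) from every line.
    lines = [line.rstrip(b" \t") for line in text.split(b"\n")]
    # Step 3 (assumption): remove empty lines at the end of the body.
    while lines and lines[-1] == b"":
        lines.pop()
    # Emit CRLF as the canonical EOL between the remaining lines.
    return b"\r\n".join(lines)

print(canonicalize_body(b"Hello World \t\nbye\r\n\r\n \n"))
# b'Hello World\r\nbye'
```

Unlike "nowsp", this keeps whitespace inside a line, so "Hello World" and
"HelloWorld" still produce different canonical forms.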

   If the digest will include the combination of header fields and
   message body (or message body parts), a CRLF must be included
   between each component during digest calculation.

I don't entirely agree. There is already a CRLF at the end of every header
field, so adding an extra CRLF seems unnecessary (although it does exist in
the email message itself, but this is not email).
I may be persuaded to change this position by good arguments.

   From an implementation perspective, such canonicalization processing
   can be done efficiently and be done stream based.  I.e.  As
   the data is read, the canonicalization process can be done before
   fed into the cryptographic digest procedures.  Cryptographic
   libraries (like openssl) support incremental digest computation,
   so canonicalization data is very temporary and can be limited
   to a well-defined buffer size in memory.

Plus, any canonicalization processing is a lot less computationally intensive
than the actual cryptography.
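
To illustrate the streaming point: the sketch below canonicalizes a body
line by line and feeds it straight into an incremental digest. It uses
Python's hashlib in place of openssl, the function name is made up, and it
assumes LF or CRLF line endings and the same reading of the body steps as
above (trailing WSP stripped, trailing empty lines dropped). The only state
kept between lines is a counter of blank lines not yet known to be trailing.

```python
import hashlib
import io

def digest_body_stream(stream, algorithm="sha256"):
    h = hashlib.new(algorithm)
    pending_blank = 0   # blank lines seen, possibly trailing
    first = True
    for raw in stream:  # binary file objects yield one line at a time
        line = raw.rstrip(b" \t\r\n")
        if not line:
            pending_blank += 1
            continue
        if not first:
            h.update(b"\r\n")               # EOL of the previous line
        h.update(b"\r\n" * pending_blank)   # blank lines that were not trailing
        h.update(line)
        pending_blank = 0
        first = False
    return h.hexdigest()

body = io.BytesIO(b"a  \n\nb\t\r\n\r\n")
print(digest_body_stream(body) == hashlib.sha256(b"a\r\n\r\nb").hexdigest())
# True
```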

 - I find it odd that the header field that will contain the
   signature (DKIM-Signature) must also be included in the
   signing/verification process.

This has been discussed briefly on this mailing list before and was first
introduced in the META-Signature specs (which originally did it on a per-tag
basis, where each such tag had to be opted in for inclusion with "+="
instead of "="; the latest spec is closer to DKIM, where it is done on a
per-segment basis with all but the 'sig' segment included by default).

The reason for including data from the signature header itself is to make
it less useful for replay attacks. For example, an attacker could take a
signature, replace some of its key parts (change the expiration, change the
signer name and domain, etc.), and then introduce it as his own.
By including key data tags, the range of replay attacks is reduced.
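
A small sketch of the idea, not the exact DKIM procedure: the tag list of
the signature field is fed into the digest with only the value of the "b"
tag emptied, so changing any other tag changes what is signed. The function
names and all tag values are hypothetical, and the whitespace handling
around "b=" is deliberately naive.

```python
import hashlib

def strip_signature_value(tag_list):
    """Return the tag list with the value of the 'b' tag emptied."""
    out = []
    for tag in tag_list.split(";"):
        name, sep, value = tag.partition("=")
        if name.strip() == "b":
            value = ""
        out.append(name + sep + value)
    return ";".join(out)

def digest_input(body, tag_list):
    # Hash the body followed by the signature field minus the signature data.
    h = hashlib.sha256()
    h.update(body)
    h.update(strip_signature_value(tag_list).encode("ascii"))
    return h.digest()

original = "v=1; i=alice@example.org; t=1121316932; b=AAAA=="
forged = "v=1; i=mallory@example.net; t=1121316932; b=AAAA=="
print(digest_input(b"hi", original) == digest_input(b"hi", forged))
# False
```

With the tags covered by the digest, a signature lifted from one message no
longer verifies once the signer identity or timestamp has been altered.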

What I'm slightly concerned about, however, is that DKIM says that all tags
are to be included (except 'b'), but there may be reasons to introduce
extensions as new tags with data that is not to be included, so I think
an opt-out option for unknown tags should be made available.

   Why isn't the signature data provided in its own separate
   header field to avoid having to extract out the sig data
   first and dealing with ambiguities of whitespace?  For example,
   is the whitespace before and after the "b=" tag also removed,
   or only the whitespace after (or before)?

   I'd recommend two header fields, one for the meta-information
   and one just to contain the signature.  This way, no unique
   processing is required for the meta-information field, it
   can be processed like all other header fields:

     DKIM-Spec: ...
     DKIM-Signature: ...

It would probably have to be in the reverse order, i.e. DKIM-Signature
followed by DKIM-Spec, but this is a valid option. However, I also found
that too many data fields for the same signature can be a mess, and I ended
up rolling META-Auth back into META-Signature.

   I am speculating that the rationale for putting everything in one field
   is to make multiple DKIM-Signature fields possible without
   the problem of knowing which signature applies to which DKIM-Spec
   field.  I think this problem is solvable.

With new fields introduced into email, you never know whether they could be
repositioned in some way by some "smart" (=stupid) software. I ended up
using "t" as a unique identifier for multiple META fields (spec 0.18 and
below) so as to make it possible to reconstruct things if it goes bad.

 - For greatest flexibility, the digest should be separated out, and
   it, along with meta-information is what is signed.  Meta-Signatures
   takes this approach.

Yes, separate content digest data makes the signature system a lot more
flexible and gives the signer better options for making certain that his/her
content (or at least its key parts) survives and is verifiable. This
also makes it possible to reference retrievable content data in the
appropriate way.

* Section 3.5:

 - The ordering restriction of trace header fields (mainly Received)
   is explicitly defined in RFC-2821, not in RFC-2822.

   Note, since DKIM fields are to be handled like trace fields, then
   splitting the signature from the meta-info can be done, with the
   field order in the header defining which field goes with which
   in case of multiple signatures.

There is no procedure in RFC282[1/2] to introduce new trace fields, so
basically any software that does not know about new trace fields will
treat them as regular unknown fields, which is not the same as trace.

 - IMHO, the "v=" tag should be required.  It is always better practice
   to be explicit when possible.  For example, if the "v=" is missing,
   could it be due to an error by the signer?  Data corruption
   during transit?

I also think that having a "v" tag is better. After all, we always have
it as MIME-Version and nobody complains, even though it's mostly
still just "1.0". Almost every application I know includes a version for
its data format, even if that version is just "1.0".

 - Why does "i=" need to be quoted-printable?  Goes back to earlier
   comment about qp.  Require "i=" value to be a quoted-string.

I also wasn't sure about that since its format is basically that of
an email address.

 - For "l=", the term "octet" should be used instead of "byte."
   Octet appears to be the preferred term used in mail-based
   specs.

   Why is the hash part of the length?

I believe the draft specifies it as the length of the content body
only (not the header data), so it is not the same as the
octet length of what goes into the hash data.

   Because of security implications (which you note) you should
   probably drop this tag.  It is not needed.
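
   One commonly cited concern with a body length limit can be shown with a
   short sketch. It assumes "l=" means "only the first l octets of the body
   are hashed"; the function name and sample data are made up.

```python
import hashlib

def body_digest(body, length=None):
    # With a length limit, only the first `length` octets are covered.
    covered = body if length is None else body[:length]
    return hashlib.sha256(covered).hexdigest()

signed = b"Original text\r\n"
limit = len(signed)
tampered = signed + b"Appended text\r\n"

print(body_digest(signed, limit) == body_digest(tampered, limit))  # True
print(body_digest(signed) == body_digest(tampered))                # False
```

   With the limit in place, content appended after the signed part goes
   unnoticed by the digest. Without it, the same change is detected.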

 - For the "t=" tag, why not use ISO date/time format.  For example:

     20050714T045532

   Unix second time format is, well, too unixy.

Unix second format is harder to debug and, as properly pointed out, too
Unix-specific for a standard system. I've adapted a variation of ISO 8601 for
EDigest (and the same will be in META-Signature) after Earl's suggestion.
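
For comparison, here is a small sketch converting between the two
representations. It assumes the compact form shown above is in UTC, which
the example does not state. The function names are made up.

```python
from datetime import datetime, timezone

def compact_to_unix(stamp):
    # Parse YYYYMMDDThhmmss and interpret it as UTC (assumption).
    dt = datetime.strptime(stamp, "%Y%m%dT%H%M%S").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())

def unix_to_compact(seconds):
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y%m%dT%H%M%S")

print(compact_to_unix("20050714T045532"))  # 1121316932
print(unix_to_compact(1121316932))         # 20050714T045532
```

The readable form can be checked by eye in a header dump, while the
seconds count needs a conversion like this one.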

 - "z=" is a mess, and can eventually lead to a DKIM field that is
   longer than 998 octets (RFC-2822 limit).

   You mention that verifiers should not use copied header fields
   for verification.  I do not agree with this.

I do not agree either. Why bother with copying if you don't use it?

   This leads to a discussion about when signing is done.  There is
   an implication that the sender MTA should do this, right before
   transmitting to final destination.  Well, this does limit when
   signing can be done and who can do it.

   It also does not protect from potential address re-write rules.
   I.e. Even if the initial signing MTA does signing after
   rewrite rules, intermediary ones may still do address re-writing.
   One benefit of saved headers is the verifying agent could utilize
   them for signature verification, bypassing any re-write rules
   that may happen by MTAs.

Correct.

   It would be more flexible to have a signing method that is
   not dependent on where the signing occurs (e.g. an MUA could
   do it vs an MTA).

   We can discuss further if you like.  As of now, "z=" should be
   eliminated or the concept of saved header fields should be
   reconsidered.

I'll address that in a separate post, but I'm in full agreement that
"z" is just way too ugly in the way it is in DKIM right now.

--
William Leibzon
Elan Networks
william(_at_)elan(_dot_)net

