ietf-mailsig

Re: Feedback on DKIM draft (long)

2005-07-14 17:20:20


[I initially sent my feedback via private mail, but on good advice
 was asked to also send it to the ietf-mailsig list.]

Eric, this is a nice set of comments. I am in agreement with most of what you
say here, but not all. Specifics below.

Draft-specific Comments:

* Section 1.1

  You mention a trusted third party is not required.  However, it
  should be allowable, and I do not see anything in this draft that
  supports a trusted-third-party system.  For example, the support
  for X509 certificates of signing keys should be allowed.

I'm not sure what form this "support for trusted third parties" would take. My
major concern in this area isn't really a technical/specification issue, but
rather regarding the form an actual service using this specification would
take. In particular, one sort of trusted third party setup I definitely want to
see is one where service providers sign outgoing mail and verify incoming mail.
Is this what you're asking for?

* Section 3.2:

  Is there any reason that tag=value syntax does not follow the parameter=value
  syntax defined in RFC-2045?  I.e. Are implementors required to
  implement another "tag" parsing scheme?

Indeed. We really don't need another of these if we can avoid it, but I'm not
sure we can avoid it...

  I can guess at the reasons for the variation you came up with, but
  it may help to explicitly state why the scheme was adopted versus
  leveraging existing (standard) schemes.

Very true.

  It may also be worth noting why something like RFC-2184 was not
  adopted.

I believe you meant RFC 2231. In any case, speaking as a coauthor of RFC 2231,
I'm dubious about it being appropriate to apply here. I think RFC 2231 goes too
far and is too difficult to implement.

  Note, quoted-printable tends to have a bad (mostly undeserved)
  reputation, so it may help to avoid using it, especially when its
  use has never been applied to header data (at least to my knowledge).

A variant of QP is in fact used in RFC 2047 encoded words, so it has been used
in headers before and has proved to be quite popular in that context. But I do
note that it is a variant, not regular QP. Given that the RFC 2047 variant was
specifically designed for use in headers, a fair question would be why it
wasn't chosen for use here.
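
To make the difference between regular QP and the RFC 2047 variant concrete,
here is a small sketch using Python's standard quopri module. The function
names are mine, chosen for illustration only. Note that a full RFC 2047
encoded word also carries the =?charset?Q?...?= wrapper and has further
restrictions depending on where it appears, none of which this sketch covers.

```python
import quopri

def encode_body_style(data: bytes) -> bytes:
    # Regular quoted-printable: interior spaces and underscores pass through.
    return quopri.encodestring(data)

def encode_header_style(data: bytes) -> bytes:
    # Header ("Q"-like) variant: spaces become "_" and a literal "_"
    # has to be escaped as "=5F" so the two cannot be confused.
    return quopri.encodestring(data, header=True)

def decode_header_style(data: bytes) -> bytes:
    # Reverse of encode_header_style: "_" is read back as a space.
    return quopri.decodestring(data, header=True)
```

The same input therefore yields two different encodings, which is why a
specification has to say which variant it means.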

* Section 3.4:

  - IMHO, the "nowsp" algorithm is questionable, especially from
    a cryptographic perspective.  You mention to ignore all SWSP
    in the body.  I think this allows too much unnecessary data
    variation while still permitting a valid signature verification.
    I.e. "Hello World" is the same as "HelloWorld".  Or do
    I misunderstand the algorithm?

More generally, I think the "simple" approach does too little and the "nowsp"
does too much. As you say, dropping all the internal space from the body leaves
the door open too far.
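
A minimal sketch of the concern, assuming "nowsp" amounts to deleting every
space and tab from the body before hashing (the draft's exact definition of
SWSP may cover more than that, and the function name here is hypothetical):

```python
import re

def nowsp_like(body: str) -> str:
    # Assumption: drop every SP and HTAB, wherever it occurs in the body.
    return re.sub(r"[ \t]+", "", body)

# Two bodies that differ only in internal spacing collapse to the same
# canonical form, so they would hash (and therefore verify) identically.
same_after_canon = nowsp_like("Hello World") == nowsp_like("HelloWorld")
```

Here same_after_canon is True, which is exactly the data variation the
comment above objects to.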

    Here is the canonicalization algorithm I recommend (based upon
    existing work -- OpenPGP and S/MIME -- and common misbehaviors
    of mail software):

Let's be careful here. Some of this is clearly misbehavior that the relevant
standards forbid. But some of it isn't. This is a "hearts and minds" matter
that isn't helped by using terms like "misbehave".

    The canonicalization process will be different for header
    fields vs message bodies.  First, let's use the following base
    definitions:

      WSP = HTAB / SP
          ; white space character
      HTAB = %x09
          ; horizontal tab
      SP = %x20
          ; space
      LWSP = *(WSP / CRLF WSP)
          ; linear white space (past newline)

    which are extracted from relevant RFCs.

    For header fields, the following should suffice for canonicalization:

      1. Strip all WSP characters at the end of each line of a header field,
       before any unfolding is done.
      2. Unfold any fields that are folded.
      3. Convert multiple WSP characters into a single SP character.
      4. Convert all WSP characters into SP characters.
       (See Note below about step 4.)
      5. Convert field names to lowercase.
      6. Concatenate each field value together, inserting a CRLF sequence
       between each field value (according to RFC-2822, the
       CRLF is not considered part of the field).

Very nice. I like this algorithm.
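
For concreteness, the six steps above might be sketched as follows. This is
only one reading of the proposal, not text from the draft: the function names
and the convert_tabs switch (which makes step 4 optional) are my own, and each
input is assumed to be one raw header field with CRLF line endings.

```python
import re

def canonicalize_field(raw: str, convert_tabs: bool = True) -> str:
    # Step 1: strip trailing WSP from each physical line, before unfolding.
    lines = [line.rstrip(" \t") for line in raw.split("\r\n")]
    # Step 2: unfold by removing the CRLFs (the WSP that follows is kept).
    unfolded = "".join(lines)
    # Step 3: collapse each run of two or more WSP into a single SP.
    collapsed = re.sub(r"[ \t]{2,}", " ", unfolded)
    # Step 4 (optional): turn any remaining lone HTAB into SP.
    if convert_tabs:
        collapsed = collapsed.replace("\t", " ")
    # Step 5: lowercase the field name only, not the value.
    name, colon, value = collapsed.partition(":")
    return name.lower() + colon + value

def canonicalize_header(fields: list[str], convert_tabs: bool = True) -> str:
    # Step 6: join the canonical fields with CRLF between them.
    return "\r\n".join(canonicalize_field(f, convert_tabs) for f in fields)
```

With step 4 disabled, a tab introduced by refolding survives into the
canonical form, so a spaces-to-tabs rewrite in transit would break the
signature.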

    NOTE: Step (4) may be omitted, but I have encountered software that
    refolds and changes some whitespace (e.g. spaces to tabs).  Since
    whitespace modification is definitely abhorrent, you may not want
    to entertain trying to deal with it.

Unfortunately, bugs in RFC 2047 support - in cell phones and the like that are
next to impossible to change - make such hackery necessary on occasion.
However, I don't think it is necessarily the case that we should cater to that
here, so I'd be inclined to leave 4. out.

    For message bodies, I recommend the following canonicalization process:

      1. All EOL sequences MUST be converted to CRLF (the canonical EOL
       specified in RFC-2822).
      2. All trailing WSP characters of each line of text MUST be removed.
      3. All trailing LWSP characters at the end of the body MUST be removed.

This is good. I think it goes just far enough.
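
A rough sketch of the three body steps, under my own assumptions: the function
name is hypothetical, and step 3 is interpreted as dropping all trailing empty
lines, so the result carries no final CRLF (a real specification would have to
pin that down).

```python
import re

def canonicalize_body(body: str) -> str:
    # Step 1: treat CRLF, bare CR and bare LF all as end-of-line.
    lines = re.split(r"\r\n|\r|\n", body)
    # Step 2: remove trailing WSP from every line.
    lines = [line.rstrip(" \t") for line in lines]
    # Step 3 (as interpreted here): drop empty lines at the end of the body.
    while lines and lines[-1] == "":
        lines.pop()
    # Emit the canonical CRLF between the remaining lines.
    return "\r\n".join(lines)
```

Unlike "nowsp", internal whitespace is left alone, so "Hello World" and
"HelloWorld" remain distinct after canonicalization.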

  - I would add a comment that "simple" is not recommended since
    failure rate can be high.  Also, the current algorithm for it
    violates semantics of RFC-2822.  RFC-2822 states:

      Each header field should be treated in its unfolded form for
      further syntactic and semantic evaluation.
          -- RFC-2822, Sec 2.2.3.

    At a minimum, the "simple" algorithm should require unfolding
    of header fields.  If you choose not to require unfolding, you
    should add a note about why this is done in violation of
    RFC-2822 semantics.

    From a performance perspective, I do not think "simple" buys
    much.

Exactly so. Simple mode is going to see lots of failures, so what may
be simple for the implementor ends up being anything but for the user.

  - I find it odd that the header field that will contain the
    signature (DKIM-Signature) must also be included in the
    signing/verification process.

I find it odd as well, but maybe there's a security-related reason for
it I'm not seeing.

    Why isn't the signature data provided in its own separate
    header field to avoid having to extract out the sig data
    first and dealing with ambiguities of whitespace?  For example,
    is the whitespace before and after the "b=" tag also removed,
    or only the whitespace after (or before)?

...

    I am speculating that the rationale to put everything in one field
    is to make multiple DKIM-Signature fields possible without
    the problem of knowing which signature applies to which DKIM-Spec
    field.  I think this problem is solvable.

Well, one obvious way to do it would be to use two fields with the same name.
Preservation of order of same-named fields seems to be nearly universal.

* Section 3.5:

  - The ordering restriction of trace header fields (mainly Received)
    is explicitly defined in RFC-2821, not in RFC-2822.

    Note, since DKIM fields are to be handled like trace fields, then
    splitting the signature from the meta-info can be done, with the
    field order in the header defining which field goes with which
    in case of multiple signatures.

  - IMHO, the "v=" tag should be required.  It is always better practice
    to be explicit when possible.  For example, if the "v=" is missing,
    could it be due to an error by the signer?  Data corruption
    during transit?

Agreed.

  - IMHO, I would not support having header fields listed in "h=" unless
    they are present during signing.  Otherwise, when a header field
    is missing, one does not know if this was intentional or a header
    field got dropped in transmission (either accidentally or by filters)
    before verification.  Unless there is a clear reason to support
    listing fields that are not present, why allow it?

Seems reasonable to me.

...

  - For the "t=" tag, why not use ISO date/time format?  For example:

      20050714T045532

A reference to RFC 3339 would seem to be in order here.

    Unix second time format is, well, too unixy.  For email, such
    OS-specific type formats should be avoided.  I recommend using
    well-defined standard date format for all dates.

    Plus, the ISO format is more readable by humans.

Yep.
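
To illustrate the readability point, here is a small sketch converting between
Unix seconds and an RFC 3339 timestamp. It assumes the time is in UTC (the
example above does not state a zone), and the helper names are mine.

```python
from datetime import datetime, timezone

RFC3339_UTC = "%Y-%m-%dT%H:%M:%SZ"

def unix_to_rfc3339(seconds: int) -> str:
    # Render seconds since the epoch as an RFC 3339 UTC timestamp.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(RFC3339_UTC)

def rfc3339_to_unix(stamp: str) -> int:
    # Parse a "Z"-suffixed RFC 3339 timestamp back into epoch seconds.
    parsed = datetime.strptime(stamp, RFC3339_UTC).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())
```

Read as UTC, the example moment above is 2005-07-14T04:55:32Z in RFC 3339
form and 1121316932 in Unix seconds; only the former can be checked by eye.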

  - "z=" is a mess, and can eventually lead to a DKIM field that is
    longer than 998 octets (RFC-2822 limit).

    You mention that verifiers should not use copied header fields
    for verification.  I do not agree with this.

    This leads to a discussion about when signing is done.  There is
    an implication that the sender MTA should do this, right before
    transmitting to final destination.  Well, this does limit when
    signing can be done and who can do it.

Well, I like the implication that the sending MTA should do it. However,
exactly when that MTA should do it is going to be a tricky matter. I for one am
mulling this one over big-time in considering how I'm going to implement it.

    It also does not protect from potential address re-write rules.
    I.e. Even if the initial signing MTA does signing after
    rewrite rules, intermediary ones may still do address re-writing.
    One benefit of saved headers is the verifying agent could utilize
    them for signature verification, bypassing any re-write rules
    that may happen by MTAs.

    It would be more flexible to have a signing method that is
    not dependent on where the signing occurs (e.g. an MUA could
    do it vs an MTA).

I'm sorry, but I stop short of this. We need to look at the 95% case and not
let ourselves get distracted by the allure of an all-inclusive perfect scheme
that is so complex nobody can implement it.

* Section 9.2:

  - The discussion here implies that signing can be done at the MUA
    level, which ties into my comments above about saved header fields
    and address re-writing cases.

We need to face the facts of life here, and rewriting is one of those facts.
I also don't think the complexity of saved headers is worth it, and that
means MUA-level signing is simply not going to work in many cases.

  - The need for key revocation policies is implied here, which is
    touched upon in Section 9.6.  Key management is critical for
    the system to be secure and trusted by users, therefore, it
    definitely should be spelled out, potentially in a separate
    specification.

I'm unsure that such facilities are really needed here, and would like to
hear more about the pros and cons.

                                Ned

