[ietf-dkim] Pete's review of 4871bis

Folks,

I've been reviewing 4871bis for the IESG review on Thursday. I've got a 
huge number of comments. Some of them (e.g., the first two below) are 
quite trivial or simple issues that I expect you'd either have no issue 
with or that I'm happy to ignore; some of them are clearly not trivial 
at all. I have not entered a ballot position yet, and I haven't sorted 
through all of the comments to decide if any are worthy of entering a 
DISCUSS, but I suspect that some of them are. I therefore wanted to post 
this to the WG list so you all get an idea of what I'm thinking about. 
Tomorrow I'll spend some time cleaning these up and sorting them by 
category instead of by order of the document.

I apologize for the lateness of this review; it's probably something I 
should have done a long time ago.

Comments:

1. I find the use of the word "INFORMATIVE" throughout the document 
superfluous. No other word (e.g., NORMATIVE) is used in its place. I 
think they should all be removed. They serve no purpose.

2. The ABNF "0*" construct is not normally used. Why not just "*"?

3. Section 2.7: I don't understand what the word "module" means in this 
context. It sounds like some sort of specific implementation, not part 
of a protocol. I don't know what it means to deliver values to a module.

4. Section 3.4.4:

       INFORMATIVE NOTE: It should be noted that the relaxed body
       canonicalization algorithm may enable certain types of extremely
       crude "ASCII Art" attacks where a message may be conveyed by
       adjusting the spacing between words.  If this is a concern, the
       "simple" body canonicalization algorithm should be used instead.

This is not an attack, or at the very least it is not an attack on DKIM. 
You can play this same game with MIME encodings or other textual tricks. 
I don't see any justification for this note being in this document.

5. Section 3.4.5:

       INFORMATIVE IMPLEMENTATION NOTE: Using body length limits enables
       an attack in which an attacker modifies a message to include
       content that solely benefits the attacker.  It is possible for the
       appended content to completely replace the original content in the
       end recipient's eyes, such as via alterations to the MIME
       structure or exploiting lax HTML parsing in the MUA, and to defeat
       duplicate message detection algorithms.  To avoid this attack,
       signers should be wary of using this tag, and verifiers might wish
       to ignore the tag, perhaps based on other criteria.

I don't understand how this is an attack. If the signer said, "I'm 
signing the first n bytes of the body of this message" and the verifier 
is able to verify that the signature is valid for n bytes of the body, 
the algorithm has succeeded. At most, this should lead to an admonition 
that unsigned data is not verified and therefore should not be trusted, 
but that seems obvious. There might be an attack on an MUA that does not 
verify the DKIM signature, but that seems outside the scope of this 
document.

6. Section 3.5:

    v= Version (MUST be included).  This tag defines the version of this
       specification that applies to the signature record.  It MUST have
       the value "1".  Note that verifiers must do a string comparison on
       this value; for example, "1" is not the same as "1.0".

Why "string comparison" here? There's no case sensitivity to worry 
about. There's no character encoding to worry about. Why not instead 
say, "Note that verifiers must (MUST?) ensure that the value is '1'; for 
exmaple '1' is not the same as '1.0'"? (Seem similar text in 3.6.1.) And 
why is the value not listed as "plain-text"?

7. Section 3.5: It would be nice to subsection each tag here. (e.g., 
"v=" could be 3.5.1, etc.)

8. Section 3.5, "h=":

It would be nice to add a sentence similar to the one in 5.4: "The field 
MAY contain multiple instances of a header field name; this means that 
multiple occurrences of the header field are being signed by the signing 
algorithm."

9. Section 3.5, "h=":

          INFORMATIVE EXPLANATION: By "signing" header fields that do not
          actually exist, a signer can allow a verifier to detect
          insertion of those header fields after signing.  However, since
          a signer cannot possibly know what header fields might be
          created in the future, and that some MUAs might present header
          fields that are embedded inside a message (e.g., as a message/
          rfc822 content type), the security of this solution is not
          total.

a. I don't understand what MUAs presenting "header fields that are 
embedded inside a message" has to do with anything.

b. I don't know what the words, "the security of this solution is not 
total" are supposed to mean.

Don't you simply mean: "However, since a signer cannot possibly know 
what header fields might be defined in the future, this mechanism can't 
be used to prevent the addition of any possible unknown header fields."?

10. Section 3.5, "h=":

          INFORMATIVE EXPLANATION: The exclusion of the header field name
          and colon as well as the header field value for non-existent
          header fields allows detection of an attacker that inserts an
          actual header field with a null value.

I cannot parse that sentence. I have no idea what it means.

11. Section 3.5, "l=", "INFORMATIVE IMPLEMENTATION WARNING". See comment 
in 3.4.5 above. Same comment.

12. Section 3.6.1: It would be nice to subsection each of the tags.

13. Section 3.7: "content-hash" is not defined.

14. Section 3.9:

a. Neither TEMPFAIL nor PERMAFAIL is defined at this point.

b. I have no idea what the "output of a DKIM verifier module" is 
supposed to mean.

I suspect that section 3.9 is at least misplaced in this document, and 
probably completely unnecessary as it sounds like implementation details.

15. Section 3.10: Is this not completely redundant with the text in 
3.6.1 regarding "t=s"?

16. Section 4.1: The "INFORMATIVE EXAMPLES" don't need to be called out 
as such. The title of the section is "Example Scenarios". All of the 
text here is example, and as such it is all informative.

17. Section 4.2, first INFORMATIVE NOTE: Why is this an informative 
note? It seems like good protocol adivce and therefore should say that 
signers SHOULD NOT sign exiting DKIM-Signauture fields.

18. Section 4.2, last paragraph: PERMFAIL is still not defined to this 
point.

I suspect sections 3.8 through all of section 4 belong after (or in) 
section 6.

19. Section 5.1, INFORMATIVE NOTE: Instead of saying "Signing modules 
may be incorporated into any", how about "A signer can be implemented as 
part of any"?

20. Section 5.1: I don't know what the term "signing address" means.

21. Section 5.3:

    Therefore, a verifier
    SHOULD NOT validate a message that is not compliant with those
    specifications.

This section is about signing, not verifying. What is that sentence 
doing in there?

22. Section 5.4:

       INFORMATIVE ADMONITION: Despite the fact that [RFC5322] permits
       header fields to be reordered (with the exception of Received
       header fields)

5322 does *not* permit header fields to be reordered. It says:

    ...header fields SHOULD NOT be reordered when a message is transported
    or transformed.  More importantly, the trace header fields and resent
    header fields MUST NOT be reordered, and SHOULD be kept in blocks
    prepended to the message.

Suggest: "Despite the fact that [RFC5322] does not absolutely prohibit 
header fields to be reordered..."

23. Section 5.5: Though section 5.5 is titled "Recommended Signature 
Content", it is clear that the entire section is devoted to the topic of 
section 5.4: "Determine the Header Fields to Sign". These two sections 
should be combined, and might be subsections of a larger section. But it 
was very confusing to read these as separate.

24. Section 6.1:

    A verifier SHOULD NOT treat a message that has one or more bad
    signatures and no good signatures differently from a message with no
    signature at all; such treatment is a matter of local policy and is
    beyond the scope of this document.

The two clauses in that sentence seem contradictory. Is there an 
interoperability requirement that I not treat messages in a particular 
way, or is it a matter of local policy?

25. Section 6.1, first "INFORMATIVE NOTE" on attack by many bad sigs: 
Shouldn't this be a security consideration instead?

26. Section 6.1 (and subsections): This section is playing some sort of 
game. The beginning of 6.1 describes some "statuses" and says things 
like, "The '(explanation)' is not normative text; it is provided solely 
for clarification." If it wanted to give an algorithm for Verfiers to 
follow, where there was a certain state machine with states of 
"PERMFAIL" and "TEMPFAIL", that would have been OK. But it is clear from 
the use of the phrasing, for example, "return PERMFAIL (signature syntax 
error)", the document is trying to sneak in some sort of actual APIs 
into the protocol spec. I think this is bogus and would much prefer that 
these sections be rewritten to say, "enters a PERMFAIL state", perhaps 
labeling each paragraph with the explanation. For example, the first 
paragraph of 6.1.1 could read:

    Signature syntax

    Implementers MUST meticulously validate the format and values in the
    DKIM-Signature header field; any inconsistency or unexpected values
    MUST cause the header field to be completely ignored and the verifier
    enters a PERMFAIL state.  Being "liberal in what you accept" is
    definitely a bad strategy in this security context.  Note however that
    this does not include the existence of unknown tags in a DKIM-Signature
    header field, which are explicitly permitted.

    Version compatibility

    Verifiers MUST enter a PERMFAIL state when presented a DKIM-Signature
    header field with a "v=" tag that is inconsistent with this
    specification.

The current text is too cute by half, and I think it obfuscates the meaning.

27. Section 6.1.3: I don't understand the meaning of the note, "(note 
that this version does not actually need to be instantiated)".

28. Section 6.1.3:

    verifiers might treat a message that contains bytes beyond the
    indicated body length with suspicion, such as by declaring the
    signature invalid (e.g., by returning PERMFAIL (unsigned content)),
    or conveying the partial verification to the policy module.

Up to the word "suspicion", fine. After that, it is complete nonsense. 
The signature is not invalid. If what you mean is "and may choose to 
treat the signature as if it were invalid (e.g., by going into the 
PERMFAIL state)", then say that. And again, this is trying to wrap APIs 
and implementations choices into the protocol.

29. Section 8.1: See comments above regarding section 3.4.5.

30. Section 8.2: I don't see how this is DKIM specific. More 
importantly, this section doesn't explain how user private keys relate 
in any way to keys used in DKIM. Shouldn't this be a discussion about 
security of private keys in general (not just ones on user machines)?

31. Section 8.5: I don't understand what the attack here is that has 
anything to do with DKIM. I especially don't see why the accomplice is 
an essential part of the example, unless all that is meant by 
"accomplice" is "relay". The attack sounds like, "people can spam signed 
messages", which is not an attack.

32. Section 8.14: This is a complete mischaracterization of the problem. 
This is absolutely not an attack vector. If a signer wishes to prevent 
additional *known* header fields from being verified, it can simply use 
the technique outlined in 8.15. If the signer chooses not to do that, it 
is expressing the intent that it considers messages valid that have 
additional header fields added. *That's* the security consideration: 
Signers should know that failing to include, e.g., "h=...:from:from:..." 
on messages with one "From:" header field are leaving themselves open to 
this attack. The attack is not on the verifier. Additionally:

    Note that the technique for using "h=...:from:from:...", described in
    Section 8.15 below, is of no effect in the case of an attacker that
    is also the signer.

That sentence is utter nonsense. The signer *can't* be the attacker for 
purposes of DKIM when it comes to the header fields it's willing to 
sign. The Introduction (section 1) makes it absolutely clear that DKIM 
is about transmitting information from a signer to a verifier.

Section 8.14 needs to be completely rewritten. It is nonsensical as it 
stands.

33. Section 8.15:

    However, [RFC5322] also tolerates obsolete message syntax, which does
    allow things like multiple From: fields on messages.  The
    implementation of DKIM thus potentially creates a more stringent
    layer of expectation regarding the meaning of an identity, while that
    additional meaning is either obscured from or incorrectly presented
    to an end user in this context.

DKIM can perfectly well deal with the obsolete syntax. The signer can 
sign as many "From:" fields as the message has when signed, and add a 
":from" to the "h=" tag to insure that the right number of them are 
verified. As in 8.14, the attack is not on DKIM per se; it's on signers 
that don't properly use the "h=" tag to sign those header fields that 
they don't want added to.

Other malformed input might cause problems for a DKIM implementation, 
but multiple header fields where 5322 3.6 only allows one is not that 
sort of attack.

-- 
Pete Resnick<http://www.qualcomm.com/~presnick/>
Qualcomm Incorporated - Direct phone: (858)651-4478, Fax: (858)651-1102

_______________________________________________
NOTE WELL: This list operates according to 
http://mipassoc.org/dkim/ietf-list-rules.html