Re: [ietf-dkim] Comments on draft-ietf-dkim-rfc4871-errata-00

Jim,

Responding with my own views, and expressed in a longer note than one might
wish, but a thorough review warrants a thorough response...


Jim Fenton wrote:

1. Is this a working group draft?  The title would seem to imply that it
is but I don't remember that happening.


As you know, 10 of us worked on developing the draft.  That's a substantial 
percentage of active working group participants.  When we reached nearly 
unanimous agreement on it, I submitted it as an I-D, with its current name.

A new I-D that uses a wg name must be approved by a wg chair.  Stephen approved 
it.

2. Section 1, paragraph 2: "a consumer of the validated signing domain"
is confusing:  "consumer" in particular.  DKIM doesn't say anything
about how the results would be used, so all DKIM really has is a signer
and a verifier (which is the term used throughout the spec, not
validator). "...a user of successful verification results" also sounds
too much like it's a human and not some other module that acts on the
result.


It's certainly true that we need to be clear about the model and the language.

I chose "consumer" as per the common distributed processing vocabulary of 
producer/consumer.  Didn't use it consistently, but that's where it came from.

On reflection, "user" is indeed a very poor choice.  Client or consumer would be
more typical distributed processing label choices.  Perhaps there are other
choices?

I don't know of any problems with the word 'verifier' that was already in the
spec, and I should have continued its use, rather than validator.


Your view that DKIM doesn't have any input or output, other than signing and
verifying, is a much deeper disconnect.  I believe it represents a commonly held
view, but that it is the source of many problems in discussing DKIM.  Some years
ago, Ned Freed highlighted the distinction quite nicely, but I can't find his
original text.

My feeble reconstruction is that DKIM's job is for one party to tell another
that the first is 'responsible' for a message.  It does this by communicating an
identifier in the message, so the consumer can make assessments based on the
identity of the producer.  It declares the producer's responsibility.  To
achieve this goal, DKIM uses authentication technology so that the consuming
party can trust that the producer's use of the identifier was valid.  In other
words, DKIM authenticates the carriage of an identifier, but all of the spiffy
authentication technology turns out to be a distraction, when talking about
DKIM's "purpose".

Indeed, this is approximately what the approved DKIM documents say...

      DKIM Signature specification, Introduction:  "permitting a signing domain
to claim responsibility for the introduction of a message into the mail
stream."
The surrounding references to crypto, etc, are all in the service of this quoted
goal.  Note that it does not mean much for the producer to claim responsibility
if there is no consumer of that information.  Simply verifying the identifier
can't be the goal.  Consuming it for further processing is the goal.

      DKIM Overview, Introduction:  "DKIM allows an organization to take 
responsibility for a message, in a way that can be validated by a recipient." 
Again, it does not mean much for one side to take responsibility (and 
communicate that fact via DKIM) if nothing on the other end uses the 
information.  Hence, the real purpose of DKIM is to give some identification 
information to that other end to use.  Hence, DKIM has real "output".

For the draft Deployment document that we are developing, I've suggested the 
following diagram, primarily derived from a conversation with Mike Adkins, of 
AOL, when he noted that the module that comes after verification is the real 
target of DKIM.

Personally, I've found that it greatly clarifies the systems view of DKIM's 
role:

   +------+------+                            +------+------+
   |   Author    |                            |  Recipient  |
   +------+------+                            +------+------+
          |                                          ^
          |                                          |
          |                                   +------+------+
          |                                -->|  Handling   |<--
          |                                -->|   Filter    |<--
          |                                   +-------------+
          |                                          ^
          V                                          |
   +-------------+  Responsible Identifier    +------+------+
   | Responsible |...........................>|  Identity   |
   |  Identity   |                            |  Assessor   |
   +------+------+                            +-------------+
          |                                         ^ ^
          V        DKIM Service                     | |
 +-----------------------------------------------+  | |
 | +------+------+              +-------------+  |  | |  +-------------+
 | | Identifier  |              |  Identifier +--|--+ +--+ Assessment  |
 | |   Signer    +------------->|  Validator  |  |       | Databases   |
 | +-------------+              +-------------+  |       +-------------+
 +-----------------------------------------------+

3. same paragraph: '"responsible" domain name' implies that the result
is the domain name (which I probably don't agree with, see below) and
the last line talks about "details about the name".  Details about the
domain name?  I'm not sure what details of a domain name are, unless
you're hinting at the use of a subdomain with a particular name.


ack.

4. Section 2:  Section 2.7 should come later to avoid forward


ack.

references.  Identity Assessor is a poor name for this module because
it's not assessing an identity (depending on your definition of
identity, the Verifier might be doing that) but rather something based
on the signing identity.  I'm not sure what it means by "optionally


The verifier confirms that the identifier is valid.  It performs no reputation 
or other "assessment" of the person, role or organization that owns that 
identifier.  Assessment is a nicely neutral term, without the kind of baggage 
that 'reputation' or some other words carry.

So, there clearly is a module that comes after verification and that module is 
postulated as using the identifier for some sort of analysis (assessment)

consume the UAID".  Does that mean that it can ignore it if it is
provided?


That's my interpretation of that text, yes.

         I suppose, but DKIM-base specifies a protocol, not the API
that the verifier uses to communicate with something downstream.


I think you are confusing protocol functionality with systems implementation.

Protocols have payload.  They deliver stuff.  The domain name is DKIM's payload.
An implementation has an API.  The Errata draft does not talk about an API.  It
talks about output.

It is common and, I believe, absolutely essential, to view any particular
protocol as having specific inputs and outputs, and to distinguish this from the
stuff that is done inside the protocol, in order to take in the input and give
out the output.
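
To make the "output, not API" distinction concrete, here is a rough sketch of
what I mean by the payload.  It assumes the signature has already been verified
and only shows what gets handed onward: the d= value (SDID) always, and the i=
value (UAID) only if the downstream module asks for it.  The names
VerifierOutput, parse_tags and verifier_output are made up for illustration and
are not from any spec or library, and the tag parsing is a simplification of the
real tag-list syntax.

```python
from typing import NamedTuple, Optional


class VerifierOutput(NamedTuple):
    """Hypothetical container for what DKIM delivers after verification."""
    sdid: str             # value of the d= tag
    uaid: Optional[str]   # value of the i= tag, or None if not passed along


def parse_tags(header_value: str) -> dict:
    """Simplified tag-list parser: 'name=value' pairs separated by ';'."""
    tags = {}
    for part in header_value.split(";"):
        name, sep, value = part.partition("=")
        if sep:
            # Drop folding whitespace inside the value (a simplification).
            tags[name.strip()] = "".join(value.split())
    return tags


def verifier_output(header_value: str, include_uaid: bool = False) -> VerifierOutput:
    """Return the payload of an already-verified DKIM-Signature field.

    Signature verification itself is deliberately out of scope here.
    """
    tags = parse_tags(header_value)
    sdid = tags["d"]  # d= is required in a DKIM signature
    uaid = tags.get("i") if include_uaid else None
    return VerifierOutput(sdid, uaid)


sig = "v=1; a=rsa-sha256; d=example.com; i=@news.example.com; s=sel; b=abc="
print(verifier_output(sig))
print(verifier_output(sig, include_uaid=True))
```

How an implementation actually hands this to the next module (function call,
header field, whatever) is the API question, and is a separate matter.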

5. Section 3: "identity claiming responsibility" -> "domain claiming
responsibility" since that's what the base spec says it does (4871


Yeah, that's an error in the base spec that needs fixing.

As for identity, versus identifier, versus...

Overall, the security community uses these terms inconsistently, IMO.  It 
probably does not matter which exact words we use, as long as we define them 
very carefully and use them very consistently.  That suggests the need for some 
more Terminology definitions.

The rationale for the words used in the Errata derives from an interaction I had
a couple of years ago, with an identity-related security expert whom I hold in
high regard.  After a meeting, he essentially took me out behind a shed and
politely whacked me up the side of the head, for messing up concepts and
terminology.  I've tried to be very careful with the words ever since...

There is a person, role or organization that does stuff, such as creating a
message, or submitting a message or relaying a message, or performs some other
sort of handling action with a message, for which they are willing to take some
responsibility.  There needs to be a term for "person, role or organization" and
the document uses "identity" for that.

Identity is, therefore, a reference to something non-technical that uses 
technical stuff like email and DKIM.  With DKIM, the identity needs to affix a 
label that refers to the identity.  The document uses the term identifier for 
this purpose.  Hence "identity" is an abstract thing and identifier is a 
reference to the thing.

If you send me one piece of mail signed Jim Fenton and another signed James
Fenton, I evaluate them under a single body of knowledge, namely the identity
that is in use.  The two different strings you used are different identifiers
that refer to the same identity.

A domain name is an identifier.  For DKIM it is a label that refers back to the 
entity taking responsibility.  Given that a domain name is nothing but a bunch 
of bits, it isn't possible for it to "take responsibility".  The owner of the 
domain name takes the responsibility.  The owner is a person, role or 
organization.  In other words, an identity.
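
For those who think better in code, a toy sketch of the distinction follows.
The Identity class and the directory mapping are purely illustrative
assumptions, not anything in the draft: the point is only that identifiers are
strings, and more than one of them can refer to the same identity.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    """The abstract thing: a person, role or organization (hypothetical model)."""
    description: str
    identifiers: set = field(default_factory=set)


# One identity, known by two different identifier strings.
fenton = Identity("a person", {"Jim Fenton", "James Fenton"})

# A lookup from identifier (the label) to identity (the thing referred to).
directory = {identifier: fenton for identifier in fenton.identifiers}

# Different strings, same body of knowledge on the receiving side.
print(directory["Jim Fenton"] is directory["James Fenton"])
```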

section 1).  I don't see how this is opaque unless you say that domain
names are opaque (which I'm not going to argue) but I find the word
"opaque" in this definition confusing.


Opaque means that the DKIM specification imparts no semantics for two domain 
names that might appear to a human to be related.  That a human might see a 
possible relationship and that they might program their software to take 
advantage of it is entirely reasonable, but it is outside the spec.

By way of similar example, the message you sent had some Received header fields 
with a number of domain names, including:

      dhcp-171-71-97-221.cisco.com
      xfe-sjc-211.amer.cisco.com
      sj-core-1.cisco.com

While a human can see all sorts of stuff to interpret, there, the semantics are 
simply that each string can be fed into the DNS and should return one or more 
RR's.  There is no 'relationship' from one of these names to the next, in terms 
of Received field semantics.

One of the various sources of confusion about DKIM comes from individuals making
assumptions about how "related" domain names will be treated by receivers.  For
the DKIM spec, there is no relationship.  Anything a receiver does beyond that
is just that:  beyond the spec.
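
Put as a sketch, "opaque" means the only comparison the spec gives you is
whole-name equality.  The function name below is my own invention, and the
case-folding reflects ordinary DNS case-insensitivity rather than anything
particular to the Errata draft.  Note what is absent: there is no walking up to
a parent domain and no suffix matching.

```python
def same_identifier(a: str, b: str) -> bool:
    """Compare two domain names as opaque identifiers (illustrative helper).

    DNS names are case-insensitive, so compare case-folded; a trailing root
    dot is ignored.  No parent/child relationship is inferred.
    """
    return a.rstrip(".").lower() == b.rstrip(".").lower()


names = [
    "dhcp-171-71-97-221.cisco.com",
    "xfe-sjc-211.amer.cisco.com",
    "sj-core-1.cisco.com",
]

# Each name matches only itself; sharing a suffix means nothing here.
for first in names:
    for second in names:
        print(first, second, same_identifier(first, second))
```

A receiver is free to add suffix matching on top of this, but that is a local
policy choice and not something the signer can count on.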

6. Section 4: User Agent Identifier is a terrible name for this.  The
UAID is the value of the i= tag, it says later.  i= is defined as the
"Identity of the user or agent (e.g., a mailing list manager)..." but
nobody said anything about a user agent.  We already have the term
"signing identity" used in the specification, and it's quite clear what
it means.


The term "signing identity" is used ambiguously in the base spec.  The goal for 
the Errata draft was to substitute two new terms that did not (yet) have to 
carry the baggage of that confusion.

I'm pretty sure no one is in love with the two acronyms we produced but, as you 
know, it was produced through a group discussion.

[Aside:  Having hung out with some people in the identity community
since the publication of RFC 4871, the one terminology change I would
make is to change "signing identity" to "signing identifier" because the
i= value isn't an identity, it's a pointer to an identity...an identifier.]


I completely agree.  But that leads to the problem of appearing to claim that an
identifier takes responsibility.

7. Section 5 talks a lot about how the sender can put all sorts of
things into the i= value.  When we wrote the spec, there was a clear
sense that most signers wouldn't even assert a local-part in the i=
value, and unless they were doing parent domain signing (section 3.8),
probably didn't need to use i= at all.


Whereas my own assessment of that history is that there were *individuals* who 
each tended to have their own, personally clear sense of things, but that there 
was very little clarity amongst the working group.  More importantly, a clear 
sense when predicting a future does not make the prediction reasonable to make 
or correct, especially when the prediction is stochastic, with language like 
"most", and most especially when the spec gives no clear or strong direction 
about usage.

In any event, what we have seen quite a bit since the spec was published is 
confusion and that's what prompted the Errata effort.

      That's simple and elegant.
Instead, this section is emphasizing how signers can put all sorts of
things in the i= value if they want to.  What's the motivation for
overloading the i= value with this stuff?


As written, the specification allows it.  Over the history of protocol
specifications it turns out to be quite common that users of a protocol exercise
whatever flexibility the specification permits (and usually more than that).

                If they want to say something
only significant to themselves (perhaps for abuse tracking), it's easy
to define another tag for that value.  If they want to give a hint to a
reputation service that might want to do something below the domain
level (this is of dubious benefit; we don't know enough about reputation
yet) they can define a tag for that in an extension to the DKIM spec. 
We didn't do that in DKIM-base because we didn't know enough about what
was needed (and I'm not sure we do yet).

8. Section 5:  First paragraph seems to read that, even if the i= tag is
in the signature that verifies correctly, it's OK for the verifier not
to communicate that to an assessor.  Since some assessors may depend on
this value, for interoperability reasons the value MUST be communicated
if present.


Your assessment of dependence takes a particular role of i= as a given, when
it's been clear from recent discussion that its role isn't clear.  And clearing
that up is one of the goals of the Errata draft.

As for the option not to pass the value, the implication of your view is that
DKIM's output is actually two different identifiers, rather than the one that is
stated in the Introduction.

9. Section 6:  The old definition of d= seems much clearer.

10. Section 7:  The replacement with UAID loses a useful example and
implies it's always a User Agent.


What example is lost?

Sections 8-11 just seem to be sprinkling the UAID and SDID acronyms
elsewhere into the text.

Section 13:  Good grief, are the authors corporations now?  What
happened to IETF being an organization of individuals?


Wow.  Hadn't seen that.  Gosh, those companies sure are sneaky, what with making
me type that contact info into the draft the wrong way...

Appendix A:  Was the reference to d= intended to be humorous?  Perhaps
you should have used SDID rather than d=.


It was meant to be humorous.

=====

General comments:


Despite its intent to clarify RFC 4871, I find the new terminology at
best adds little (SDID) and sometimes adds new confusion to the document
(UAID).  It emphasizes uses of the protocol, particularly "creative"
ways to use the Signing Identity, that if they need to be described at
all, probably belong in the Operations document.  It also attempts to
partially specify the API between the Verifier and a downstream
Assessor, which is not the subject of the DKIM specification.

-1.



-- 

   Dave Crocker
   Brandenburg InternetWorking
   bbiw.net