Re: The address-in-certs issue

Paul Hoffman / IMC wrote:

At 08:01 PM 12/22/97 -0800, John Gardiner Myers wrote:

If a CA only uses DNs and the receiving UA is only able to present
RFC822 addresses in a way that is comprehensible to the receiving
end-user, then the CA and the UA have failed to interoperate.


True, but I assert that that failure is due to the presentation problem,
not to the protocol.


Huh?  If the CA and receiving UA fail to interoperate, at least one of
the following has occurred:

A) The CA is non-conforming
B) The UA is non-conforming
C) Something in the path between the CA and UA is broken.
D) The S/MIME standard is broken.

For purposes of my statement above, we can rule out (C).  Which of the
remaining is it?


OK, I misworded my answer. Instead of saying "True", I should have said
"False". The UA and CA can interoperate even if the UA can't present RFC822
addresses in a way that is comprehensible to the receiving end-user. It's
the UA and the end user that are not interoperating. That's out of scope.


That is hardly out of scope.  If the standard failed to give the UA
enough information/constraints on its input to allow the UA to present
intelligible information to the user, then the standard has failed.
UAs are not magic; they cannot make a silk purse out of a sow's ear.

The format of the identifier controls how the UA has to
present it, and it controls how humans are able to interpret it.


Yes to the former, no to the latter. Humans interpret it as we damn well
please. Saying one kind of opaque string is more understandable than
another, particularly when you are guaranteed that an intelligent piece of
software will be displaying it, is silly.


RFC822 addr-specs are not opaque in the sense that binary blobs are
opaque.  And in an e-mail context, they are far more intelligible to
humans than DN's.

If the software does not have knowledge of what a thing is, it cannot be
intelligent about displaying it.

If you can't do anything with the identity, what's the point in having
it assured?


You have already said you can't do anything reliable with
"cjones1832@blobbo.com". It may not be a recipient for a future message, so
putting it in an address book is wrong. Comparing it to unauthenticated
information that came with the signed message is unsafe so should not be
done. What else were you planning on doing with it?


I have described in detail what one can do with something like
"cjones1832@blobbo.com".

It tells a user that the identity is something at "blobbo.com", and that
the user can apply some external knowledge concerning the identifier
policies of "blobbo.com" to get further knowledge of the identity.
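
A minimal Python sketch of the split described above, assuming the
identifier is a plain addr-spec with a single "@" separating the two
sides. The function name split_addr_spec is hypothetical, and the sketch
does not validate the full RFC822 grammar or check that the domain
exists.

```python
from typing import Tuple


def split_addr_spec(addr: str) -> Tuple[str, str]:
    """Split an addr-spec into (LHS, RHS) at the last "@".

    Assumption: the RHS is a domain name, so it is lowercased for
    comparison; the LHS is left untouched because its meaning is
    defined by whoever controls the RHS.
    """
    local, sep, domain = addr.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an addr-spec: %r" % addr)
    return local, domain.lower()


lhs, rhs = split_addr_spec("cjones1832@blobbo.com")
print("identity", lhs, "as allocated by", rhs)
```

The only thing the UA extracts is the RHS; what the LHS means under that
RHS still has to come from the reader's external knowledge.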


Ah, I see where we are differing here. You are assuming that some
infrastructure will exist to aid users in getting further knowledge. You
hadn't said this before. Such an infrastructure could give further
knowledge about identifier policies for names in any format.


Information exists from external context.  This is the case for ANY
identifier syntax, not just RFC822 identifier syntax.  This external
context is in general social, not electronic.

An RFC822 name means exactly what the entity that controls the domain
part wants it to mean, no more, no less.  The domain can choose to make
the name usable as a delivery address, or choose not to.


...if the domain even exists. You're making lots of hidden assumptions
here, given that you already said that the identifier didn't have to be a
deliverable address. You are now saying that the right hand side (RHS) of
the identifier must be an existing domain name, not just a syntactically
correct one.


The RHS must be a registered domain.  This is a semantic constraint the
standards place on the RFC822 syntax.

I think you're also saying that it can somehow possibly
respond to queries about LHS names at that domain.


No.  I'm saying there is typically some external context which allows
the recipient to know something about the allocation policies of the
RHS.

Your responses lead to a lot of thorny questions.
- Does the RHS need to exist? If the RHS doesn't need to exist, the user
can't apply any external knowledge.


Yes.

- If the RHS needs to exist, how does a disconnected user validate this? If
he can't validate it, what should he do with the signature?


Through external knowledge.

- Does a domain named in the RHS have to be part of some policy-emitting
infrastructure? If not, what "external knowledge" does the user get from
knowing only that the RHS exists?


The user may have knowledge of the RHS through some sequence of
bilateral agreements.  The user may know about the domain allocation
policies of the higher-level domains that the RHS is in.

- If a domain named in the RHS has to be part of some policy-emitting
infrastructure, who at the domain named in the RHS controls the policy
emitter? Using what protocol? I admit I haven't followed all the other
security work as closely as I would like, but I've never heard of any such
protocol or the semantics associated with it.


Have you heard of the DNS and subdomains?  The immediate parent domain
of the RHS controls everything contained one level directly below it, so
it controls who gets that RHS.  Whoever gets that RHS controls the
policy of that RHS.
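
The delegation chain described above can be sketched in Python. This is
purely a string operation on the name: it assumes each label boundary is
a possible delegation point, does not query the DNS, and the function
name delegation_chain is hypothetical.

```python
from typing import List


def delegation_chain(domain: str) -> List[str]:
    """List the parent domains of `domain`, nearest parent first.

    Assumption: every label boundary is treated as a delegation point.
    Real zone cuts may be fewer; this only shows who could be
    responsible for handing out the name.
    """
    labels = domain.rstrip(".").lower().split(".")
    return [".".join(labels[i:]) for i in range(1, len(labels))]


chain = delegation_chain("mail.blobbo.com")
print(chain)
```

Each entry controls who gets the names one level directly below it, so
knowledge of the allocation policy of any entry in the chain is the kind
of external context a reader can apply.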

- If the site in the RHS doesn't emit a policy, what should a user do with
a signature that has that RHS?


Without such external information about what the identity means, the
identity is worthless.

This is basic stuff and applies to ANY identifier syntax.  If you get
something signed by "Joe Isuzu", you have gained absolutely no
information of value unless you have some external information about
what the identity "Joe Isuzu" means.

Take the identity "c=US, o=Netscape Communications, cn=John Gardiner
Myers".  Without having external knowledge of the identity-allocation
policies of "c=US" and "c=US, o=Netscape Communications" you have no
worthwhile information about the first identity.  If you think the above
identity is equivalent to the identity of a certain carbon-based life
form you have met at various IETFs, you are simply wrong.  If that
particular carbon-based life form gets its brains splattered all over US
101, the entity "c=US, o=Netscape Communications" will ensure that some
other carbon-based life form assumes the credentials and identity of
"..., cn=John Gardiner Myers" in order to retrieve assets belonging to
that identity.

As you can see, I'm skeptical of all of this. I think a better protocol is
to say "UAs that trust a CA to validate signatures also must trust whatever
identifier the CA gives it".


Trusting the identifier and interpreting the identifier are two
different things.

Anything else, including side-validation and
policy finding, is way outside the scope of the document.


Certainly it is outside the scope of the document.  I'm describing the
semantics of various identifier syntaxes, so we can discuss their
applicability to our problem space.

Telling the user "you might recognize the format of the identifier and you
can make lots of possibly-wrong guesses based on that format" is OK, but I
wouldn't force it on every S/MIME implementation. I'd say that instead, we
could use RFC822 addresses as a SHOULD, and make it real clear in the spec
exactly what semantics they *don't* convey.


All S/MIME UA's have to do is display RFC822 identifiers.  Humans have
the resources and social structures to be able to interpret them.

Is it your position to make it correspondingly clear in the spec exactly
what semantics DN's don't convey?  Frankly, I don't find it particularly
useful to spend time specifying exactly what identifiers do and don't
mean.  I find it better to specify the technical procedures for how
their assignment is delegated, and let the meanings derive from that.

Is it your position that other applications do not need to have
restrictions on identifier syntaxes?  If not, what exactly is your
position on mandatory minimum identifier syntaxes?


My position is that the "restrictions" on identifier syntaxes can be
something as general as "a UTF-8 string". UAs and CAs can interoperate with
this. UAs can display this to their human users.


You appear to be putting forward a position that interoperability
requires that all identifier syntaxes have a restriction that they be
displayable by interpreting their contents as a UTF-8 string.  If this
is so, we have gotten somewhere--you have granted that interoperability
requires at least some restriction on identifier syntaxes.

As to your particular proposal for restricting identifier syntaxes to
"displayable as a UTF-8 string", we only need to look at page 27 of
draft-ietf-pkix-ipki-part1-06.txt to dismiss it as unworkable.  One of
the identifier syntaxes defined therein is "iPAddress" whose contents
are clearly not displayable by interpreting as a UTF-8 string.  Even the
directoryName identifier syntax does not conform to your proposed
restriction.  PKIX allows an identifier type to define its own arbitrary
syntax--without knowledge of the specific name type, a UA has no chance
of being able to display the value.
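
A small Python illustration of the iPAddress case, assuming the value is
carried as raw address octets (four for IPv4). The sample address and
both function names are hypothetical; only the standard-library
ipaddress module is used.

```python
import ipaddress


def display_as_utf8(raw: bytes) -> str:
    """Naive display: interpret the identifier's octets as UTF-8 text."""
    return raw.decode("utf-8")


def display_ip_address(raw: bytes) -> str:
    """Type-aware display for a value known to be an iPAddress."""
    return str(ipaddress.ip_address(raw))


raw_ip = bytes([192, 168, 0, 1])  # hypothetical iPAddress contents

try:
    naive = display_as_utf8(raw_ip)
except UnicodeDecodeError:
    naive = None  # these octets are not valid UTF-8

typed = display_ip_address(raw_ip)
print("as UTF-8:", naive)
print("as iPAddress:", typed)
```

The same octets are undisplayable under the generic rule and trivially
displayable once the UA knows which name type it is looking at.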

So, we have clearly demonstrated a need to explicitly define a set of
mandatory minimum identifier syntaxes.

Making the humans
"understand" it is difficult, but as you can see from above, it is
difficult for supposedly "simple" formats as well.


The process by which humans "understand" RFC822 identifiers may be
difficult to explain, but it is easy for humans to do.  Similarly, it is
difficult to explain (or replicate) how humans understand speech, but it
is easy for humans to do.