Goals and Rationale for the Internet Certification System
I've been out of the office for most of a month and thus am
very far behind in my email, especially pem-dev mail that requires
careful reading. I did notice that there has been a lot of discussion
recently about the certification system and about naming, questioning
not just the details but the overall design. I thought it might help
to review the goals that underlie the current certification system. If
proposed changes to the current design don't fulfill some of these
goals, then either the changes should be reconsidered, or the goals
should be re-evaluated.
The advent of applications like the Mosaic front end for the
WWW make it less interesting to read long text messages, so I've noted,
using angle brackets, where the hypertext links should be inserted
to make this more interesting in the future. (Of course, I don't have
a hypertext authoring tool, so you'll have to imagine what it would be
like with the links in place.) This list is entitled "goals," rather
than "requirements," because, after all, this is the Internet and "we
don't need no stink'n requirements!" <link to film clip from "Treasure
of the Sierra Madre">
1. Support key management for confidentiality (encryption)
Not all applications will require confidentiality, and
controls on the use of cryptography (not just export controls) may
make it very difficult to employ encryption across some national
boundaries. Nonetheless, applications such as email are obvious
candidates for confidentiality in many circumstances and it seems
desirable to establish one key management infrastructure that
serves both goals #1 and #2 <insert link to next goal> <insert
link to aphorism "kill two birds with one stone">.
2. Support key management for data origin authentication,
integrity, and non-repudiation (digital signature)
More than confidentiality, mechanisms for authenticating the
origin of and detecting modification of data are an essential
underpinning for Internet security. User authentication (usually
the authentication of data emitted by a process acting on behalf
of a user) is the cornerstone upon which access controls for
resources (e.g., computer access, file system access, network
access, ...) are typically built. Establishing a certification
system that will offer high quality authentication for users,
computers, processes, etc., will permit system and application
builders to provide "better" access control. The access control
can be "better" because it can avoid cleartext passwords, provide
two-way, continuous authentication, and can be independent of
individual system login accounts.
The "broadcast" authentication and integrity provided by
digital signatures can be exploited to address the provenance of
DNS entries, shareware, WWW data, etc. <insert link to Mosaic>.
Thus we should think about signed data being verified not only in
the context of messages exchanged between parties who might know
one another through some out-of-band means, but also signed data
objects that are retrieved from databases around the world,
produced by many people with whom we have had no previous contact.
Even in the context of email, most users are subscribers to
mailing lists where the originator of a message may well be
someone with whom the list members have little or no other
contact. Thus the data origin authentication service offered by
digital signatures often may be useful in contexts that extend
well beyond "local" communication.
The various forms of non-repudiation of communication are a
complex set of services requiring establishment of a time and
semantic context. Digital signatures, public-key certificates,
and CRLs do not, in isolation, provide this service, but they can
serve as essential building blocks for such services.
Non-repudiation goes beyond mere authentication and integrity,
allowing a third party to arbitrate disputes involving alleged
communications. It seems likely that electronic commerce will
rely heavily on the non-repudiation services that can be afforded
through the use of digital signatures and ancillary facilities.
Thus a certification system that is capable of supporting
non-repudiation is potentially an important infrastructure element for
enabling electronic commerce on the Internet.
A major goal of the current certification system design is to
accommodate this diverse set of authentication, integrity, and
non-repudiation requirements within a single framework.
Otherwise, multiple certification systems will arise, yielding
duplicative management overhead and creating undue complexity for
users who are participants in these certification systems.
3. Scale to accommodate hundreds of millions of end users, and
millions of organizations
The Internet today already encompasses a few million users,
according to many estimates <link to various magazine stories on
explosive popularity of the Internet>. If the telecommunication,
cable TV, and entertainment industries in the U.S. <link to news
clips of various mergers, joint projects, etc.> are successful, the
majority of U.S. residents will have Internet access, and similar
expansion of service will likely occur in many other countries as
well. This suggests scaling to accommodate hundreds of millions of
users, perhaps before the end of this decade <link to UN
population statistics>. Today, the number of organizations listed
in the DNS databases is relatively small (tens of thousands),
compared to the totality of organizations in the U.S., Europe, and
Asia. It seems likely that the growth of the individual Internet
user population will cause many, many more companies and
organizations to join as well. This suggests that we should plan
to accommodate millions of organizations <link to international
chamber of commerce statistics>. If the certification system we
design and deploy cannot scale to accommodate this population, the
redeployment could be rather traumatic. Let's learn from our
ongoing Internet address space and routing crisis and adopt a
system that scales to very large communities <link to Internet
email archives for IPng discussions>.
4. Provide descriptive naming for a wide range of entities as part
of the authentication facility.
The fundamental purpose of a public-key certificate is to
bind a name to a public key. <link to 1978 Kohnfelder paper that
defined public-key certificate concept> <link to X.509 spec>. The
form of names expressed in certificates goes a long way toward
determining the ultimate utility of a certificate system <link to
Romeo & Juliet "what's in a name" quote> <link to excerpt from
Wilde's "The Importance of Being Earnest">. Most people, upon
reflection, agree that names in certificates should be globally
unique and that the names should be capable of identifying a wide
range of entities. Moreover, to support the rapidly growing
Internet community, the globally unique, flexible names in
certificates should be capable of being assigned in a distributed
manner.
Human users are frequently cited as "principals" <link to
Saltzer Schroeder IEEE paper on computer security> being
authenticated in applications, but sometimes computers (or
processes executing thereupon) also are principals. In the email
context, mailing lists are valid principals. In many contexts a
role, not the person currently occupying the role, is an entity
with whom one communicates. For example, the moderator of a
mailing list, the network liaison for a site, the operator of a
computer system, are all roles and one may need to communicate in
an authenticated (maybe even a confidential) fashion with these
entities. For continuity, it is often important to authenticate
the role independent of the current role occupant, and sometimes
privacy concerns may dictate such indirection. In network
management applications, devices, network numbers, and autonomous
system identifiers are all reasonable entities to be named and
authenticated. It is useful, if not important, to be able to
distinguish the type of entity being named, e.g., a person vs. a
device.
This line of reasoning suggests that any naming system
employed with certificates should be flexible enough to
accommodate a wide range of names. Ambiguity in the name of a
certified entity is inherently dangerous, as it may lead to a
user being duped, or at least confused, or to software making an
inappropriate access control decision. Remember that the
consumers of certificates for a system that serves a wide range of
applications will sometimes be users, sometimes software, and
sometimes a combination of both. If the names used in
certificates are easily confused, then users may experience the
joy of sending cryptographically confidential messages to the
wrong recipients <link to examples of easily confused DNS names>.
To avoid ambiguity, and to minimize the likelihood that a
user or a system administrator will accidentally confuse two
names, names appearing in certificates should be "descriptive,"
but the descriptiveness of a name is context sensitive. For
example, for the folks with whom I communicate in a professional
capacity, a descriptive name based on my organizational
affiliation (BBN) is likely to be appropriate. For some of them,
my role, as Chief Scientist, is more relevant than my name. For
neighbors, merchants, etc., a name incorporating my home address
is more appropriate. This suggests that no single name will
embody all of the (name) attributes that will make it suitably
descriptive for all communications. Thus one should expect to make
use of multiple certificates to bind keys to some set of different
names, depending on context. Note that there need not always be a
one-to-one correspondence between keys and certificates. In some
circumstances, the same key might be used in multiple certificates
(so long as the semantics associated with the use of the key in
different certificates is not in conflict).
Descriptive names have a natural tendency to be used not only
for identification, but also for various forms of identity-based
authorization, both explicit and implicit. For example, a person
whose name indicates affiliation with an organization may be
accorded explicit access privileges based on that affiliation,
i.e., being viewed as a member of the "group" defined by the
organizational name. For some applications, it would be very
simple and convenient to establish access control list entries
that grant privileges to groups of users based on certified name
syntax and "wildcards," as well as individual user entries. So,
in some sense, one cannot help but embody some degree of
authorization information in a descriptive name.
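To make the wildcard idea concrete, here is a minimal sketch of how an
access control list might match certified names against group and
individual entries. Everything in it is an assumption made for
illustration: the textual name syntax, the "*" wildcard convention, the
function names, and the example subject "Jane Doe" are hypothetical and
are not taken from any PEM or X.500 specification.

```python
# Hypothetical illustration only. A name is written as comma-separated
# attribute=value pairs, most general first; "*" in a pattern matches any value.
# Values that themselves contain commas are not handled by this sketch.

def parse_name(text):
    """Split 'C=US, O=BBN, CN=Jane Doe' into [('C', 'US'), ('O', 'BBN'), ...]."""
    pairs = []
    for part in text.split(","):
        attr, _, value = part.strip().partition("=")
        pairs.append((attr.strip().upper(), value.strip()))
    return pairs


def matches(pattern, name):
    """True if the certified name matches the ACL pattern, component by component."""
    pat, nam = parse_name(pattern), parse_name(name)
    if len(pat) != len(nam):
        return False
    for (p_attr, p_val), (n_attr, n_val) in zip(pat, nam):
        if p_attr != n_attr:
            return False
        if p_val != "*" and p_val.lower() != n_val.lower():
            return False
    return True


# One group entry (anyone certified under the organization) and one
# individual entry. The privileges are made-up labels.
ACL = [
    ("C=US, O=BBN, CN=*", {"read"}),
    ("C=US, O=BBN, CN=Jane Doe", {"write"}),
]


def privileges(certified_name):
    """Union of the privileges granted by every ACL entry the name matches."""
    granted = set()
    for pattern, rights in ACL:
        if matches(pattern, certified_name):
            granted |= rights
    return granted
```

In this sketch the organizational part of the name acts as the "group":
any subject certified under O=BBN picks up the group privilege, while
the individual entry adds to it. A real system would first validate the
certification path before trusting the name at all.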
However, not all access control is best based on identity,
and there are practical limits to how much authorization
information should be bound into a certificate that is designed
primarily for identification. Specifically, there is a limit to
the scope of authority of any individual certificate issuer and
this constrains what should be certified by that issuer. Also,
including more attributes in a certificate generally increases the
likelihood that the certificate will have to be revoked prior to
its (planned) expiration. It is especially inappropriate to
include attributes that vary widely in their expected validity
duration, unless the overall certificate validity is set to the
minimum of any of the attributes (because of the likelihood of
creating circumstances that will result in revocation). A
certificate, whether used for authentication or authorization,
should have a validity interval commensurate with the expected
validity of the attributes it binds together.
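The point about validity intervals can be sketched in a few lines. The
attribute names and dates below are invented for illustration, and the
function name is hypothetical; the only idea taken from the text above
is that the certificate should expire no later than the shortest-lived
attribute it binds.

```python
from datetime import date


def certificate_not_after(attribute_expiry):
    """Return the latest safe expiration date for a certificate.

    attribute_expiry maps each bound attribute to the date after which
    that attribute is no longer expected to hold. The certificate should
    not outlive the shortest-lived attribute, so the minimum is returned.
    """
    if not attribute_expiry:
        raise ValueError("a certificate must bind at least one attribute")
    return min(attribute_expiry.values())


# Made-up example: three attributes with very different expected lifetimes.
example_attributes = {
    "organization": date(1996, 12, 31),
    "role": date(1995, 6, 30),
    "mailbox": date(1994, 12, 31),
}
```

With these made-up dates the mailbox attribute forces the whole
certificate to expire at the end of 1994, which illustrates why mixing
short-lived and long-lived attributes in one certificate is unattractive.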
Some have suggested including the user's mailbox name as an
attribute in his certificate. (Some even have suggested making
the mailbox name the subject name.) Consider users who have
mailboxes are provided not by their employers, but by service
providers such as AOL, CompuServer, of MCIMail. There is no
fundamental requirement that these service providers certify their
users. For these users, it is obvious that their identity is not
well represented by their mailbox names. Rather, users of any of
these services might be certified by residential CAs who could
provide identity certification completely independent of the email
service provider. However, a CA independent of a service provider
might not be in a good position to certify the binding of a user
name to a mailbox.
There is even less reason to base the user's certified name
on his choice of service provider. Doing so has the potential to
create an impediment for the user, should he wish to change
service providers. This would be analogous to requiring a
telephone user to change his phone number based on his choice of
long distance service provider, a practice that does not apply
<link to recent FCC rules that allow owners of 800 numbers to
choose any long distance provider>. Certainly the Internet
community can do at least as well as the telephone system in this
regard <link to derisive references to telephony relative to the
Internet>.
Making a user's certified name be his mailbox name is even
less desirable in general. In many systems, users have limited
opportunities to choose their mailbox name. Users with long
surnames are frequently required to truncate their mailbox name
due to operating system limitations. Systems such as CompuServe
provide users with numerically unique, but totally non-descriptive
mailbox names. To the extent that mailbox names are tied to login
IDs on systems, there is a tendency to select short names that are
ill-suited to descriptively identifying an individual in a large
scale context. (Note this fundamental conflict: names cannot be
globally unique and descriptive and, at the same time, brief. DNS
names, which are prized in part for their brevity, cannot
accommodate large numbers of organizations and still remain
brief.) If the same user employs multiple mailboxes, there is no
intrinsic requirement that he be identified, for authentication
purposes, via different names.
A user, in a professional context, may be granted various
privileges by organizations other than the one that issues him a
certificate that does a perfectly good job of identifying him in
that context. Each of these other organizations could issue him
another identity certificate, or they could sign authorization
credentials that refer to his "primary" certificate, conveying
authorization based on the name in that certificate. This latter
approach leverages the identification function performed by the
primary certificate issuer, and reduces key management overhead by
not requiring additional keys to reflect these other privileges.
Also, revocation of privileges may, in some instances, best be
handled by means other than the CRL mechanism designed for
(identity) certificate management, and separating the two notions
(authentication and authorization) may help avoid overloading a
single revocation mechanism with too many requirements.
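As a rough sketch of the second approach, the following models an
authorization credential that refers to the name in a primary identity
certificate, with the two kinds of revocation kept separate. The record
layouts, field names, and example values are all hypothetical, and
signature verification is deliberately left out; this is not how any
existing PEM implementation represents these objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentityCertificate:
    subject: str   # certified name of the user
    issuer: str    # the "primary" certificate issuer
    serial: int    # serial number, unique per issuer


@dataclass(frozen=True)
class AuthorizationCredential:
    grantor: str    # organization granting the privilege
    subject: str    # name copied from the primary certificate
    privilege: str  # made-up privilege label


def is_authorized(cert, credential, privilege, crl, revoked_credentials):
    """Check a privilege using two independent revocation mechanisms.

    crl holds (issuer, serial) pairs for revoked identity certificates;
    revoked_credentials holds credentials withdrawn by their grantors.
    Signature checks on both objects are assumed to have happened already.
    """
    if (cert.issuer, cert.serial) in crl:
        return False  # identity itself is no longer vouched for
    if credential in revoked_credentials:
        return False  # privilege withdrawn; identity certificate unaffected
    return credential.subject == cert.subject and credential.privilege == privilege
```

Note that withdrawing the privilege touches only the grantor's own
list; the identity certificate and its key remain valid for every other
use, which is the reduction in key management overhead described above.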
5. Accommodate anonymous, but authenticated, email exchange,
signed object posting, etc.
Some people have complained that PEM does not allow encrypted
messages that are not also authenticated. The availability of
free persona certificates addressed that concern <link to RFC 1422
and to RSADSI Persona PCA text>. In a larger context, a user may
wish to contribute some set of signed objects to a database, and
ensure their integrity and the continuity among them. In many
cases encrypting the objects would interfere with their use, and
the authentication and integrity of the objects may need to be
validated by individuals unknown to their creator. A digital
signature addresses this goal, if the name associated with the
signature through the certificate binding is anonymous. However,
it is also important to prevent digital signature facilities from
being abused in a fashion that could erroneously indicate
authorship by someone who did NOT wish to be declared an author.
Thus persona names must be strictly separate from "real" names.
(Schemes that do not include explicit declaration of the policies
associated with certification create a substantial opportunity for
users to be confused, if not duped outright.)
6. Accommodate different levels of user and organization
authentication, and related security procedures
In a system that hopes to encompass a very large population,
it is necessary to accommodate a range of certification rigor.
Some certificates will be issued with sufficient rigor to serve as
a foundation for electronic commerce, others will be used for
informal interpersonal email exchanges, and others for supporting
access control to computing and network resources of varying
value. While accommodating this diverse range of certification
rigor, it is necessary to make explicit the context in which each
certificate is issued, to avoid surprises for users. Schemes that
fail to include explicit declaration of certification policies and
procedures, so that users can engage in informed evaluation of
the semantics of certification, create the potential for users to
be confused, if not duped. Diversity must be accompanied by
disclosure, and by a means of enforcing the accuracy of the
disclosure. (Although social pressure has often been an effective
means of enforcing "reasonable behavior" in the Internet, the
rapid growth and diversification of the Internet community argues
against relying on this mechanism in the long term. For some sets
of applications, e.g., ones involving fiduciary responsibility, it
seems likely that more formal enforcement mechanisms are
required.)
7. Operate across national boundaries
The Internet is international in scope. Any certification
system that is intended to serve the entire Internet population
must, therefore, operate on an international basis. This implies,
for example, that more than just Latin character sets must be
supported in certified names. Semantics associated with the
structure of names should not be encoded using words from any
individual language, e.g., English. Thus, for example, an entity
that is a PCA should be identifiable through structural or
syntactic rules, but should not require interpreting the name in a
specific language.
8. Provide a certification structure that is easy for end users
to understand
If a user has to evaluate a certificate (or a certification
path) directly, the semantics of the evaluation must be clear to
even naive users, in keeping with the growth in the Internet user
population. The amount of data displayed for a user must be kept
to a minimum, so as to avoid overloading the user with information
that will likely be ignored.
The semantics of a certification path should be simple and
explicit. When a certificate is used for identification, a user
should be able to understand the credibility of the identity
assertion, e.g., who vouches for the accuracy of the identity
binding and what measures do they and the certified entity take to
ensure its continuing accuracy. A certificate used for
authorization should express the semantics of the authorization
clearly and unambiguously. Even when software automatically
evaluates certificates to make access control decisions, these
decisions will be based on previous user input and so the
semantics of evaluation must be simple.
9. Support distributed name assignment
The names that appear in certificates should be capable of
being generated in a distributed fashion, i.e., with minimal
centralized coordination. Some centralized coordination is essential
to avoid name conflicts, but hierarchic structuring of names minimizes
the need for such coordination. This is easily accomplished using
schemes like the DNS as well as distinguished names.
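A small sketch may help show why hierarchic structuring keeps
coordination local. The class and method names are hypothetical and the
example name components are invented; the only assumption carried over
from the text is that each authority owns one node of the hierarchy and
assigns names directly beneath it.

```python
class NamingAuthority:
    """Hypothetical authority that owns one node in a name hierarchy."""

    def __init__(self, name):
        self.name = tuple(name)   # e.g. ("C=US", "O=BBN")
        self.assigned = set()     # components already issued beneath this node

    def assign(self, component):
        """Issue a subordinate name; only local uniqueness has to be checked."""
        if component in self.assigned:
            raise ValueError(f"{component!r} already assigned under {self.name}")
        self.assigned.add(component)
        return self.name + (component,)
```

Two authorities with different names can each assign the same final
component without consulting one another, and the resulting full names
still differ because their prefixes differ. The only coordination
required is the earlier step that gave each authority its own unique
prefix.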
10. Minimize management overhead
A system that requires a high degree of manual oversight for
continued operation and for growth will be expensive to operate and
will create opportunities for errors that result in vulnerabilities.
Many security schemes fail to be widely deployed because of management
overhead, even though they work well in small scale situations. One
aspect of a certification system that could easily become a management
burden is cross-certification, unless it is restricted to a small
number of entities. (This concern is independent of others associated
with cross-certification, e.g., with regard to crossing policy
boundaries, name subordination, etc.) Each CA (in a generic sense,
including PCAs) in a system has to deal with the entities it
certifies, and with any entities from which it seeks
certification. The former category is under the control of the CA,
i.e., it decides how many users or organizations it will certify, and
is a natural function of the scope of the CA. However, in order for
the users certified by the CA to have connectivity to a larger
context, the CA must seek certification by other CAs. Each other CA
with whom the CA must enter into some sort of certification (or
cross-certification) agreement, whether formal or informal, increases
the management overhead for the CAs involved. Hierarchic systems seek
to leverage that management overhead by minimizing the number of other
CAs with whom certification arrangements must be executed.
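For a rough sense of scale, one can compare two extreme arrangements.
The comparison below is my own back-of-the-envelope arithmetic under
stated assumptions, not a measurement: in the first case every pair of
CAs enters into its own cross-certification agreement, and in the
second each CA other than the root has exactly one agreement, with its
superior. The function names are hypothetical.

```python
def pairwise_agreements(n):
    """Agreements needed if every pair of n CAs cross-certifies directly."""
    return n * (n - 1) // 2


def hierarchic_agreements(n):
    """Agreements needed if each of n CAs, except the root, has one superior."""
    return max(n - 1, 0)
```

For 1000 CAs the first function gives 499500 agreements and the second
gives 999. Real deployments would fall somewhere between these
extremes, but the gap suggests why unrestricted cross-certification
could become a management burden.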