This message is for issue 2 in using a public key hash - namely, what
exactly do we hash?
Philip Zimmermann writes:
I think using a hash of the key as the key selector is a fine idea.
PGP has a key selector, called the keyID, which is currently the
low-order 64 bits of the modulus. PGP also has a key fingerprint (the
MD5 of the key) that people use to verify the keys over the phone. For
over a year now, I have been planning to consolidate them, with a keyid
as the hash of the key.
Are there any optional or variable fields within the public key
encoding you use? For example, here's the definition of the
SubjectPublicKeyInfo which is used in MIME/PEM:
SubjectPublicKeyInfo ::= SEQUENCE{
    algorithm         AlgorithmIdentifier,
    subjectPublicKey  BIT STRING}
For an RSA key, the content of the subjectPublicKey is:
RSAPublicKey ::= SEQUENCE{
    modulus         INTEGER,
    publicExponent  INTEGER }
AlgorithmIdentifier ::= SEQUENCE{
    algorithm   OBJECT IDENTIFIER,
    parameters  ANY DEFINED BY algorithm OPTIONAL}
It turns out that there are at least 2 object identifiers which are
allowed for RSA (namely, "rsa" and "rsaEncryption"). And DSA has an
even larger number of object identifiers which are in use to denote a
DSA public key.
The problem of course is that a hash of a public key will be different
for different object identifiers. Even when an earlier MIME/PEM draft
specified a public key hash, this issue was not addressed. I
discovered the problem in writing RIPEM: when I went to encrypt a
message, all I sent to the encryption subroutine was the public keys of
all the recipients, passed in the RSAREF C structure which holds only
the RSA modulus and exponent. I thought I would just let the
subroutine re-encode the information as a SubjectPublicKeyInfo, but
realized I didn't know which object identifier was originally used.
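To make the problem concrete, here is a rough Python sketch that DER-encodes the same modulus and exponent as a SubjectPublicKeyInfo under two different object identifiers and hashes each result. Several things here are my assumptions and not part of any spec text quoted above: the dotted OID values (2.5.8.1.1 for "rsa" and 1.2.840.113549.1.1.1 for "rsaEncryption"), the use of NULL parameters for both (a simplification), SHA-256 as a stand-in hash, and the toy key values. The helper names are made up for illustration.

```python
import hashlib

def der_len(n):
    # DER length octets: short form below 128, long form otherwise.
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def tlv(tag, content):
    # Tag, length, value.
    return bytes([tag]) + der_len(len(content)) + content

def der_integer(value):
    # Non-negative INTEGER; a leading 0x00 is added when the top bit is set.
    return tlv(0x02, value.to_bytes(value.bit_length() // 8 + 1, "big"))

def der_oid(dotted):
    # OBJECT IDENTIFIER: first two arcs share one octet, the rest are base 128.
    arcs = [int(a) for a in dotted.split(".")]
    body = bytearray([40 * arcs[0] + arcs[1]])
    for arc in arcs[2:]:
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))
            arc >>= 7
        body.extend(reversed(chunk))
    return tlv(0x06, bytes(body))

def rsa_public_key(modulus, exponent):
    # RSAPublicKey ::= SEQUENCE { modulus, publicExponent }
    return tlv(0x30, der_integer(modulus) + der_integer(exponent))

def subject_public_key_info(oid, modulus, exponent):
    # AlgorithmIdentifier with NULL parameters (assumed for both OIDs).
    algorithm = tlv(0x30, der_oid(oid) + b"\x05\x00")
    # BIT STRING wrapping the RSAPublicKey, with zero unused bits.
    key_bits = tlv(0x03, b"\x00" + rsa_public_key(modulus, exponent))
    return tlv(0x30, algorithm + key_bits)

# Assumed OID values; check them against your own references.
OID_RSA = "2.5.8.1.1"
OID_RSA_ENCRYPTION = "1.2.840.113549.1.1.1"

# Toy values for illustration only, not a usable key.
n, e = 3233, 17

for name, oid in (("rsa", OID_RSA), ("rsaEncryption", OID_RSA_ENCRYPTION)):
    encoded = subject_public_key_info(oid, n, e)
    print(name, hashlib.sha256(encoded).hexdigest())
```

The two printed digests differ even though the modulus and exponent are identical, which is exactly the ambiguity a routine faces when it is handed only the modulus and exponent.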
This is a special case of a more general problem: for some systems,
the public key encoding may allow optional fields, such as
pre-computed values which help speed up the computation. Which is
the "canonical" encoding that should be hashed? I can see a few
possible solutions:
1. Specify a canonical form which the public key must be encoded in
when it is hashed. For example, "Encode in DER as a
SubjectPublicKeyInfo using the rsa (as opposed to rsaEncryption)
object identifier".
2. Specify that you should always use the encoding that the key owner
originally transmitted. I would strongly recommend against this
approach since the key may pass through various forms before it gets to
you. For example, a smart MIME/PEM application may be able to work
directly from a PGP key file. What then? What if you use two
different such MIME/PEM applications which try to present your public
key with different object identifiers?
3. Since the practical problem we face at present is multiple object
identifiers, just hash the key value without the algorithm identifier.
We are using the public key hash to discriminate among possible key
pairs, and we don't need the algorithm identifier information for
that. In the case of an RSA key, this means don't hash the
SubjectPublicKeyInfo, hash the RSAPublicKey. There's only one DER
encoding for this right now so it would solve our problem.
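As a minimal sketch of solution 3, the following Python hashes only the DER encoding of the RSAPublicKey, so the result depends on nothing but the modulus and exponent. The function name rsa_public_key_hash, the choice of SHA-256, and the toy key values are my own assumptions for illustration; the proposal above does not fix a hash function.

```python
import hashlib

def der_len(n):
    # DER length octets: short form below 128, long form otherwise.
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def tlv(tag, content):
    # Tag, length, value.
    return bytes([tag]) + der_len(len(content)) + content

def der_integer(value):
    # Non-negative INTEGER; a leading 0x00 is added when the top bit is set.
    return tlv(0x02, value.to_bytes(value.bit_length() // 8 + 1, "big"))

def rsa_public_key_hash(modulus, exponent):
    # Hypothetical helper: hash RSAPublicKey ::= SEQUENCE { modulus, publicExponent }
    # with no AlgorithmIdentifier, so no object identifier enters the hash.
    encoded = tlv(0x30, der_integer(modulus) + der_integer(exponent))
    return hashlib.sha256(encoded).hexdigest()

# Toy values for illustration only, not a usable key.
print(rsa_public_key_hash(3233, 17))
```

Because the function takes only the modulus and exponent, a routine that receives nothing more than those two values can still recompute the same hash as the sender.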
I like solution 3. (And I admit that it doesn't deal with non-RSA keys
which may have optional fields, as I discussed above.) My preference
after that would be 1, and then 2 as a last recourse.
- Jeff