Another thread pointed out the difference between keyIDs and key
fingerprints as ways of referring to keys.
Currently, keyIDs and key fingerprints are defined somewhat differently
for RSA keys than for newer keys. The RSA definitions are a legacy for
backwards compatibility, while the new methods have improved properties.
With RSA keys, the keyID is the low order 64 bits of the RSA modulus.
It turns out that it is trivial to generate keys which match a target
key's low 64 bits, hence a keyID is not at all a unique handle for
referring to a key. (This is sometimes referred to as the "dead beef"
attack, after someone demonstrated that they could generate a key with
a keyID whose low 32 bits were 0xdeadbeef.)
In most cases, keyIDs are not used in a context where this forgeability is
a problem. The most common use of a keyID is to help find a signature or
decryption key. It can be thought of as a "hint" to point the software
to the right key. However only one key will actually work to decrypt
or signature-verify the message. If the hint turns out to be ambiguous,
it can still be useful. (We anticipate adding other kinds of key-finding
hints in the future, perhaps URLs or key server identifiers.)
The RSA key fingerprint is intended to be more difficult to forge.
It is an MD5 hash on the numeric data of the RSA modulus and exponent.
However it turns out that this is not completely immune to forgery,
as the hash value is not sensitive to where the modulus ends and the
exponent begins. You could have different keys with some modulus data
moved to the exponent, and they would have the same RSA fingerprint.
These problems were solved in the new keyID and fingerprint algorithms
introduced in PGP 5.0. They apply to the non-RSA keys, presently the
DSS and ElGamal keys.
The fingerprint is still a hash over the key data, but this time the
entire public key packet is hashed. This includes length fields which
make it effectively impossible to have two keys which hash to the same
fingerprint. Furthermore, the fingerprint hash function is changed
from MD5, which is now suspected to be weak, to SHA-1, a newer, larger,
and more secure hash. The new fingerprints are therefore 20 bytes long
compared to 16 bytes for the old ones.
KeyIDs for the new keys are defined to be the rightmost 64 bits of the
fingerprint. The only way to generate two new keys with the same keyID
would be to keep trying at random until two of them happened to match,
a much more difficult problem.
Phil Zimmermann has suggested that we generalize the keyID concept
to allow it to be variable length. Rather than fixing it as a 64 bit
substring of the fingerprint, we would make it a variable sized substring.
(Phil would also prefer that it be a left substring rather than a right
substring, for readability.)
This allows the software to be flexible about how specific it needs
to be when it refers to a key. For most purposes, the 64 bit keyID
is about right. To be completely specific, the keyID could be expanded
to be the full 160 bits of the fingerprint.
An interesting case arises where you might want the keyID to be smaller.
In some contexts it is a privacy leak that an encrypted message contains
the keyID of the key which can decrypt it. In that case you might want
the keyID to be just a few bits long.
(You'd probably want it to be a few bits rather than zero bits, because
the recipient frequently has more than one decryption key. Leaking a
few bits of keyID doesn't significantly hurt privacy, since there would
still be a very large number of keys which would match. But it will
simplify the decryption process, since it will generally still allow a
single decryption key to be tried.)
Doing this would allow us to merge the two concepts of keyID and key
fingerprint into one variable-sized keyID. The actual number of bits
shown would then depend on the context in which it was used.
It would be interesting to hear if there are other cases where being
able to vary the size and specificity of the keyID would be appropriate.
Are there any sizes besides small (about 3 bits, perhaps), medium (64
bits), and large (160 bits) which seem useful?
Hal Finney
hal(_at_)pgp(_dot_)com
hal(_at_)rain(_dot_)org