Russ
Some comments on the Cryptographic Message Syntax spec:
1. ASN.1 Canonical Encoding Rules should be allowed
Single-pass operation has one significant drawback; it is difficult to
perform encode operations using the Distinguished Encoding Rules
(DER) encoding in a single pass since the lengths of the various
components may not be known in advance. Since DER encoding is
required by the signed-data content type, an extra pass may be
necessary when a content type other than data is encapsulated.
Why does the signed-data content type require DER? Why doesn't it allow
the choice of CER or DER, with the digest algorithm-identifier indicating
which? For example, SHA-1 on the CER encoding would have a different
algorithm-identifier to SHA-1 on the DER encoding. [For anyone unfamiliar
with CER, it is defined in the 1994 version of ASN.1, and is identical to
DER except that CER uses indefinite length encoding (which permits
single-pass encoding) and constructed encoding of strings.]
Does CER prohibit definite length encoding? If not, there would be more
than one way to encode the same structure.
Yes. CER requires indefinite length for constructed encodings, and prohibits
definite length for constructed encodings. Like DER, CER gives a unique
encoding of any structure.
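
To make the difference concrete, here is a minimal Python sketch that encodes a SEQUENCE containing the INTEGER 1 in both styles. The helper names (tlv_definite, tlv_indefinite) are illustrative assumptions, not part of any spec or library, and the definite-length helper only handles contents shorter than 128 octets.

```python
def tlv_definite(tag: int, contents: bytes) -> bytes:
    """Definite-length (DER-style) encoding; short-form lengths only."""
    if len(contents) >= 128:
        raise ValueError("sketch handles short-form lengths only")
    # The length octet cannot be written until all contents are known.
    return bytes([tag, len(contents)]) + contents


def tlv_indefinite(tag: int, contents: bytes) -> bytes:
    """Indefinite-length (CER-style) encoding of a constructed type."""
    # 0x80 announces indefinite length; two zero octets mark end-of-contents.
    return bytes([tag, 0x80]) + contents + b"\x00\x00"


SEQUENCE = 0x30  # universal, constructed, tag number 16
INTEGER = 0x02   # universal, primitive, tag number 2

# Primitive encodings use definite length under both rule sets.
integer_one = tlv_definite(INTEGER, b"\x01")

der = tlv_definite(SEQUENCE, integer_one)
cer = tlv_indefinite(SEQUENCE, integer_one)

print(der.hex())  # 3003020101
print(cer.hex())  # 30800201010000
```

The DER form needs the length of the whole SEQUENCE before its second octet can be emitted, whereas the CER form can emit the header immediately and close with the end-of-contents octets, which is what makes single-pass encoding possible.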
2. There is an ambiguity about which octets are input to hash and
encryption algorithms
The message digest calculation process computes a message digest on
either the content being signed or the content together with the
signer's authenticated attributes. In either case, the initial input
to the message digest calculation process is the "value" of the
content being signed. Specifically, the initial input is the content
octets of the DER encoding of the content field of the ContentInfo
value to which the signing process is applied. Only the contents
octets of the DER encoding of that field are input to the message
digest algorithm, not the identifier octets or the length octets.
When the content being signed has content type data and the
authenticatedAttributes field is absent, then just the value of the
data (e.g., the contents of a file) is input to the message digest
calculation. This has the advantage that the length of the content
being signed need not be known in advance of the encryption process.
Although the identifier octets and the length octets are not included
in the message digest calculation, they are still protected by other
means. The length octets are protected by the nature of the message
digest algorithm since it is computationally infeasible to find any
two distinct messages of any length that have the same message
digest.
The first of the three paragraphs above says something different about
which octets are input into the hash function than the implication of
the second and third paragraphs. Consider the encoding of the content
field of the ContentInfo in the following examples:
For the data content type:
Identifier   Length   Contents
[0]          nnn
                      Identifier    Length   Contents
                      OctetString   nnn      xxxxxxxxx
For the signed-data content type:
Identifier   Length   Contents
[0]          nnn
                      Identifier   Length   Contents
                      Sequence     nnn
                                            Identifier   Length   Contents
                                            Integer      1        1
The first paragraph above says that it is the contents octets of the
content field which is input to the hash function - i.e. octets starting
with the identifier for OctetString in the case of the data content type
- but the second paragraph implies that it is just the contents octets
of this OctetString. Should the first paragraph be changed to replace
"content octets of the DER encoding of the content field" by "DER
encoding of the content octets of the type within the content field"?
That is, does the input to the hash start with the identifier for
Integer in the case of signed-data?
I read the first paragraph three times. I do not see the issue. Maybe you
can try explaining it again....
Perhaps I'll try a concrete example to illustrate. Consider the data string
"Hello" whose characters are represented by the octets 48656C6C6F. This is
encoded into an ASN.1 Octet String as 040548656C6C6F. If this is a Data Content
Type, it is encoded in the content field of a ContentInfo as A007040548656C6C6F.
The first paragraph says "the input [to the hash function] is the content
octets of the DER encoding of the content field of the ContentInfo" - that
means strip off the identifier and length octets from the encoding of the
content field to give 040548656C6C6F as the input to the hash function.
However, the second paragraph extracted above suggests that the input to the
hash function is just 48656C6C6F.
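
The two readings can be checked mechanically. The following Python sketch rebuilds the "Hello" encoding from the example above and computes SHA-1 over each candidate input. The variable names are illustrative only, and the hand-rolled encoding assumes short-form lengths.

```python
import hashlib

data = b"Hello"

# OCTET STRING: tag 04, length 05, then the five data octets.
octet_string = bytes([0x04, len(data)]) + data

# Content field of ContentInfo: [0] EXPLICIT wrapper, tag A0, length 07.
content_field = bytes([0xA0, len(octet_string)]) + octet_string

# Reading A: contents octets of the content field
# (strip only the [0] identifier and length octets).
input_a = content_field[2:]

# Reading B: just the value of the data
# (strip the OCTET STRING identifier and length octets as well).
input_b = octet_string[2:]

print(content_field.hex().upper())  # A007040548656C6C6F
print(input_a.hex().upper())        # 040548656C6C6F
print(input_b.hex().upper())        # 48656C6C6F

digest_a = hashlib.sha1(input_a).hexdigest()
digest_b = hashlib.sha1(input_b).hexdigest()
print(digest_a == digest_b)         # False
```

Two implementations that each follow a different reading would therefore produce different digests for the same content and fail to verify each other's signatures.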
3. User Keying Materials
Surely these should be included within the parameters component of the
algorithm-identifier for algorithms which require such material.
No. The following structure is present for each recipient:
RecipientInfo ::= SEQUENCE {
  version                 Version,
  rid                     RecipientIdentifier,
  keyEncryptionAlgorithm  KeyEncryptionAlgorithmIdentifier,
  encryptedKey            EncryptedKey }
If the UKM is carried in the KeyEncryptionAlgorithmIdentifier, then it is
carried several times, once for each recipient. UKMs can be large (128
bytes), so we want to carry it once for all of the recipients that use that
form of key management.
You are adding additional overhead when there is only one recipient using that
form of key management (as the algorithm-identifier is carried twice in your
encoding) and some processing complexity in all cases (the receiver needs a
local table to indicate for which algorithms to search for UKM) in order to
achieve a small saving in message size when multiple recipients use the same
form of key management.
You are also assuming that all algorithms that may use UKM will require the
same UKM data for all recipients. This seems an unsafe assumption.
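
The size trade-off can be put in rough numbers. The sketch below is only an estimate: the 128-octet UKM size comes from the discussion above, while the algorithm-identifier size (ALG_ID_LEN) is a purely hypothetical figure chosen for illustration, and all ASN.1 wrapper overhead is ignored.

```python
UKM_LEN = 128     # octets; the "large" UKM size cited above
ALG_ID_LEN = 11   # octets; HYPOTHETICAL size of an encoded algorithm-identifier


def ukm_per_recipient(n: int) -> int:
    """UKM carried inside each recipient's algorithm-identifier parameters."""
    return n * UKM_LEN


def ukm_shared(n: int) -> int:
    """UKM carried once, at the cost of repeating the algorithm-identifier."""
    return UKM_LEN + ALG_ID_LEN


for n in (1, 2, 5):
    print(n, ukm_per_recipient(n), ukm_shared(n))
```

Under these assumptions the shared form costs more for a single recipient and less as soon as two or more recipients use the same form of key management. The processing cost of the receiver's lookup table is not captured by this estimate.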
4. RecipientKeyIdentifier
RecipientKeyIdentifier is not an adequate identification of a recipient,
since subjectKeyIdentifier is only unique within the scope of an
individual recipient, and it is not accompanied by any other
identification of the recipient. I suggest that the more general
CertificateAssertion syntax from X.509 should be used instead of the
restrictive subjectKeyIdentifier:
RecipientKeyIdentifier ::= SEQUENCE {
  recipient-name                  Name,
  recipient-certificate-selector  CertificateAssertion }
Within the set of recipients, the subject key identifier should narrow the
"hits" significantly. In the unlikely event that there is more than one
recipient with the same subject key identifier, then the recipient can try
each "hit."
Why do you consider more than one recipient with the same subject key
identifier to be an unlikely event? The subject key identifier might simply be
an incremental counter of the certificates issued by a CA to each user, or the
date on which the certificate was issued - in these cases several recipients
having the same value would seem quite likely. Why not allow the recipient's
name to be present?
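
A small Python sketch of the matching step shows the effect. Everything here is hypothetical: RecipientEntry and find_candidates are illustrative names, and the key identifiers imitate a CA that numbers each user's certificates from 1.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecipientEntry:
    subject_key_id: bytes
    name: Optional[str] = None  # hypothetical optional recipient name


def find_candidates(entries: List[RecipientEntry],
                    my_key_id: bytes,
                    my_name: Optional[str] = None) -> List[RecipientEntry]:
    """Return the entries a recipient would have to try decrypting."""
    hits = [e for e in entries if e.subject_key_id == my_key_id]
    if my_name is not None:
        named = [e for e in hits if e.name == my_name]
        if named:
            return named
    return hits


# Two users each hold their "first" certificate, so the identifiers collide.
entries = [
    RecipientEntry(b"\x01", "alice"),
    RecipientEntry(b"\x01", "bob"),
    RecipientEntry(b"\x02", "carol"),
]

by_key_only = find_candidates(entries, b"\x01")
by_key_and_name = find_candidates(entries, b"\x01", "bob")

print(len(by_key_only))      # 2
print(len(by_key_and_name))  # 1
```

With the key identifier alone the recipient has to attempt every colliding entry. With a name present the match is unambiguous.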
5. Originator Certificate selection
Some key agreement algorithms use different originator keys for different
recipients. It is therefore necessary to add an originator-certificate-
selector to RecipientInfo:
RecipientInfo ::= SEQUENCE {
  version                          Version,
  rid                              RecipientIdentifier,
  keyEncryptionAlgorithm           KeyEncryptionAlgorithmIdentifier,
  encryptedKey                     EncryptedKey,
  originator-certificate-selector  CertificateAssertion OPTIONAL }
Okay, but I think that the location within the structure might be
incorrect. It needs to come sooner to accommodate single-pass processing.
OK
Please send me the ASN.1 for CertificateAssertion.
Attached
Russ
Jim
CertificateAssertion ::= SEQUENCE {
  serialNumber            [0]  CertificateSerialNumber OPTIONAL,
  issuer                  [1]  Name OPTIONAL,
  subjectKeyIdentifier    [2]  SubjectKeyIdentifier OPTIONAL,
  authorityKeyIdentifier  [3]  AuthorityKeyIdentifier OPTIONAL,
  certificateValid        [4]  UTCTime OPTIONAL,
  privateKeyValid         [5]  GeneralizedTime OPTIONAL,
  subjectPublicKeyAlgID   [6]  OBJECT IDENTIFIER OPTIONAL,
  keyUsage                [7]  KeyUsage OPTIONAL,
  subjectAltName          [8]  AltNameType OPTIONAL,
  policy                  [9]  CertPolicySet OPTIONAL,
  pathToName              [10] Name OPTIONAL }

AltNameType ::= CHOICE {
  builtinNameForm  ENUMERATED {
    rfc822Name                 (1),
    dNSName                    (2),
    x400Address                (3),
    directoryName              (4),
    ediPartyName               (5),
    uniformResourceIdentifier  (6),
    iPAddress                  (7),
    registeredId               (8) },
  otherNameForm    OBJECT IDENTIFIER }