Re(2): Comments on the Cryptographic Message Syntax

Russ

Some comments on the Cryptographic Message Syntax spec:

1. ASN.1 Canonical Encoding Rules should be allowed

                                                           Single-
   pass operation has one significant drawback; it is difficult to
   perform encode operations using the Distinguished Encoding Rules
   (DER) encoding in a single pass since the lengths of the various
   components may not be known in advance. Since DER encoding is
   required by the signed-data content type, an extra pass may be
   necessary when a content type other than data is encapsulated.


Why does the signed-data content type require DER? Why doesn't it allow
the choice of CER or DER, with the digest algorithm-identifier indicating
which? For example, SHA-1 on the CER encoding would have a different 
algorithm-identifier to SHA-1 on the DER encoding. [For anyone unfamiliar
with CER, it is defined in the 1994 version of ASN.1, and is identical to
DER except that CER uses indefinite length encoding (which permits 
single-pass encoding) and constructed encoding of strings.]


Does CER prohibit definite length encoding?  If not, there would be more
than one way to encode the same structure.


Yes. CER requires indefinite length for constructed encodings, and prohibits 
definite length for constructed encodings. Like DER, CER gives a unique 
encoding of any structure.
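
As an illustration only (not taken from the spec), the difference in length
forms can be sketched in Python. The helper names der_len, der_constructed
and cer_constructed are hypothetical, and the sketch covers just the length
rules for a constructed type. It shows why DER needs the contents length
before the header can be written, while CER can emit the header first and
close with end-of-contents octets.

```python
def der_len(n: int) -> bytes:
    """Definite-length octets: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_constructed(tag: int, contents: bytes) -> bytes:
    # DER: the contents length must be known before the header is written.
    return bytes([tag]) + der_len(len(contents)) + contents


def cer_constructed(tag: int, contents: bytes) -> bytes:
    # CER: indefinite-length marker 0x80, the contents, then the
    # end-of-contents octets 00 00. No length is needed up front.
    return bytes([tag, 0x80]) + contents + b"\x00\x00"


# INTEGER 1 is primitive, so it uses definite length under both rule sets.
integer_one = bytes([0x02, 0x01, 0x01])

der_seq = der_constructed(0x30, integer_one)
cer_seq = cer_constructed(0x30, integer_one)

print(der_seq.hex())  # 3003020101
print(cer_seq.hex())  # 30800201010000
```

Each rule set permits exactly one of the two forms for a constructed type,
which is why both give a unique encoding.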

2. There is an ambiguity about which octets are input to hash and encryption algorithms

   The message digest calculation process computes a message digest on
   either the content being signed or the content together with the
   signer's authenticated attributes. In either case, the initial input
   to the message digest calculation process is the "value" of the
   content being signed. Specifically, the initial input is the content
   octets of the DER encoding of the content field of the ContentInfo
   value to which the signing process is applied. Only the contents
   octets of the DER encoding of that field are input to the message
   digest algorithm, not the identifier octets or the length octets.

   When the content being signed has content type data and the
   authenticatedAttributes field is absent, then just the value of the
   data (e.g., the contents of a file) is input to the message digest
   calculation. This has the advantage that the length of the content
   being signed need not be known in advance of the encryption process.

   Although the identifier octets and the length octets are not included
   in the message digest calculation, they are still protected by other
   means. The length octets are protected by the nature of the message
   digest algorithm since it is computationally infeasible to find any
   two distinct messages of any length that have the same message
   digest.


The first of the three paragraphs above says something different about 
which octets are input into the hash function than the implication of 
the second and third paragraphs. Consider the encoding of the content 
field of the ContentInfo in the following examples:

For the data content type:

Identifier      Length  Contents
[0]             nnn     
                       Identifier      Length  Contents
                       OctetString     nnn     xxxxxxxxx


For the signed-data content type:

Identifier      Length  Contents
[0]             nnn     
                       Identifier      Length  Contents
                       Sequence        nnn     
                                               Identifier      Length  Contents
                                               Integer         1       1


The first paragraph above says that it is the contents octets of the 
content field which are input to the hash function - i.e. octets starting
with the identifier for OctetString in the case of the data content type
- but the second paragraph implies that it is just the contents octets 
of this OctetString. Should the first paragraph be changed to replace 
"content octets of the DER encoding of the content field" by "DER 
encoding of the content octets of the type within the content field"? 
That is, does the input to the hash start with the identifier for 
Integer in the case of signed-data?


I read the first paragraph three times.  I do not see the issue.  Maybe you
can try explaining it again....


Perhaps I'll try a concrete example to illustrate. Consider the data string 
"Hello" whose characters are represented by the octets 48656C6C6F. This is 
encoded into an ASN.1 Octet String as 040548656C6C6F. If this is a Data Content 
Type, it is encoded in the content field of a ContentInfo as A007040548656C6C6F.

The first paragraph says "the input [to the hash function] is the content 
octets of the DER encoding of the content field of the ContentInfo" - that 
means strip off the identifier and length octets from the encoding of the 
content field to give 040548656C6C6F as the input to the hash function. 
However, the second paragraph extracted above suggests that the input to the 
hash function is just 48656C6C6F.
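
The "Hello" example can be reproduced with a short Python sketch. The
variable names are illustrative, the encoding is built by hand (short-form
lengths only), and SHA-1 is used merely because it is the digest mentioned
earlier in this thread. The point is that the two readings feed different
octets to the hash and therefore yield different digests.

```python
import hashlib

data = b"Hello"                                       # 48656C6C6F

# OCTET STRING: identifier 0x04, short-form length, contents.
octet_string = bytes([0x04, len(data)]) + data        # 040548656C6C6F

# content field of ContentInfo: [0] EXPLICIT, identifier 0xA0.
content_field = bytes([0xA0, len(octet_string)]) + octet_string

# Reading 1: contents octets of the content field
# (strip only the [0] identifier and length octets).
reading_one = content_field[2:]

# Reading 2: just the value of the data
# (strip the OCTET STRING identifier and length octets as well).
reading_two = octet_string[2:]

print(content_field.hex().upper())  # A007040548656C6C6F
print(reading_one.hex().upper())    # 040548656C6C6F
print(reading_two.hex().upper())    # 48656C6C6F

# The two candidate inputs give different digests.
print(hashlib.sha1(reading_one).hexdigest())
print(hashlib.sha1(reading_two).hexdigest())
```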

3. User Keying Materials

Surely these should be included within the parameters component of the
algorithm-identifier for algorithms which require such material.


No.  The following structure is present for each recipient:

  RecipientInfo ::= SEQUENCE {
    version Version,
    rid RecipientIdentifier,
    keyEncryptionAlgorithm KeyEncryptionAlgorithmIdentifier,
    encryptedKey EncryptedKey }

If the UKM is carried in the KeyEncryptionAlgorithmIdentifier, then it is
carried several times, once for each recipient.  UKMs can be large (128
bytes), so we want to carry it once for all of the recipients that use that
form of key management.


You are adding additional overhead when there is only one recipient using that 
form of key management (as the algorithm-identifier is carried twice in your 
encoding) and some processing complexity in all cases (the receiver needs a 
local table to indicate for which algorithms to search for UKM) in order to 
achieve a small saving in message size when multiple recipients use the same 
form of key management.

You are also assuming that all algorithms that may use UKM will require the 
same UKM data for all recipients. This seems an unsafe assumption.
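
A back-of-envelope sketch of the size trade-off being argued here, in
Python. UKM_LEN is the 128-byte figure quoted above; ALG_ID_LEN is a purely
hypothetical size for one encoded algorithm-identifier, and ASN.1 header
octets are ignored. It assumes every recipient uses the same form of key
management and the same UKM, which is exactly the assumption questioned
above.

```python
UKM_LEN = 128     # bytes, the "large" UKM size quoted in this thread
ALG_ID_LEN = 11   # bytes, hypothetical size of one algorithm-identifier


def ukm_in_each_recipient(n: int) -> int:
    """UKM carried in the algorithm parameters of every RecipientInfo."""
    return n * UKM_LEN


def ukm_carried_once(n: int) -> int:
    """UKM carried once, tagged with a second copy of the algorithm-identifier."""
    return UKM_LEN + ALG_ID_LEN


for n in (1, 2, 5):
    print(n, ukm_in_each_recipient(n), ukm_carried_once(n))
```

Under these assumptions the shared form costs ALG_ID_LEN extra octets for a
single recipient and saves (n - 1) * UKM_LEN - ALG_ID_LEN octets for n
recipients.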

4. RecipientKeyIdentifier

RecipientKeyIdentifier is not an adequate identification of a recipient, 
since subjectKeyIdentifier is only unique within the scope of an 
individual recipient, and it is not accompanied by any other 
identification of the recipient. I suggest that the more general 
CertificateAssertion syntax from X.509 should be used instead of the 
restrictive subjectKeyIdentifier:

    RecipientKeyIdentifier ::= SEQUENCE {
         recipient-name                  Name,
         recipient-certificate-selector  CertificateAssertion}


Within the set of recipients, the subject key identifier should narrow the
"hits" significantly.  In the unlikely event that there is more than one
recipient with the same subject key identifier, then the recipient can try
each "hit."


Why do you consider more than one recipient with the same subject key 
identifier to be an unlikely event? The subject key identifier might simply be 
an incremental counter of the certificates issued by a CA to each user, or the 
date on which the certificate was issued - in these cases several recipients 
having the same value would seem quite likely. Why not allow the recipient's 
name to be present?
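
To make the concern concrete, here is a small Python sketch. The recipient
names and the counter policy are invented for illustration: the CA is
assumed to number each user's certificates 1, 2, 3, ... and to use that
number as the subject key identifier.

```python
from collections import defaultdict

# (recipient name, subject key identifier) under the hypothetical policy
# that the identifier is a per-user certificate counter.
certs = [
    ("alice", 1),
    ("alice", 2),
    ("bob", 1),
    ("carol", 1),
]

by_ski = defaultdict(list)
for name, ski in certs:
    by_ski[ski].append(name)

# Matching on the subject key identifier alone: every user's first
# certificate is a "hit".
hits_ski_only = by_ski[1]

# Matching on name plus subject key identifier narrows it to one.
hits_with_name = [n for n in by_ski[1] if n == "bob"]

print(hits_ski_only)   # ['alice', 'bob', 'carol']
print(hits_with_name)  # ['bob']
```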

5. Originator Certificate selection

Some key agreement algorithms use different originator keys for different
recipients. It is therefore necessary to add an originator-certificate-
selector to RecipientInfo:

    RecipientInfo ::= SEQUENCE {
         version Version,
         rid RecipientIdentifier,
         keyEncryptionAlgorithm KeyEncryptionAlgorithmIdentifier,
         encryptedKey EncryptedKey,
         originator-certificate-selector CertificateAssertion OPTIONAL }


Okay, but I think that the location within the structure might be
incorrect.  It needs to come sooner to accommodate single-pass processing.

OK


Please send me the ASN.1 for CertificateAssertion.


Attached


Russ


Jim


CertificateAssertion ::= SEQUENCE {
        serialNumber            [0] CertificateSerialNumber     OPTIONAL,
        issuer                  [1] Name                        OPTIONAL,
        subjectKeyIdentifier    [2] SubjectKeyIdentifier        OPTIONAL,
        authorityKeyIdentifier  [3] AuthorityKeyIdentifier      OPTIONAL,
        certificateValid        [4] UTCTime                     OPTIONAL,
        privateKeyValid         [5] GeneralizedTime             OPTIONAL,
        subjectPublicKeyAlgID   [6] OBJECT IDENTIFIER           OPTIONAL,
        keyUsage                [7] KeyUsage                    OPTIONAL,
        subjectAltName          [8] AltNameType                 OPTIONAL,
        policy                  [9] CertPolicySet               OPTIONAL,
        pathToName              [10] Name                       OPTIONAL }

AltNameType ::= CHOICE { 
        builtinNameForm ENUMERATED {
                                rfc822Name                      (1),
                                dNSName                         (2),
                                x400Address                     (3),
                                directoryName                   (4),
                                ediPartyName                    (5),
                                uniformResourceIdentifier       (6),
                                iPAddress                       (7),
                                registeredId                    (8) },
        otherNameForm   OBJECT IDENTIFIER }