Re: [smime] [pkix] Support for email address internationalization in RFC5280 certificates

On Feb 7, 2016, at 12:15 PM, Wei Chuang <weihaw(_at_)google(_dot_)com> wrote:



On Fri, Feb 5, 2016 at 4:46 PM, Peter Bowen <pzbowen(_at_)gmail(_dot_)com> wrote:
On Thu, Feb 4, 2016 at 11:05 AM, Wei Chuang <weihaw(_at_)google(_dot_)com> wrote:

PKIX community,

We've observed a limitation in specifying internationalized email addresses:
the local part is restricted to essentially ASCII.  Subject or issuer email
addresses should be stored as a subject-alt-name or issuer-alt-name
rfc822Name, which is encoded as IA5String.  This is despite email
internationalization, specified for email headers in RFC6532, allowing
Unicode in To, From, and other fields and becoming fairly commonplace.
RFC5280 already specifies internationalization of the domain but lacks any
specification for the local-part.


Up until now, I have tried to lay low on this topic. However, having reviewed 
the relevant standards and implementations in the field, I have my 22¢:

The proposed methods are to create an otherName form and assign a new object 
identifier for it (A. Melnikov, ed., draft-ietf-pkix-eai-addresses-00), and to 
encode the local part in base64 with “:” as an escape signal (L. Baudoin, et 
al., draft-lbaudoin-iemax-02). There is also a counterproposal on the agenda, 
which I will label as #3, to make rfc822Name a CHOICE {IA5String, UTF8String}. 
There are two other methods that deserve serious consideration. My 0.2¢ is on 
#4 and my 21.8¢ is on #5:

#4 Extend GeneralName with a new name type:

GeneralName ::= CHOICE {
  otherName [0] INSTANCE OF OTHER-NAME,
  rfc822Name [1] IA5String,
  dNSName [2] IA5String,
  x400Address [3] ORAddress,
  directoryName [4] Name,
  ediPartyName [5] EDIPartyName,
  uniformResourceIdentifier [6] IA5String,
  iPAddress [7] OCTET STRING,
  registeredID [8] OBJECT IDENTIFIER,
  eaiName [9] UTF8String,
  ... }

The advantage of this approach is that it conforms to X.509:2012, which uses 
“...” syntax to show that the CHOICE is extensible. However, the IETF invented 
GeneralName (RFC 2459), and the latest ASN.1 (RFC 5912) does not use “...” 
syntax for extensibility. (Basically, I think most implementations would barf 
on this CHOICE and fail the overall ASN.1 decoding op, so every place where a 
GeneralName is directly encoded would cause implementations to barf.)
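
To make the concern with #4 concrete, here is a minimal Python sketch of a
decoder whose tag table stops at [8]. The table, the function names, and the
sample address are illustrative assumptions, not taken from any real
implementation. Under IMPLICIT tagging, eaiName [9] UTF8String would be
serialized with tag octet 0x89, which such a decoder has no entry for.

```python
# Hypothetical decoder that only knows the nine RFC 5280 alternatives.
KNOWN_TAGS = {
    0xA0: "otherName",
    0x81: "rfc822Name",
    0x82: "dNSName",
    0xA3: "x400Address",
    0xA4: "directoryName",
    0xA5: "ediPartyName",
    0x86: "uniformResourceIdentifier",
    0x87: "iPAddress",
    0x88: "registeredID",
}


def der_tlv(tag: int, content: bytes) -> bytes:
    """Encode one TLV using the short length form (content under 128 octets)."""
    if len(content) >= 0x80:
        raise ValueError("sketch handles short-form lengths only")
    return bytes([tag, len(content)]) + content


def decode_general_name(data: bytes):
    """Return (alternative name, content octets) or raise on an unknown tag."""
    tag, length = data[0], data[1]
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown GeneralName tag 0x{tag:02X}")
    return KNOWN_TAGS[tag], data[2:2 + length]


# Context-specific class (0x80), primitive, tag number 9 gives 0x89.
eai_name = der_tlv(0x89, "用户@example.com".encode("utf-8"))
legacy_name = der_tlv(0x81, b"user@example.com")
```

Calling decode_general_name(eai_name) raises in this sketch, while
decode_general_name(legacy_name) succeeds. Whether deployed decoders fail
this hard is exactly what would need to be measured.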

#5 Change GeneralName so that rfc822Name is actually just UTF8String:

   GeneralName ::= CHOICE {
        otherName                   [0]  INSTANCE OF OTHER-NAME,
        rfc822Name                  [1]  UTF8String,
        dNSName                     [2]  IA5String,
        x400Address                 [3]  ORAddress,
        directoryName               [4]  Name,
        ediPartyName                [5]  EDIPartyName,
        uniformResourceIdentifier   [6]  IA5String,
        iPAddress                   [7]  OCTET STRING,
        registeredID                [8]  OBJECT IDENTIFIER
   }

GeneralName is in the IMPLICIT TAGS module of PKIX. That means that on the 
wire, a GeneralName will (almost always) just be serialized as the 
context-specific tag of the chosen alternative, followed by the length and 
the data. The counterproposal of a CHOICE {IA5String, UTF8String} is flawed 
in that it will force ALL rfc822Names to carry an additional tag, UNIVERSAL 22 
in the case of IA5String, because the choice is ambiguous without the tag (so 
a proper ASN.1 compiler will force the serialization and de-serialization of 
the tag). Note: UTF8String (in a CHOICE) would force serialization of the tag 
UNIVERSAL 12.
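
The difference on the wire can be sketched in a few lines of Python. The
helper name and the sample address are assumptions for illustration; only
short-form DER lengths are handled.

```python
def der_tlv(tag: int, content: bytes) -> bytes:
    """Encode one TLV using the short length form (content under 128 octets)."""
    if len(content) >= 0x80:
        raise ValueError("sketch handles short-form lengths only")
    return bytes([tag, len(content)]) + content


addr = b"user@example.com"  # 16 octets

# Today, and under proposal #5: [1] IMPLICIT replaces the universal tag,
# so only the context-specific tag 0x81 appears.
implicit = der_tlv(0x81, addr)

# CHOICE {IA5String, UTF8String}: a CHOICE cannot be implicitly tagged, so
# [1] becomes a constructed wrapper (0xA1) around the inner universal tag.
choice_ia5 = der_tlv(0xA1, der_tlv(0x16, addr))   # UNIVERSAL 22, IA5String
choice_utf8 = der_tlv(0xA1, der_tlv(0x0C, addr))  # UNIVERSAL 12, UTF8String

print(implicit.hex())
print(choice_ia5.hex())
```

In this sketch the CHOICE form is two octets longer and starts with 0xA1
instead of 0x81, so existing rfc822Name encodings would no longer match.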

With this proposal #5, UTF8String is just a superset of IA5String. Therefore, 
new implementations will “just work” with virtually no further coding. The 
high-octet data in UTF8String will violate expectations for older 
implementations that are looking for IA5String. But enforcement of octets 00-7F 
is almost never done in the decoding step, or if it is done, it does not cause 
the entire ASN.1 decoding op to fail. (Note: this would be an “ASN.1 value 
constraint violation.”) If most implementations continue to decode the 
ASN.1 and simply skip over what they perceive to be “invalid ASCII” (or simply 
reject that particular alternative when doing name comparisons), we are good 
to go. This basically mirrors the way that EAI itself works in RFCs 6530-6532.
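
The three behaviors discussed above can be sketched as follows. The policy
names are a hypothetical knob invented for this illustration and do not
correspond to any real library; only short-form lengths are handled.

```python
def read_rfc822name(tlv: bytes, policy: str):
    """Return the address text, or None when the alternative is skipped.

    policy (hypothetical):
      "strict" - octets above 0x7F are a fatal decoding error
      "skip"   - decoding succeeds, but a non-ASCII rfc822Name is ignored
      "utf8"   - the content is treated as UTF8String (proposal #5)
    """
    if policy not in ("strict", "skip", "utf8"):
        raise ValueError("unknown policy")
    tag, length = tlv[0], tlv[1]
    if tag != 0x81 or length >= 0x80:
        raise ValueError("sketch expects a short-form [1] rfc822Name")
    content = tlv[2:2 + length]
    if policy == "utf8":
        return content.decode("utf-8")
    if all(octet <= 0x7F for octet in content):
        return content.decode("ascii")
    if policy == "strict":
        raise ValueError("IA5String value constraint violation")
    return None  # "skip": leave this name out of comparisons


def make_name(text: str) -> bytes:
    """Encode text as [1] with UTF-8 content, short length form only."""
    content = text.encode("utf-8")
    if len(content) >= 0x80:
        raise ValueError("sketch handles short-form lengths only")
    return bytes([0x81, len(content)]) + content


legacy = make_name("user@example.com")
eai = make_name("用户@example.com")
```

Proposal #5 is viable to the extent that older implementations behave like
"skip" rather than "strict"; an ASCII-only address gives the same result
under all three policies.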

To test this, one would want to construct a signed certificate with “invalid” 
IA5String data that actually contains valid Unicode octets, and see what 
happens with various implementations.
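
As a starting point for such a test, here is a sketch that builds only the
subjectAltName Extension octets (OID 2.5.29.17) carrying UTF-8 data under
the [1] tag. Placing it in a TBSCertificate and signing it is left to
whatever tooling is at hand; the function name and the sample address are
assumptions.

```python
def der_tlv(tag: int, content: bytes) -> bytes:
    """Encode one TLV using the short length form (content under 128 octets)."""
    if len(content) >= 0x80:
        raise ValueError("sketch handles short-form lengths only")
    return bytes([tag, len(content)]) + content


# OBJECT IDENTIFIER 2.5.29.17 (id-ce-subjectAltName), already DER encoded.
SAN_OID = bytes([0x06, 0x03, 0x55, 0x1D, 0x11])


def san_extension(addresses):
    """Build Extension ::= SEQUENCE { extnID, extnValue } with critical omitted."""
    names = b"".join(der_tlv(0x81, a.encode("utf-8")) for a in addresses)
    general_names = der_tlv(0x30, names)        # SEQUENCE OF GeneralName
    extn_value = der_tlv(0x04, general_names)   # wrapped in OCTET STRING
    return der_tlv(0x30, SAN_OID + extn_value)


ext = san_extension(["用户@example.com"])
print(ext.hex())
```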

I am not saying that this is the “right” approach, but I do think that it 
deserves serious consideration when evaluating alternatives. An example of an 
advantage is that it should preserve name constraints with no additional coding.

Regards,

Sean

_______________________________________________
smime mailing list
smime(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/smime
