Re: i18n requirements (was: Re: NF* (Re: PKCS#11 URI slot attributes & last call))

On Sat, 3 Jan 2015, Jan Pechanec wrote:

        hi, I haven't received any other comments on the draft 
recently (I know the LC already ended on Dec 29 though) so I think I 
can file changes discussed and drafted in this thread as draft 18 on 
Friday.  Thank you all for the feedback, I really appreciate it.

        one more change for draft 18 (v2 attached) is to spell out
"NFC" and reference the Unicode Annex on normalization, based on 
comments from Jaroslav and Christian.

    In order to improve the user experience, applications that create
-   PKCS#11 objects or label tokens, SHOULD normalize labels to NFC.  For
-   the same reason PKCS#11 libraries, slots (token readers), and tokens
...
+   PKCS#11 objects or label tokens SHOULD normalize labels to
+   Normalization Form C (NFC) [UAX15].  For the same reason PKCS#11
+   libraries, slots (token readers), and tokens SHOULD normalize their
...

+   [UAX15]    Davis, M., Ed., Whistler, K., Ed., and Unicode Consortium,
+              "Unicode Standard Annex #15 - Unicode Normalization Forms,
+              Version Unicode 7.0.0", June 2014.

        best regards, Jan.

On Thu, 1 Jan 2015, Nico Williams wrote:

      Nico, many thanks for the drafted text and also to Patrik and 
John for discussing it.

      I've updated the draft's sections on URI matching guidelines and
URI comparison, added a new section on I18n, and added a new paragraph
to the Security Considerations.  Individual diffs are inline, and a draft
of the new draft 18 is attached (draft-pechanec-pkcs11uri-18-v1.txt).

I think we could use some text like this:

  PKCS#11 does not specify a canonical form for UTF-8 string slots in
  the API.  This presents the usual false negative and false positive
  (aliasing) concerns that arise when dealing with unnormalized
  strings.  Because all PKCS#11 items are local and local security is
  assumed, these concerns are mainly about usability.

  In order to improve the user experience, applications that create
  PKCS#11 objects or otherwise label tokens, SHOULD normalize labels to
  NFC.  For the same reason PKCS#11 libraries, slots (token readers),
  and tokens SHOULD normalize their names to NFC.  When listing
  libraries, slots, tokens, or objects, an application SHOULD normalize
  their names to NFC.  When matching PKCS#11 URIs to libraries, slots,
  tokens, and/or objects, applications may use form-insensitive Unicode
  string comparison for matching, as the objects might pre-date these
  recommendations.
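
The aliasing concern described above can be illustrated with a minimal
sketch.  It assumes Python and its standard unicodedata module; the
helper name to_nfc_label and the sample label are hypothetical and are
not part of PKCS#11 or the draft.

```python
import unicodedata

def to_nfc_label(label: str) -> bytes:
    """Return the UTF-8 bytes of a label after NFC normalization."""
    return unicodedata.normalize("NFC", label).encode("utf-8")

# The same visible label written in two different Unicode forms.
precomposed = "caf\u00e9"   # e-acute as a single code point (U+00E9)
decomposed = "cafe\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# Stored as-is, the byte strings differ, so a byte-wise match fails
# (a false negative).
raw_equal = precomposed.encode("utf-8") == decomposed.encode("utf-8")

# Normalized to NFC before storing, both labels yield identical bytes.
nfc_equal = to_nfc_label(precomposed) == to_nfc_label(decomposed)
```

Here raw_equal is False and nfc_equal is True, which is why normalizing
at label creation time helps later matching.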


      I've created an "Internationalization Considerations" section and
put the text above there after slightly modifying it.  I wanted to
mention the CK_UTF8CHAR type so that it's clear what is being discussed.

768    6.  Internationalization Considerations

770       The PKCS#11 specification does not specify a canonical form for
771       strings of characters of the CK_UTF8CHAR type.  This presents the
772       usual false negative and false positive (aliasing) concerns that
773       arise when dealing with unnormalized strings.  Because all PKCS#11
774       items are local and local security is assumed, these concerns are
775       mainly about usability.

777       In order to improve the user experience, applications that create
778       PKCS#11 objects or label tokens, SHOULD normalize labels to NFC.  For
779       the same reason PKCS#11 libraries, slots (token readers), and tokens
780       SHOULD normalize their names to NFC.  When listing PKCS#11 libraries,
781       slots, tokens, and/or objects, an application SHOULD normalize their
782       names to NFC.  When matching PKCS#11 URIs to libraries, slots,
783       tokens, and/or objects, applications MAY use form-insensitive Unicode
784       string comparison for matching, as those might pre-date these
785       recommendations.  See also Section 3.5.
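
As an illustration of the form-insensitive matching that the last
sentence permits, here is a hedged sketch in Python.  It assumes the
URI attribute value is percent-encoded UTF-8 and has already been split
out of the URI; the function names nfc and uri_value_matches are
hypothetical.

```python
import unicodedata
from urllib.parse import unquote

def nfc(text: str) -> str:
    """Normalize a string to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)

def uri_value_matches(pct_encoded_value: str, label: str) -> bool:
    """Compare a percent-encoded URI attribute value with a label,
    ignoring differences in Unicode normalization form."""
    # Percent-decode to text first; invalid UTF-8 raises an error
    # instead of being silently replaced.
    decoded = unquote(pct_encoded_value, encoding="utf-8", errors="strict")
    return nfc(decoded) == nfc(label)
```

For example, uri_value_matches("cafe%CC%81", "caf\u00e9") compares a
decomposed URI value against a precomposed label and reports a match,
while a plain string comparison of the decoded values would not.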

      in section 3.5 on URI Matching Guidelines, I've added the
following as the last paragraph of the section (it is based on John's
note from his last email).  This paragraph might not be necessary there,
and the first part could be moved to the I18N section, but I think it's
good to put it where attribute matching is discussed so that it is
not easily overlooked.

513       As noted in Section 6, the PKCS#11 specification is not clear about
514       how to normalize UTF-8 encoded Unicode characters [RFC2279].  Those
515       who discover a need to use characters outside the ASCII repertoire
516       should be cautious, conservative, and expend extra effort to be sure
517       they know what they are doing and that failure to do so may create
518       both operational and security risks.  It means that when matching
519       UTF-8 string based attributes (see Table 1) with such characters,
520       normalizing all UTF-8 strings before string comparison may be the
521       only safe approach.  For example, for objects (keys) it means that
522       PKCS#11 attribute search template would only contain attributes that
523       are not UTF-8 strings and another pass through returned objects is
524       then needed for UTF-8 string comparison after the normalization is
525       applied.
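
The two-pass approach from the paragraph above could look roughly like
the following Python sketch.  It does not call a real PKCS#11 library:
FoundObject, find_objects, and match_by_label are hypothetical
stand-ins, and the in-memory list plays the role of the token's
objects.

```python
import unicodedata
from dataclasses import dataclass

@dataclass
class FoundObject:
    """Hypothetical stand-in for an object returned by a search."""
    object_class: str  # non-string attribute, usable in the template
    label: str         # UTF-8 string attribute, checked in pass two

def find_objects(store, object_class):
    """Pass one: search using only attributes that are not UTF-8 strings."""
    return [obj for obj in store if obj.object_class == object_class]

def match_by_label(store, object_class, wanted_label):
    """Pass two: normalize both sides to NFC, then compare labels."""
    wanted = unicodedata.normalize("NFC", wanted_label)
    return [
        obj
        for obj in find_objects(store, object_class)
        if unicodedata.normalize("NFC", obj.label) == wanted
    ]

# Sample data: one label is stored in decomposed form.
store = [
    FoundObject("private-key", "cafe\u0301"),
    FoundObject("private-key", "other"),
    FoundObject("certificate", "caf\u00e9"),
]
matches = match_by_label(store, "private-key", "caf\u00e9")
```

With the sample data, matches holds only the first private key, even
though its stored label uses a different normalization form than the
requested one.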

Then later in the security considerations section, add something like:

  PKCS#11 does not authenticate devices to users; PKCS#11 only
  authenticates users to tokens.  Instead, local and physical security
  are demanded: the user must be in possession of their tokens, and
  system into whose slots the users' tokens are inserted must be
  secure.  As a result, the usual security considerations regarding
  normalization do not arise.  For the same reason, confusable script
  issues also do not arise.  Nonetheless, it is best to normalize to
  NFC all strings appearing in PKCS#11 API elements.


      I've added the following to the Security Considerations 
section (again, slightly modified, I'd rather not use "PKCS#11" as 
an alias for the specification):

807       The PKCS#11 specification does not provide means to authenticate
808       devices to users; it only allows to authenticate users to tokens.
809       Instead, local and physical security are demanded: the user must be
810       in possession of their tokens, and system into whose slots the users'
811       tokens are inserted must be secure.  As a result, the usual security
812       considerations regarding normalization do not arise.  For the same
813       reason, confusable script issues also do not arise.  Nonetheless, it
814       is best to normalize to NFC all strings appearing in PKCS#11 API
815       elements.  See also Section 6.

      on top of that, I've added the following sentence to section 3.6,
"PKCS#11 URI Comparison":

532       strictly avoiding false positives.  When working with UTF-8 strings
533       with characters outside the ASCII character sets, see important
534       caveats in Section 3.5 and Section 6.

      the attribute table (Table 1) now also states which attributes
are UTF-8 strings so that it's clear without consulting the spec.

      thank you, Jan.


-- 
Jan Pechanec <jan(_dot_)pechanec(_at_)oracle(_dot_)com>

draft-pechanec-pkcs11uri-18-v2.txt
Description: Text document
