Re: PKCS#11 URI slot attributes & last call

On 31 dec 2014, at 08:03, Nico Williams <nico(_at_)cryptonector(_dot_)com> 
wrote:

On Wed, Dec 31, 2014 at 07:29:47AM +0100, Patrik Fältström wrote:

On 30 dec 2014, at 20:53, Nico Williams <nico(_at_)cryptonector(_dot_)com> 
wrote:
Better say
nothing, because I think the thing to do is obvious enough, but if we
must say anything, it's that the various strings (e.g., token manuf)
are to be compared normalization-insensitively.


Sorry, but I have not heard the term "normalization-insensitively" before.

Can you explain what you mean?


Notionally, if you're comparing two unnormalized strings, you could
normalize both then compare the two normalized strings.

Of course, that can be inefficient (e.g., if it means allocating memory,
or if they will prove not equal in the first few codepoints) or
infeasible (e.g., if one of the strings is actually a hashed key to a
hash table).

What you can do for the first case is compare code-unit by code-unit, with
a fast path for the cases that need no normalization, and normalizing
one character (but possibly multiple codepoints, of course) at a time.
This limits the total memory consumption, and anyways, for the common
case you can often expect an inequality result long before you're done
traversing the shorter string.  This is (can be, if you do it right), of
course, equivalent to normalizing both strings then comparing -- but it
should usually be much faster.
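
A minimal Python sketch of that first case, under some assumptions of
mine: it works on code points rather than code units, it uses NFD as
the comparison form, and the names nfd_stream and
equal_ignoring_normalization are made up for illustration.  Each
string is decomposed lazily, only runs of combining marks are
buffered, and the comparison stops at the first difference.

```python
import unicodedata
from itertools import zip_longest


def nfd_stream(s):
    """Lazily yield the code points of NFD(s), buffering only combining marks."""
    marks = []  # run of combining marks (ccc != 0) awaiting canonical ordering
    for ch in s:
        # Fast path: ASCII never decomposes and is always a starter.
        decomposed = ch if ch.isascii() else unicodedata.normalize("NFD", ch)
        for cp in decomposed:
            if unicodedata.combining(cp) == 0:
                # A starter ends the pending run; a stable sort by combining
                # class gives the canonical order of the buffered marks.
                marks.sort(key=unicodedata.combining)
                yield from marks
                marks.clear()
                yield cp
            else:
                marks.append(cp)
    marks.sort(key=unicodedata.combining)
    yield from marks


def equal_ignoring_normalization(a, b):
    """Compare two strings as if both had been normalized first."""
    if a == b:  # identical strings need no normalization at all
        return True
    end = object()  # marks the end of the shorter stream
    pairs = zip_longest(nfd_stream(a), nfd_stream(b), fillvalue=end)
    return all(x == y for x, y in pairs)  # stops at the first mismatch
```

For example, equal_ignoring_normalization("caf\u00e9", "cafe\u0301")
is true even though the two Python strings differ, and no normalized
copy of either whole string is ever built.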

For the second case the thing to do is to normalize the key at hash
time, naturally.
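
As a rough illustration of the second case, here is a small Python
wrapper around a dict that normalizes every key before it is hashed.
The class name NormalizedKeyTable and the choice of NFC are
assumptions for the sketch; any one form works as long as it is used
consistently for both insertion and lookup.

```python
import unicodedata


class NormalizedKeyTable:
    """Hash table whose string keys are normalized before hashing."""

    def __init__(self):
        self._items = {}

    @staticmethod
    def _key(name):
        # Assumption: NFC is the table's internal form.
        return unicodedata.normalize("NFC", name)

    def __setitem__(self, name, value):
        self._items[self._key(name)] = value

    def __getitem__(self, name):
        return self._items[self._key(name)]

    def __contains__(self, name):
        return self._key(name) in self._items

    def __len__(self):
        return len(self._items)


table = NormalizedKeyTable()
table["caf\u00e9"] = "precomposed spelling"
found = table["cafe\u0301"]  # decomposed spelling finds the same entry
```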

[ZFS, incidentally, supports this for filesystem object names, and has
for years now.]

Now, PKCS#11 nowadays supports UTF-8 for things like "token label", but
it doesn't say anything about form -- why should it (see below)?

But where PKCS#11 URIs are intended to _match_ PKCS#11 resources by
name... apps will need to care about normalization.  In practice, like a
great many applications, doing nothing about normalization will probably
work fine (until the day that it doesn't).  But saying anything about
this could be tricky: what if there are two tokens with equivalent
labels, just in different forms?  Fortunately PKCS#11 URIs can match on
more attributes than labels, so there's that.
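
To make the ambiguity concrete, here is a hypothetical Python sketch.
Tokens are plain dicts, the attribute names "token" and "serial"
follow the PKCS#11 URI attribute names, the token data is invented,
and match_tokens is not any real PKCS#11 API.  Matching on the label
alone finds both tokens; adding a second attribute disambiguates.

```python
import unicodedata


def nfc(s):
    return unicodedata.normalize("NFC", s)


def match_tokens(tokens, **wanted):
    """Return every token whose attributes equal all wanted values under NFC."""
    return [
        t for t in tokens
        if all(attr in t and nfc(t[attr]) == nfc(value)
               for attr, value in wanted.items())
    ]


# Two invented tokens whose labels are equivalent but in different forms.
tokens = [
    {"token": "Cl\u00e9 A", "serial": "0001"},   # precomposed e-acute
    {"token": "Cle\u0301 A", "serial": "0002"},  # e + combining acute
]

by_label = match_tokens(tokens, token="Cl\u00e9 A")
by_label_and_serial = match_tokens(tokens, token="Cl\u00e9 A", serial="0002")
```

Here by_label holds both tokens, while by_label_and_serial holds only
the second one.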

PKCS#11 should say "don't do that" or "don't do that, normalize to NFC"
(or NFD, whatever), but doesn't (or I didn't find where it does, if it
does), so the most that this document could say is "compare
normalization-insensitively where possible".


Ok, so are you saying that the side that is to calculate whether there is a 
match or not can do whatever normalization it wants on the string(s)? Or are 
you saying that whoever is doing the match is not to do normalization at all, 
as the application (on the client side) can and should define what 
normalization (in a broader sense, not only Unicode Normalization) is to be 
applied?

In IDNA2008, as you know, we chose the latter, but recommend that applications 
define what normalization to do, and that NFC is the Unicode Normalization 
form to use.

   Patrik
