On Tue, 30 Dec 2014, Nico Williams wrote:
> Better not even think about saying anything about normalization,
> right? PKCS#11 nowadays supports UTF-8 for the strings we care about,
> but says nothing about normalization. I suppose you could say that
> matching should be (lowercase) normalization-insensitive. In practice
> it will never matter (which is why the lowercase).
hi Nico, I assume you are talking about case normalization now. I
also agree that we need not say anything about it, and we don't, aside
from "case normalization" as defined in Section 6.2.2.1 of RFC 3986,
where only the following passages are relevant to us:
    For all URIs, the hexadecimal digits within a percent-encoding
    triplet (e.g., "%3a" versus "%3A") are case-insensitive and therefore
    should be normalized to use uppercase letters for the digits A-F.
and:
    The other generic syntax components are assumed to be
    case-sensitive unless specifically defined otherwise by the scheme
    (see Section 6.2.3).
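As a rough illustration of the first rule quoted above, here is a minimal Python sketch that uppercases only the hex digits inside percent-encoding triplets and leaves every other character alone. The function name and the sample URI are made up for this example; they come from neither RFC 3986 nor the PKCS#11 URI draft.

```python
import re

# Matches one percent-encoding triplet: "%" followed by two hex digits.
_PCT_TRIPLET = re.compile(r"%[0-9A-Fa-f]{2}")

def uppercase_pct_triplets(uri: str) -> str:
    """Uppercase hex digits in percent-encoding triplets only.

    Hypothetical helper: everything outside a triplet is returned
    unchanged, since the other components are assumed case-sensitive.
    """
    return _PCT_TRIPLET.sub(lambda m: m.group(0).upper(), uri)

# "%3a" becomes "%3A"; "my", "key" and "Test" keep their case.
print(uppercase_pct_triplets("pkcs11:object=my%3akey;token=Test"))
```

Note that a malformed sequence such as "%zz" is not a triplet and is passed through untouched.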
I also understand that technically it is not specified which
pairs of characters form a lowercase-uppercase relationship, and that
in some alphabets not everybody agrees on what the uppercase version of
a specific character is. I can also see a report that the Perl and GNU
libc conversion routines behave differently, and that this may not
simply be a bug in one or the other.
so I'd rather compare UTF-8 strings literally and assume that
producers of PKCS#11 URIs will use exactly what they get via the
PKCS#11 API and that consumers will not apply any case normalization
post-processing on such URIs.
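To make "literally" concrete, here is a small Python sketch of such a comparison, under the assumption that a consumer percent-decodes the attribute value from the URI and compares the resulting bytes with the bytes obtained via the PKCS#11 API. The function name and sample values are hypothetical and only for illustration.

```python
from urllib.parse import unquote_to_bytes

def literal_match(uri_value: str, api_value: bytes) -> bool:
    """Byte-for-byte comparison, with no case folding and no
    Unicode normalization applied to either side.

    Hypothetical helper: uri_value is the still percent-encoded
    attribute value from a URI, api_value is the raw UTF-8 string
    as returned by the token.
    """
    return unquote_to_bytes(uri_value) == api_value

# Same bytes after percent-decoding: match.
print(literal_match("My%20Token", b"My Token"))
# Differs only in case: no match.
print(literal_match("my%20token", b"My Token"))
# Precomposed e-acute in the URI vs. "e" + combining accent from the
# API: visually identical, different bytes, so no match.
print(literal_match("caf%C3%A9", "cafe\u0301".encode("utf-8")))
```

Because the comparison happens after percent-decoding, the case of the hex digits in a triplet ("%3a" versus "%3A") does not affect the result, while everything else must agree exactly.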
cheers, Jan.
--
Jan Pechanec <jan(_dot_)pechanec(_at_)oracle(_dot_)com>