
Re: Last Call: <draft-faltstrom-5892bis-04.txt> (The Unicode code points and IDNA - Unicode 6.0) to Proposed Standard

2011-05-29 13:30:44
John C Klensin <john-ietf(_at_)jck(_dot_)com> writes:

--On Sunday, May 29, 2011 08:58 +0200 Simon Josefsson
<simon(_at_)josefsson(_dot_)org> wrote:

in a Unicode 6.0 environment, evaluate U+19DA as PVALID and
therefore not raise that error, then it is not "compliant"
with RFC 5892, irrespective of the "Updates" status of the
present document.

I don't see how.

My code uses the tables from RFC 5892 which were generated in
a Unicode 5.2 environment.  My IDNA2008 code may eventually
run in a Unicode 6.0 environment, or any other future version
of Unicode.  I can't control the Unicode version used, and
from what I understand this is one of the features of
IDNA2008.  Implementations need not lock down the Unicode
version to a single Unicode version, as they had to do for
IDNA2003.

It seems to me that this is exactly where we are having a
misunderstanding.   In terms of determining conformance, those
tables are not normative, so it is not possible to say "I
implemented the tables in RFC 5892 and therefore I conform to
the standard".  The closest you can get would be to say "I
implemented the rules and tested against the tables when those
rules were applied to Unicode 5.2 and therefore have great
confidence in my implementation", but conformance statements stop
with "implemented the rules correctly".  

For practical reasons, we expect to see production
implementations using tables or other abstractions of the rules
that are somewhat pre-compiled, not applying the rule set each
time.   One consequence of this is that a given table-based
implementation is inevitably dependent on versions of Unicode
even if the Standard (and its conformance requirements) is not.

Right, and that describes my implementation.  There is no difference in
behaviour of an implementation that uses the informative tables in RFC
5892 directly or one that pre-computes the table at compile time using
Unicode 5.2.  The data and output are the same in both cases.  So I
don't follow where you think the misunderstanding is?  I agree with what
you say here.
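
To make the two models under discussion concrete, here is a minimal sketch in Python.  It is not taken from any implementation mentioned in this thread.  The names FROZEN_TABLE, table_lookup and simplified_rule are hypothetical, and simplified_rule covers only the Unassigned, Unstable and LetterDigits steps of RFC 5892 Section 3 (exceptions, contextual rules, LDH, join controls, ignorable properties and blocks, and Old Hangul Jamo are all omitted).  The table answer is fixed when the table is generated, while the rule answer follows whatever Unicode version the runtime ships.

```python
import unicodedata

# Hypothetical snapshot of a pre-computed table: the value recorded for
# U+19DA when the table was generated from Unicode 5.2 data.
FROZEN_TABLE = {0x19DA: "PVALID"}

# General categories that make up the LetterDigits rule in RFC 5892.
LETTER_DIGITS = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def table_lookup(cp: int) -> str:
    """Table-based model: the answer never changes after generation."""
    return FROZEN_TABLE.get(cp, "NOT-IN-SNAPSHOT")


def simplified_rule(cp: int) -> str:
    """Rule-based model (simplified): evaluated against the runtime's
    Unicode data, so the answer can change when that data is upgraded."""
    ch = chr(cp)
    cat = unicodedata.category(ch)
    if cat == "Cn":
        return "UNASSIGNED"
    # Approximation of the Unstable rule: cp != NFKC(casefold(NFKC(cp))).
    folded = unicodedata.normalize("NFKC", ch).casefold()
    if unicodedata.normalize("NFKC", folded) != ch:
        return "DISALLOWED"
    if cat in LETTER_DIGITS:
        return "PVALID"
    return "DISALLOWED"


if __name__ == "__main__":
    cp = 0x19DA
    print("runtime Unicode version:", unicodedata.unidata_version)
    print("frozen table:", table_lookup(cp))
    print("runtime rule:", simplified_rule(cp))
```

On a runtime whose Unicode data is 6.0 or later, the two functions disagree about U+19DA, which is the situation the draft under Last Call addresses.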

If this model is not permitted, I believe there are bigger
problems.

To avoid doubt, and to back up your assertion that my
implementation is non-compliant, please point to the "MUST" or
"SHOULD" in RFC 5892 that forbids this, to me, logical
implementation approach.

The key is the text in Section 4 that says:

      "The table in Appendix B shows, for illustrative
      purposes, the consequences of the categories and
      classification rules, and the resulting property values.
      
      "The list of code points that can be found in Appendix B
      is non-normative.  Sections 2 and 3 are normative."

It seems to me that is very clear about the relationship between
the rules and the tables.   That relationship is reiterated in
Section 7.1.1 of RFC 5892.

s/5892/5894/

Sure.  But that does not prove (or disprove) Pete's claim that my
implementation is non-compliant.

You could reasonably say that your implementation is conformant
but current only to Unicode 5.2.   If you are willing to say
that, I guess you don't need to change anything.

I claim my implementation is compliant to all requirements in RFC 5890,
RFC 5891, RFC 5892 and RFC 5893.

While we recognize that you have no control over the Unicode version
in use, good sense suggests that systems will update versions of
Unicode (including all of the associated tables and support routines
as applicable) and versions of your library together,

That is unrealistic.  Traditional operating systems are already so
complex that upgrading them to a single Unicode version across all software
pieces (Java, Perl, SQL databases, web browsers, word processors, etc.)
is economically infeasible.

Modern operating systems rely so much on network services that it is not
even useful to decouple the local system from external systems.
Essentially "the system" is identical to "the Internet".  A flag day to
upgrade to the latest Unicode version across the Internet is, despite
how infinitely pleasant that would be, impossible.

If it were possible to upgrade software components to the latest Unicode
version in a controlled way, the IDNA2003 model would have worked fine.

Fortunately, I believe IDNA2008 does not require tight Unicode version
synchronization.  In fact, I believe one of the features of IDNA2008
is exactly that it doesn't require all Unicode versions to be in sync in
all parts of the Internet.

While that should be clear from the context of the discussions in RFC
5891 and 5892, RFC 5894 is quite explicit about it in the second
bullet of Section 7.1.2:

 "o The Unicode tables (i.e., tables of code points,
      character classes, and properties) and IDNA tables
      (i.e., tables of contextual rules such as those
      that appear in the Tables document), must be
      consistent on the systems performing or validating
      labels to be registered.  Note that this does not
      require that tables reflect the latest version of
      Unicode, only that all tables used on a given
      system are consistent with each other."

That is about registration of labels, not lookup.  Registration is a
centralized process where you can control the software used more easily.

Similarly, the first bullet of 7.1.3 reads:

You forgot to quote the paragraph before the one you quoted:

   Any application processing a label through IDNA so it can be looked
   up in a DNS zone is required to (the exact rules appear in Section 5
   of the Protocol document [RFC5891]):

 "o Maintain IDNA and Unicode tables that are consistent
      with regard to versions, i.e., unless the application
      actually executes the classification rules in the Tables
      document [RFC5892], its IDNA tables must be derived from
      the version of Unicode that is supported more generally on
      the system.  As with registration, the tables need not
      reflect the latest version of Unicode, but they must be
      consistent."

I don't see any similar text about IDNA and Unicode version consistency
requirements in section 5 of RFC 5891.  I'm sure you recall that RFC
5894 is as non-normative as the RFC 5892 tables are.

/Simon