
Re: draft-newman-i18n-collation-09.txt just posted

2006-05-16 11:33:37

Arnt Gulbrandsen writes:
Mark Davis writes:
At a quick glance, it appears that a number of comments have been incorporated.

It is possible that some of my changes don't satisfy you. I had conflicting requests for many things. Feel free to repeat, rephrase or add arguments.

In -10 (which I'll send off once I finish work this evening) I've made another few changes.

> 2.4 Sort Keys

The use of the term "collation canonicalization" to refer to sort keys is very misleading. ...

Changed; the text now speaks of sort keys. I'm afraid there are still instances of the old term around; I found one today.

In -10, all should be dead.

The term 'error' is also problematic, since what is really at issue is a question of domain. For all those strings in the domain, either 'equal' or 'not_equal' should be returned from the equality function. For any string not in the domain, 'undefined' should be returned.

Not changed. Back in February, I agreed that "error" was not ideal, but did not see "undefined" as better, and could not find a really apt term. The collations were a little too well-defined in the "undefined" cases then.

However, in -10, I think they really will be undefined outside their domain, so I'll change to using "undefined" instead of "error". (I'm removing the bits that fall back to i;octet.)
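As an illustration only (not text from the draft), the three-valued equality discussed above could be sketched as follows. The collation here is hypothetical: its domain is assumed to be non-empty strings of ASCII digits, and the names Equality, in_domain and numeric_equal are invented for the example.

```python
from enum import Enum


class Equality(Enum):
    EQUAL = "equal"
    NOT_EQUAL = "not_equal"
    UNDEFINED = "undefined"


def in_domain(s: str) -> bool:
    # Assumed domain for this hypothetical collation:
    # non-empty strings consisting only of ASCII digits.
    return s != "" and s.isascii() and s.isdigit()


def numeric_equal(a: str, b: str) -> Equality:
    # Outside the domain the collation makes no statement at all.
    if not (in_domain(a) and in_domain(b)):
        return Equality.UNDEFINED
    # Inside the domain the answer is always equal or not_equal.
    return Equality.EQUAL if int(a) == int(b) else Equality.NOT_EQUAL
```

With this sketch, numeric_equal("007", "7") is inside the domain and yields a definite answer, while numeric_equal("abc", "7") yields UNDEFINED rather than an error or a silent fallback.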

Changed. The fallback to i;octet is now in the server, if the protocol requires it.

This means that if a server can escape implementing i;octet, it can keep all its strings in UCS-2 or UCS-4 internally, even as it implements collations which are defined in terms of octets.
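A rough sketch of what moving the fallback into the server might look like, under stated assumptions: the collation returns None for "undefined", the octet comparison stands in for i;octet and assumes UTF-8 as the encoding, and all function names are hypothetical rather than taken from the draft.

```python
from typing import Callable, Optional

# A collation equality function: True/False inside its domain,
# None when the result is undefined.
EqualityFn = Callable[[str, str], Optional[bool]]


def digits_only_equal(a: str, b: str) -> Optional[bool]:
    # Hypothetical collation whose domain is non-empty ASCII digit strings.
    for s in (a, b):
        if not (s != "" and s.isascii() and s.isdigit()):
            return None
    return int(a) == int(b)


def octet_equal(a: str, b: str) -> bool:
    # Stand-in for an i;octet style comparison; UTF-8 is an assumption here.
    return a.encode("utf-8") == b.encode("utf-8")


def server_equal(collation: EqualityFn, a: str, b: str,
                 protocol_requires_fallback: bool) -> Optional[bool]:
    # The collation itself never falls back; only the server does,
    # and only when the protocol in use asks for it.
    result = collation(a, b)
    if result is None and protocol_requires_fallback:
        return octet_equal(a, b)
    return result
```

A server whose protocol does not require the fallback simply passes the undefined result through, and so never needs the octet comparison at all.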