xsl-list

Re: xsl:sort with msxml english language, danish characters, weird results

2004-10-25 11:38:27
Michael Kay wrote:

I'm not sure I'm following here--at least using Java RuleBasedCollator you should be able to achieve any collation sequence whatsoever.

But I'm not sure what you mean by sorting 646 before 10646.



A possible algorithm is that any sequence of digits counts as a single
collation unit, which is collated before the first collation unit derived
from non-digit characters, and has a collation value equal to its decimal
value.

I don't believe you can achieve this with a RuleBasedCollator.

Ah, I understand now--I misunderstood your comment as being about the standards, not the strings "646" and "10646".

I think you are correct, although I'll have to test it.

Of course, this type of rule can be implemented using a custom Comparator that applies whatever rule you want and delegates the character-level comparison to a rule-based collator. I don't think there's any way for a purely declarative mechanism, which is what I understand the UCA to define (and what RuleBasedCollator implements), to handle all cases.
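
For what it's worth, here is a rough, untested-in-production sketch of the kind of Comparator I have in mind, following the algorithm described above: each run of digits is one collation unit, it sorts before any non-digit unit, and two digit runs compare by decimal value. The class name DigitRunComparator and the helper runs() are made up for illustration, and I'm assuming "digit" means ASCII 0-9 only. Non-digit runs are simply handed to whatever java.text.Collator you pass in.

```java
import java.math.BigInteger;
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical name; a sketch, not a complete collation implementation.
final class DigitRunComparator implements Comparator<String> {
    private final Collator collator;

    DigitRunComparator(Collator collator) {
        this.collator = collator;
    }

    // Assumption: only ASCII digits count as digits.
    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Split a string into alternating runs of digits and non-digits,
    // e.g. "ISO 10646" -> ["ISO ", "10646"].
    static List<String> runs(String s) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            boolean digit = isDigit(s.charAt(i));
            int j = i;
            while (j < s.length() && isDigit(s.charAt(j)) == digit) {
                j++;
            }
            out.add(s.substring(i, j));
            i = j;
        }
        return out;
    }

    @Override
    public int compare(String a, String b) {
        List<String> ra = runs(a);
        List<String> rb = runs(b);
        int n = Math.min(ra.size(), rb.size());
        for (int k = 0; k < n; k++) {
            String x = ra.get(k);
            String y = rb.get(k);
            boolean dx = isDigit(x.charAt(0));
            boolean dy = isDigit(y.charAt(0));
            int c;
            if (dx && dy) {
                // Two digit runs: compare by decimal value, any length.
                c = new BigInteger(x).compareTo(new BigInteger(y));
            } else if (dx != dy) {
                // A digit run sorts before a non-digit run.
                c = dx ? -1 : 1;
            } else {
                // Two non-digit runs: delegate to the collator.
                c = collator.compare(x, y);
            }
            if (c != 0) {
                return c;
            }
        }
        // All shared runs are equal: the string with fewer runs sorts first.
        return Integer.compare(ra.size(), rb.size());
    }
}
```

You would use it as, say, list.sort(new DigitRunComparator(Collator.getInstance(Locale.ENGLISH))). Note that comparing run by run is not the same as collating the whole string, so multi-level differences (accents, case) are resolved within the first run that differs rather than across the full string; whether that matters depends on the data.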

Cheers,

E.
--
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8122

eliot(_at_)innodata-isogen(_dot_)com
www.innodata-isogen.com