xsl-list

Re: [xsl] for-each-group grouping accented versions of letters together

2012-04-21 09:36:28
On Sat, Apr 21, 2012 at 03:02:22AM +0200, Imsieke, Gerrit, le-tex scripsit:
> You can strip the accents by unicode decomposition and then removing
> the diacritical marks:
>
> <xsl:for-each-group select="index-0"
>   group-by="substring(
>               upper-case(
>                 replace(
>                   normalize-unicode(heading, 'NFKD'),
>                   '[&#x300;-&#x36f;]',
>                   ''
>                 )
>               ), 1, 1
>             )">
>   <xsl:sort select="current-grouping-key()"/>

Thank you!

I had tried decomposing, using replace with \p{Lm} and then recomposing
with NFKC, and that didn't work, but it was also fairly late on Friday
afternoon.
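
One possible explanation for the \p{Lm} attempt, offered as a sketch rather than a diagnosis of the actual stylesheet: \p{Lm} matches the Unicode category "Letter, modifier", whereas the combining accents that NFKD decomposition produces for accented Latin letters belong to \p{Mn} (nonspacing marks). Recomposing with NFKC also should not be needed when the result is only used as a grouping key. The sample string 'Élan' and the template name main are made up for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical entry point: run with "main" as the initial template. -->
  <xsl:template name="main">
    <!-- NFKD splits 'É' into 'E' + U+0301; U+0301 is in category Mn,
         so \p{Mn} removes it, while \p{Lm} would leave it in place. -->
    <xsl:value-of
        select="replace(normalize-unicode('Élan', 'NFKD'), '\p{Mn}', '')"/>
    <!-- expected text output: Elan -->
  </xsl:template>

</xsl:stylesheet>
```

The explicit range [&#x300;-&#x36f;] from the quoted suggestion covers the Combining Diacritical Marks block only; \p{Mn} is broader, which may or may not be what a given index wants.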

> When writing the group (= starting letter) to an output file further
> down in your template, you should sort it according to the
> upper-case(…) part as first sort key, then according to the actual
> heading as a second (tie-breaker) sort key.
>
> So it’s best to make a function (call it, e.g., my:sortkey) out of
> upper-case(…).
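
A minimal sketch of that suggestion, under some assumptions: the my: namespace URI, the output element names index and group, and the letter attribute are invented for illustration, and each index-0 element is assumed to have exactly one heading child. Only the expression inside my:sortkey comes from the quoted code.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <!-- Accent-free, upper-cased form of a heading (the upper-case(…) part). -->
  <xsl:function name="my:sortkey" as="xs:string">
    <xsl:param name="s" as="xs:string?"/>
    <xsl:sequence select="upper-case(
                            replace(
                              normalize-unicode($s, 'NFKD'),
                              '[&#x300;-&#x36f;]',
                              ''))"/>
  </xsl:function>

  <!-- Assumes the document element directly contains the index-0 entries. -->
  <xsl:template match="/*">
    <index>
      <xsl:for-each-group select="index-0"
          group-by="substring(my:sortkey(heading), 1, 1)">
        <xsl:sort select="current-grouping-key()"/>
        <group letter="{current-grouping-key()}">
          <xsl:for-each select="current-group()">
            <!-- First key: normalized heading; second key: actual heading. -->
            <xsl:sort select="my:sortkey(heading)"/>
            <xsl:sort select="heading"/>
            <xsl:copy-of select="."/>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </index>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the normalization in one function means the grouping key and the first sort key cannot drift apart if the rules change later.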

Yes.

> In that function, you can also do other useful stuff, such as
> eliminating stop words or replacing all numbers with a zero, so that
> everything that starts with a number will be in the same group.
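
As a rough illustration of those extras, the sort-key function could be extended along these lines. The stop-word list (THE, AN, A), the namespace URI, the template name main, and the sample strings are all assumptions, not taken from the thread.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <xsl:function name="my:sortkey" as="xs:string">
    <xsl:param name="s" as="xs:string?"/>
    <!-- Strip accents and upper-case, as in the quoted expression. -->
    <xsl:variable name="plain" as="xs:string"
        select="upper-case(
                  replace(normalize-unicode($s, 'NFKD'),
                          '[&#x300;-&#x36f;]', ''))"/>
    <!-- Drop one leading stop word (illustrative list only). -->
    <xsl:variable name="no-stop" as="xs:string"
        select="replace($plain, '^(THE|AN|A)\s+', '')"/>
    <!-- Collapse every run of digits to a single zero, so that all
         headings starting with a number share the grouping key '0'. -->
    <xsl:sequence select="replace($no-stop, '[0-9]+', '0')"/>
  </xsl:function>

  <!-- Hypothetical entry point: run with "main" as the initial template. -->
  <xsl:template name="main">
    <xsl:value-of
        select="my:sortkey('2001: Odyssée'), my:sortkey('The Élan')"
        separator="&#10;"/>
    <!-- expected text output, one key per line:
         0: ODYSSEE
         ELAN -->
  </xsl:template>

</xsl:stylesheet>
```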

Fortunately these are very uncomplicated headings, so no stop words, but
the point about numbers is very well taken.

Thanks!
Graydon
