On 5.11.2011 19:18, Mark wrote:
> However, for the archives, I ended up with this monstrosity that
> does the job (ugly as it is):
It's usually better to extract such complicated code into a user-defined
function for easier reuse and readability.
> <xsl:for-each-group select="Word" group-by="
>     if (lower-case(substring(@word,1,1)) eq 'č' or
>         lower-case(substring(@word,1,1)) eq 'ř' or
>         lower-case(substring(@word,1,1)) eq 'š' or
>         lower-case(substring(@word,1,1)) eq 'ž')
>     then lower-case(substring(@word,1,1))
>     else if (lower-case(substring(@word,1,2)) eq 'ch') then 'ch'
>     else lower-case(substring(cps:remove-diacritics(@word), 1, 1))">
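
As a rough sketch of that extraction, the grouping expression could be moved into a stylesheet function along these lines. The function name i:first-letter-key is made up for illustration, the i:, xs: and cps: prefixes are assumed to be declared in the stylesheet, and every Word is assumed to carry a @word attribute; cps:remove-diacritics is the function already used in the quoted code.

```xslt
<!-- Hypothetical helper: the name i:first-letter-key is an assumption.
     Returns the grouping key for a word: the lower-cased first letter,
     keeping č, ř, š, ž distinct and treating "ch" as a single letter. -->
<xsl:function name="i:first-letter-key" as="xs:string">
  <xsl:param name="word" as="xs:string"/>
  <xsl:variable name="first" select="lower-case(substring($word, 1, 1))"/>
  <xsl:sequence select="
      if ($first = ('č', 'ř', 'š', 'ž')) then $first
      else if (lower-case(substring($word, 1, 2)) eq 'ch') then 'ch'
      else lower-case(substring(cps:remove-diacritics($word), 1, 1))"/>
</xsl:function>

<!-- The grouping instruction then shrinks to a single call. -->
<xsl:for-each-group select="Word" group-by="i:first-letter-key(@word)">
  <!-- process current-group() here -->
</xsl:for-each-group>
```

The logic is unchanged from the quoted expression; only the repeated lower-case(substring(...)) call is computed once and the rule gets a name that can be reused elsewhere.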
Personally, I prefer to store such exception rules (for example, that
"č" shouldn't be treated as "c") in a kind of lookup table. Something
like:
<l:letters>
<l i="0">Symboly</l>
<l i="1">A</l>
<l i="1">a</l>
<l i="1">Á</l>
<l i="1">á</l>
<l i="2">B</l>
<l i="2">b</l>
<l i="3">C</l>
<l i="3">c</l>
<l i="4">Č</l>
<l i="4">č</l>
<l i="5">D</l>
<l i="5">d</l>
<l i="5">Ď</l>
<l i="5">ď</l>
<l i="7">E</l>
<l i="7">e</l>
<l i="7">É</l>
<l i="7">é</l>
<l i="7">Ě</l>
<l i="7">ě</l>
<l i="7">Ë</l>
<l i="7">ë</l>
<l i="8">F</l>
<l i="8">f</l>
<l i="9">G</l>
<l i="9">g</l>
<l i="10">H</l>
<l i="10">h</l>
<l i="11">Ch</l>
<l i="11">ch</l>
<l i="11">cH</l>
<l i="11">CH</l>
<l i="12">I</l>
<l i="12">i</l>
<l i="12">Í</l>
<l i="12">í</l>
<l i="13">J</l>
<l i="13">j</l>
<l i="14">K</l>
<l i="14">k</l>
<l i="15">L</l>
<l i="15">l</l>
<l i="16">M</l>
<l i="16">m</l>
<l i="17">N</l>
<l i="17">n</l>
<l i="17">Ň</l>
<l i="17">ň</l>
<l i="19">O</l>
<l i="19">o</l>
<l i="19">Ó</l>
<l i="19">ó</l>
<l i="19">Ö</l>
<l i="19">ö</l>
<l i="20">P</l>
<l i="20">p</l>
<l i="21">Q</l>
<l i="21">q</l>
<l i="22">R</l>
<l i="22">r</l>
<l i="23">Ř</l>
<l i="23">ř</l>
<l i="24">S</l>
<l i="24">s</l>
<l i="25">Š</l>
<l i="25">š</l>
<l i="26">T</l>
<l i="26">t</l>
<l i="26">Ť</l>
<l i="26">ť</l>
<l i="28">U</l>
<l i="28">u</l>
<l i="28">Ú</l>
<l i="28">ú</l>
<l i="28">Ů</l>
<l i="28">ů</l>
<l i="28">Ü</l>
<l i="28">ü</l>
<l i="29">V</l>
<l i="29">v</l>
<l i="30">W</l>
<l i="30">w</l>
<l i="31">X</l>
<l i="31">x</l>
<l i="32">Y</l>
<l i="32">y</l>
<l i="32">Ý</l>
<l i="32">ý</l>
<l i="33">Z</l>
<l i="33">z</l>
<l i="34">Ž</l>
<l i="34">ž</l>
</l:letters>
<xsl:function name="i:group-index" as="xs:integer">
  <xsl:param name="term"/>
  <!-- Try the first two characters first, so that "ch" wins over "c" -->
  <xsl:variable name="long-letter-index"
      select="document('')/*/l:letters/l[. = substring($term,1,2)]/@i"
      as="xs:integer?"/>
  <xsl:variable name="short-letter-index"
      select="document('')/*/l:letters/l[. = substring($term,1,1)]/@i"
      as="xs:integer?"/>
  <xsl:choose>
    <xsl:when test="exists($long-letter-index)">
      <xsl:sequence select="$long-letter-index"/>
    </xsl:when>
    <xsl:when test="exists($short-letter-index)">
      <xsl:sequence select="$short-letter-index"/>
    </xsl:when>
    <xsl:otherwise>
      <!-- Unknown first letter: report it and file the term under group 0 -->
      <xsl:message>No match for: <xsl:value-of select="$term"/></xsl:message>
      <xsl:sequence select="0"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>
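
For the table and the function above to work, both have to sit at the top level of the stylesheet, and the table must be in a namespace of its own, since top-level data elements may not be in the null namespace. A minimal skeleton might look like this; the two example.com namespace URIs are placeholders I made up, so use whatever URIs you like.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:l="http://example.com/ns/letters"
    xmlns:i="http://example.com/ns/index"
    exclude-result-prefixes="xs l i">

  <!-- Placeholder namespace URIs: l: holds the lookup table,
       i: holds the stylesheet functions. -->

  <!-- <l:letters> ... </l:letters> goes here as a top-level element -->

  <!-- <xsl:function name="i:group-index"> ... </xsl:function> goes here -->

  <!-- templates follow -->

</xsl:stylesheet>
```

Note that the stylesheet should not declare a default namespace, because the function looks up the l children of l:letters as elements in no namespace.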
For your task you can then simply use:
<xsl:for-each-group select="Word" group-by="i:group-index(@word)">
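No collation-aware xsl:sort is needed, because i:group-index() returns a
number corresponding to the proper collation order, including "ch". Keep
in mind that groups are processed in order of first appearance, so if the
input is not already in alphabetical order, add a plain xsl:sort on
current-grouping-key().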
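
Put together, the grouping could be sketched roughly as follows. The h2 and p output elements are only assumptions about what you want to generate, and the term is assumed to live in the @word attribute as in your original expression. The sort compares the integer keys, so no collation is involved.

```xslt
<xsl:for-each-group select="Word" group-by="i:group-index(@word)">
  <!-- The key is an xs:integer, so this is an ordinary numeric sort;
       it only matters when the input is not already in order. -->
  <xsl:sort select="current-grouping-key()"/>
  <xsl:variable name="key" select="current-grouping-key()"/>
  <!-- Use the first table entry with this index as the group heading
       (for example "A" for 1, "Ch" for 11, "Symboly" for 0). -->
  <h2>
    <xsl:value-of select="document('')/*/l:letters/l[@i = $key][1]"/>
  </h2>
  <xsl:for-each select="current-group()">
    <p><xsl:value-of select="@word"/></p>
  </xsl:for-each>
</xsl:for-each-group>
```

Words inside each group are left in document order here; sorting them properly would still need a Czech collation.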
Jirka
--
------------------------------------------------------------------
Jirka Kosek e-mail: jirka(_at_)kosek(_dot_)cz http://xmlguru.cz
------------------------------------------------------------------
Professional XML consulting and training services
DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
------------------------------------------------------------------