Yes, there is a better way. You can use normalize-unicode() to turn the
string into decomposed normal form, in which all the diacritics become
separate characters, and then you can use replace() to get rid of the
diacritics:
  replace(normalize-unicode($in, 'NFD'), '\p{IsCombiningDiacriticalMarks}', '')
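For illustration, here is a minimal sketch of how that expression might be wrapped in a reusable XSLT 2.0 function. The function name f:strip-diacritics, the urn:example:local-functions namespace, and the hard-coded test string are assumptions made for this example only; they are not part of any library. The stylesheet should be saved as UTF-8 and can be run against any source document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local-functions"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Hypothetical helper (name and namespace are illustrative only):
       decompose to NFD so each accent becomes a separate combining
       character, then delete every character in the Combining
       Diacritical Marks block (U+0300 to U+036F). -->
  <xsl:function name="f:strip-diacritics" as="xs:string">
    <xsl:param name="in" as="xs:string?"/>
    <xsl:sequence
        select="replace(normalize-unicode($in, 'NFD'),
                        '\p{IsCombiningDiacriticalMarks}', '')"/>
  </xsl:function>

  <!-- Demonstration: the input document is ignored; the literal below
       is only a test value. -->
  <xsl:template match="/">
    <xsl:value-of select="f:strip-diacritics('Šafařík')"/>
  </xsl:template>

</xsl:stylesheet>
```

With the test value above the output should be Safarik. If you only need to know whether a string contains a diacritic, the same idea works with matches() in place of replace(): matches(normalize-unicode($in, 'NFD'), '\p{IsCombiningDiacriticalMarks}').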
Regards,
Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay
-----Original Message-----
From: Mark Wilson [mailto:mark(_at_)knihtisk(_dot_)org]
Sent: 16 November 2009 05:35
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Does XSLT contain an easy means of determining
if a string contains a diacritic?
Hi,
I need to render Czech language strings containing diacritics
into strings with the diacritics removed. The Czech alphabet
has 16 lower case diacritics and a somewhat smaller set of
upper case diacritics. The strings are expressed in UTF-8. I
do not need to retain case, but I must locate and replace all
diacritics.
My only plan so far is to construct a gigantic <xsl:choose> to
find strings containing at least one diacritic. Then I would
need a gigantic <xsl:if> to change each diacritic into its
unaccented counterpart.
I wonder if there is a simpler method for turning, for
example, a word like "Šafařík" [Š, ř, í] into Safarik? Any
ideas or suggestions? Thanks, Mark
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--