At 2009-05-26 14:07 +0100, tom tom wrote:
I have a string containing a mix of Chinese and
Latin characters, e.g. ªü®Ú§Ê¥Ò«¬H1N1¬y·PÚÌ.
I wish to return a nodeset containing the following kind of structure:
ªü®Ú§Ê¥Ò«¬
H1N1
¬y·PÚÌ
Here H1N1 falls into the BasicLatin Unicode
character block and the other two strings
can be categorized as CJKUnifiedIdeographs.
Since
http://en.wikipedia.org/wiki/Basic_Latin_unicode_block
lists the Basic Latin characters as running up to the tilde, this can
be done with a character range.
Can anyone suggest the cleanest way to do this using XSLT 2?
I prefer Michael's and David's suggestion to use
Unicode classes, but below is what I threw together quickly.
. . . . . . . . . . Ken
T:\ftemp>type tom.xml
<doc>?????H1N1??</doc>
T:\ftemp>call xslt2 tom.xml tom.xsl
<?xml version="1.0" encoding="UTF-8"?><doc><other>?????</other><latin>H1N1</latin><other>??</other></doc>
T:\ftemp>type tom.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

<xsl:template match="text()" priority="1">
  <xsl:analyze-string select="." regex="[!-~]+">
    <xsl:matching-substring>
      <latin><xsl:value-of select="."/></latin>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <other><xsl:value-of select="."/></other>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

<xsl:template match="@*|node()"><!--identity for all other nodes-->
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
T:\ftemp>rem Done!
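
For comparison, here is a rough sketch of the Unicode-class approach mentioned above. It is untested and is not the code Michael or David posted. It assumes the regex block escapes \p{IsCJKUnifiedIdeographs} and \p{IsBasicLatin} are what was meant, and the element name "cjk" is made up for illustration. Because the regex attribute is an attribute value template, the curly braces have to be doubled.

```xslt
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

<xsl:template match="text()" priority="1">
  <!--first pass: pull out runs from the CJK Unified Ideographs block;
      braces are doubled because regex= is an attribute value template-->
  <xsl:analyze-string select="." regex="\p{{IsCJKUnifiedIdeographs}}+">
    <xsl:matching-substring>
      <cjk><xsl:value-of select="."/></cjk><!--"cjk" is a hypothetical name-->
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <!--second pass: split what is left into Basic Latin and the rest-->
      <xsl:analyze-string select="." regex="\p{{IsBasicLatin}}+">
        <xsl:matching-substring>
          <latin><xsl:value-of select="."/></latin>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <other><xsl:value-of select="."/></other>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

<xsl:template match="@*|node()"><!--identity for all other nodes-->
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

One difference to watch for: the BasicLatin block covers everything from U+0000 to U+007F, so spaces and line breaks end up inside <latin>, whereas the [!-~] range starts at the exclamation mark and leaves them in <other>.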
--
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/
G. Ken Holman mailto:gkholman(_at_)CraneSoftwrights(_dot_)com