Here's another 1.0 stylesheet (Dimitre gave you one):
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="xml" indent="yes" />
  <xsl:template match="/">
    <words>
      <xsl:call-template name="tokenize">
        <xsl:with-param name="str" select="text" />
      </xsl:call-template>
    </words>
  </xsl:template>
  <!-- Recursively split $str on single spaces, one <word> per token -->
  <xsl:template name="tokenize">
    <xsl:param name="str" />
    <xsl:choose>
      <xsl:when test="contains($str, ' ')">
        <word>
          <xsl:value-of select="substring-before($str, ' ')" />
        </word>
        <xsl:call-template name="tokenize">
          <xsl:with-param name="str" select="substring-after($str, ' ')" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <word>
          <xsl:value-of select="$str" />
        </word>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
This stylesheet implements the tokenizing algorithm from scratch, with
the limitation that the delimiter must be a single ' ' character (so
consecutive spaces produce empty <word> elements). If you can use
XSLT 2.0, you might prefer Ken's solution, as XPath 2.0 has a native
tokenize() function. It accepts a regular expression such as '\s+' as
the delimiter, so any run of whitespace separates two tokens.
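For illustration, a minimal XSLT 2.0 sketch using tokenize() might look
like the following. This is not necessarily what Ken posted. It assumes
the <text> document element from the original question and an XSLT 2.0
processor, and the normalize-space() call is my own addition so that
leading or trailing whitespace does not yield empty tokens.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="xml" indent="yes" />
  <xsl:template match="/">
    <words>
      <!-- Assumption: the input has a single <text> document element -->
      <xsl:for-each select="tokenize(normalize-space(text), '\s+')">
        <word>
          <!-- Inside for-each over strings, "." is the current token -->
          <xsl:value-of select="." />
        </word>
      </xsl:for-each>
    </words>
  </xsl:template>
</xsl:stylesheet>
Because the delimiter here is a regular expression, tabs and newlines
between words are handled the same way as spaces.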
On Thu, Oct 29, 2009 at 2:14 AM, Larry Hayashi <lhtrees(_at_)gmail(_dot_)com>
wrote:
Is there a function in XSLT 1.1 that will extract words from a string?
I'd like to be able to take <text>Jill ran up the hill.</text> and get
the following:
<words>
<word>Jill</word>
<word>ran</word>
<word>up</word>
<word>the</word>
<word>hill.</word>
</words>
Thanks,
Larry
--
Regards,
Mukul Gandhi