xsl-list

Re: [xsl] Collect word count with xslt2.0 on saxon 8

2006-05-18 00:39:07
Hi Karen,

> It seems that the template rule would remove the node from the source

In XSLT the source is read-only; a transformation does not change the source document in any way.

Have a look at the built-in template rules
http://www.w3.org/TR/xslt#built-in-rule

An empty stylesheet uses only the built-in rules: it walks the source document and outputs all the text nodes.
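
To make that concrete, here is a sketch of the built-in rules written out as ordinary templates, following the section of the specification linked above. With no other rules present, a stylesheet like this should behave the same as an empty one; the comments are mine, not from the specification.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <!-- Root and elements: keep walking down the tree. -->
   <xsl:template match="*|/">
     <xsl:apply-templates/>
   </xsl:template>
   <!-- Text and attribute nodes: copy their string value to the output. -->
   <xsl:template match="text()|@*">
     <xsl:value-of select="."/>
   </xsl:template>
   <!-- Comments and processing instructions: produce nothing. -->
   <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

The same rules exist implicitly for every named mode, with apply-templates staying in that mode, which is why a mode with no templates of its own still outputs all the text.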

The sample I posted uses two modes so that we can walk the source tree with two sets of rules. The first set, in the default mode, identifies the elements whose text content we want to count and, from each of them, starts a walk over its children using the second set of rules, which are in the getText mode:
<xsl:apply-templates mode="getText" select="node()"/>
If we do not define any template for this mode, the built-in template rules will output all the text nodes. Because we want to exclude elements with the specific mark, we just add a rule that matches them and does nothing. If you also want comments to be excluded, add a rule for that:

<xsl:template match="comment()"  mode="getText"/>
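
If the comments to drop are elements rather than XML comment nodes, the same idea applies with an element match. The sketch below assumes they are elements named draft-comment, as in the quoted question below. The built-in rule for comment() nodes already produces nothing, so only the element rule changes the result. It shows the rule in isolation and is not a diagnosis of the StackOverflowError reported in the quoted message.

```xml
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text"/>

   <!-- Walk the whole document using the getText rules. -->
   <xsl:template match="/">
     <xsl:apply-templates mode="getText" select="node()"/>
   </xsl:template>

   <!-- Assumption: comments are elements named draft-comment.
        An empty rule outputs nothing and does not descend into
        the element, so its text is skipped. -->
   <xsl:template match="draft-comment" mode="getText"/>

   <!-- Every other node falls through to the built-in rules for
        this mode, which copy text nodes to the output. -->
</xsl:stylesheet>
```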

Placing the call to apply-templates with the getText mode inside a variable redirects its output to that variable; we then use the variable to generate the required output in the template rules of the default mode.

Best regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com


Karen McAdams wrote:

Hi George-

Thanks for the code sample. I would like to better understand how this 
apply-templates works-
   <xsl:template match="*[contains(@class, 'topic/topic')]"
     mode="getText"/>
       <xsl:apply-templates mode="getText" select="node()"/>

It seems that the template rule would remove the node from the source at the 
processing context yet with the apply templates it actually retrieves it- could 
you elaborate on this approach - I also need to remove from the node comment 
elements  before I count the text contents of the node -

If I add a template <xsl:template match="draft-comment" mode="getText"/>

I get  -
BUILD FAILED
java.lang.StackOverflowError

( I am building with Ant but this is an xsl related error . )

Karen

------------------------------

Date: Tue, 16 May 2006 10:04:22 +0300
To:  xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
From: George Cristian Bina <george(_at_)oxygenxml(_dot_)com>
Subject: Re: [xsl] Collect word count with xslt2.0 on saxon 8
Message-ID: <44697976(_dot_)1000108(_at_)oxygenxml(_dot_)com>

Hello Karen,

You can get the word count more easily than that. First, collect in a variable the text that belongs to an element marked topic/topic but not to other elements inside it with the same mark, and then just count the words in that variable. To collect the text once we match a topic/topic element, we call apply-templates in a new mode in which the rule for elements with topic/topic does nothing, so their text content is excluded.

The following stylesheet shows that:

<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output indent="yes"/>
   <xsl:template match="/">
     <counts>
       <xsl:apply-templates/>
     </counts>
   </xsl:template>
   <xsl:template match="text()"/>
   <xsl:template match="*[contains(@class, 'topic/topic')]">
     <xsl:variable name="text">
       <xsl:apply-templates mode="getText" select="node()"/>
     </xsl:variable>
     <record>
       <text>
         <xsl:value-of select="$text"/>
       </text>
       <count>
         <xsl:value-of
           select="count(tokenize(lower-case($text),'(\s|[,.!:;]|[n][b][s][p][;])+')[string(.)])"
         />
       </count>
     </record>
     <xsl:apply-templates/>
   </xsl:template>

   <xsl:template match="*[contains(@class, 'topic/topic')]"
     mode="getText"/>
</xsl:stylesheet>
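
To try the stylesheet, a small input such as the following may help. It is a made-up sample: the element names and class values are only modelled on the topic/topic mark discussed here and are not taken from Karen's documents.

```xml
<topic class="- topic/topic ">
  <title class="- topic/title ">Outer topic</title>
  <body class="- topic/body ">
    <p class="- topic/p ">Two words.</p>
  </body>
  <!-- Nested topic: its text is excluded from the outer count
       and counted in its own record instead. -->
  <topic class="- topic/topic ">
    <title class="- topic/title ">Inner</title>
  </topic>
</topic>
```

Tracing the rules by hand, this should produce two record elements: one for the outer topic with a count of 4 (outer, topic, two, words) and one for the nested topic with a count of 1.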


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--

