xsl-list

Re: [xsl] Collect word count with xslt2.0 on saxon 8

2006-05-19 01:16:43
Dear Karen,

I cannot see any reason why adding that empty template would cause the stylesheet to loop, at least not the stylesheet I posted. In any case, such loops are easy to detect with a debugger. Since you use Saxon 8, you can try either oXygen or Stylus Studio. oXygen has a cycle detection option, and by default it notifies you when the stack depth exceeds 300. Alternatively, you can post a cut-down sample (XML and XSL) that shows the problem.

Best Regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com


Karen McAdams wrote:
Hi George,

Thanks for your help. Naturally, I tried the draft-comment template first,
before emailing; however, I got an error when I tried that:
"Fatal Error! Too many nested apply-templates calls. The stylesheet is
probably looping." All I did was add the following to your stylesheet:
<xsl:template match="draft-comment"  mode="getText"/>

Any ideas why this is happening?

Thanks.

-----Original Message-----
From: George Cristian Bina [mailto:george(_at_)oxygenxml(_dot_)com]
Sent: Thursday, May 18, 2006 12:41 AM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] Collect word count with xslt2.0 on saxon 8

Hi Karen,

 > It seems that the template rule would remove the node from the source

In XSLT the source is read-only; XSLT does not change the source in any
way.

Have a look at the built-in template rules:
http://www.w3.org/TR/xslt#built-in-rule

An empty stylesheet uses only the built-in rules: it walks the source document and outputs all the text nodes.
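
For reference, the built-in rules mentioned above behave roughly as if the templates below were declared explicitly. This is only a sketch for illustration, written with the XSLT 2.0 mode="#all" and mode="#current" shorthand to express that the same rules exist in every mode; the real built-in rules have lower precedence than any rule you write yourself.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root and elements: output nothing, keep walking the children
       in whatever mode is currently active. -->
  <xsl:template match="*|/" mode="#all">
    <xsl:apply-templates mode="#current"/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the string value to the output. -->
  <xsl:template match="text()|@*" mode="#all">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: output nothing. -->
  <xsl:template match="processing-instruction()|comment()" mode="#all"/>

</xsl:stylesheet>
```

Because the element rule only descends into child nodes, attribute values are not output unless a template selects attributes explicitly.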

The sample I posted uses two modes so that we can walk the source tree with two sets of rules. The first set, in the default mode, identifies the elements whose text content we want to count and, for each such element, starts a walk over its children using the second set of rules, which are in the getText mode:
<xsl:apply-templates mode="getText" select="node()"/>
If we do not define any templates for this mode, the built-in template rules will collect all the text nodes. Since we want to exclude elements with the specific mark, we just add a rule that matches them and does nothing. If you also want comments to be excluded, you should add a rule for that:

<xsl:template match="comment()"  mode="getText"/>

Placing the call to apply-templates with the getText mode inside a variable redirects the output to that variable; we then use the variable to generate the required output in the template rules of the default mode.
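
The two-mode approach described above can be reduced to a minimal sketch. The element names section, note and section-text are hypothetical and chosen only for illustration; they do not come from the stylesheet in this thread.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Default mode: for each (hypothetical) section element, walk its
       children in the getText mode and capture the result in a variable. -->
  <xsl:template match="section">
    <xsl:variable name="text">
      <xsl:apply-templates mode="getText" select="node()"/>
    </xsl:variable>
    <section-text>
      <xsl:value-of select="$text"/>
    </section-text>
  </xsl:template>

  <!-- getText mode: an empty rule outputs nothing for the matched
       element and never descends into it, so its text is skipped.
       The source document itself is not modified. -->
  <xsl:template match="note" mode="getText"/>

  <!-- All other nodes in getText mode fall through to the built-in
       rules, which copy text nodes into the variable. -->

</xsl:stylesheet>
```

Given an input such as `<doc><section>Keep this <note>skip this</note> too</section></doc>`, the captured text should contain "Keep this" and "too" but not "skip this".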

Best regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com


Karen McAdams wrote:
Hi George,

Thanks for the code sample. I would like to better understand how this
apply-templates works:
   <xsl:template match="*[contains(@class, 'topic/topic')]"
     mode="getText"/>
       <xsl:apply-templates mode="getText" select="node()"/>

It seems that the template rule would remove the node from the source
at the processing context, yet with the apply-templates it actually
retrieves it. Could you elaborate on this approach? I also need to
remove the comment elements from the node before I count the text contents
of the node.
If I add a template <xsl:template match="draft-comment" mode="getText"/>

I get:
BUILD FAILED
java.lang.StackOverflowError

(I am building with Ant, but this is an XSL-related error.)

Karen

------------------------------

Date: Tue, 16 May 2006 10:04:22 +0300
To:  xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
From: George Cristian Bina <george(_at_)oxygenxml(_dot_)com>
Subject: Re: [xsl] Collect word count with xslt2.0 on saxon 8
Message-ID: <44697976(_dot_)1000108(_at_)oxygenxml(_dot_)com>

Hello Karen,

You can get the word count more easily than that. First, collect in a
variable the text that belongs to an element marked topic/topic but not
to other elements inside it with the same mark, and then just count the
words in that variable.
To get the text once we match a topic/topic element, we use a new mode
for apply-templates in which we do nothing for elements marked
topic/topic, thus excluding their text content.

The following stylesheet shows that:

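<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output indent="yes"/>
   <xsl:template match="/">
     <counts>
       <xsl:apply-templates/>
     </counts>
   </xsl:template>
   <!-- Default mode: suppress text outside the records. -->
   <xsl:template match="text()"/>
   <!-- Default mode: one record per element marked topic/topic. -->
   <xsl:template match="*[contains(@class, 'topic/topic')]">
     <!-- Collect the element's own text by walking its children in getText mode. -->
     <xsl:variable name="text">
       <xsl:apply-templates mode="getText" select="node()"/>
     </xsl:variable>
     <record>
       <text>
         <xsl:value-of select="$text"/>
       </text>
       <count>
         <!-- Split on whitespace, punctuation and a literal "nbsp;", then drop empty tokens. -->
         <xsl:value-of
           select="count(tokenize(lower-case($text), '(\s|[,.!:;]|[n][b][s][p][;])+')[string(.)])"
         />
       </count>
     </record>
     <!-- Continue in the default mode so that nested topics get their own records. -->
     <xsl:apply-templates/>
   </xsl:template>

   <!-- getText mode: skip nested topic/topic elements and their text. -->
   <xsl:template match="*[contains(@class, 'topic/topic')]"
     mode="getText"/>
</xsl:stylesheet>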
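
To see the counting expression on its own, here is a small sketch that applies the same tokenize() call to a fixed string instead of the collected text. The sample sentence and the result element name are assumptions made for illustration; the stylesheet can be run against any XML input, since it only matches the document node.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Hypothetical sample text standing in for the $text variable. -->
    <xsl:variable name="text" select="'Hello, world! One two.'"/>

    <!-- tokenize() splits on runs of whitespace, punctuation or a
         literal "nbsp;". A separator at the start or end of the string
         produces a zero-length token, which [string(.)] filters out. -->
    <xsl:variable name="words"
      select="tokenize(lower-case($text),
                       '(\s|[,.!:;]|[n][b][s][p][;])+')[string(.)]"/>

    <!-- For the sample text the tokens are hello, world, one, two. -->
    <count>
      <xsl:value-of select="count($words)"/>
    </count>
  </xsl:template>

</xsl:stylesheet>
```

For the sample sentence this should report a count of 4; without the [string(.)] filter the trailing full stop would add an empty fifth token.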


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--