Well, scratch that "wouldn't want to" part after all. I guess I'm a sucker for
interesting problems with unconventional solutions. :)
Turns out MSXML doesn't optimize the classic str:replace pattern for tail
recursion after all, and it bombs out with a stack overflow after ~550
replacements (1 iteration per replacement), in my experiment. The following
template worked for 400000+ replacements by the time I stopped the test:
<!-- Replaces all instances of $search in $string with $replace. -->
<xsl:template name="str:replace">
<xsl:param name="string" select="''" />
<xsl:param name="search" select="''" />
<xsl:param name="replace" select="''" />
<xsl:variable name="half" select="floor(string-length($string) div 2)"
/>
<xsl:variable name="midpoint1" select="$half - (string-length($search)
- 1)" />
<xsl:variable name="midpoint2" select="$half + (string-length($search)
- 1)" />
<xsl:variable name="mid-string" select="substring($string, $midpoint1 +
1, $midpoint2 - $midpoint1)" />
<xsl:choose>
<xsl:when test="not(string($string) and string($search) and
contains($string, $search))">
<xsl:value-of select="$string" />
</xsl:when>
<xsl:when test="string-length($string) <
string-length($search) * 2">
<xsl:value-of select="concat(substring-before($string,
$search), $replace, substring-after($string, $search))" />
</xsl:when>
<xsl:when test="$mid-string and contains($mid-string, $search)">
<xsl:call-template name="str:replace">
<xsl:with-param name="string"
select="substring($string, 1, $midpoint1)" />
<xsl:with-param name="search" select="$search"
/>
<xsl:with-param name="replace"
select="$replace" />
</xsl:call-template>
<xsl:call-template name="str:replace">
<xsl:with-param name="string"
select="$mid-string" />
<xsl:with-param name="search" select="$search"
/>
<xsl:with-param name="replace"
select="$replace" />
</xsl:call-template>
<xsl:call-template name="str:replace">
<xsl:with-param name="string"
select="substring($string, $midpoint2 + 1)" />
<xsl:with-param name="search" select="$search"
/>
<xsl:with-param name="replace"
select="$replace" />
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="str:replace">
<xsl:with-param name="string"
select="substring($string, 1, $half)" />
<xsl:with-param name="search" select="$search"
/>
<xsl:with-param name="replace"
select="$replace" />
</xsl:call-template>
<xsl:call-template name="str:replace">
<xsl:with-param name="string"
select="substring($string, $half + 1)" />
<xsl:with-param name="search" select="$search"
/>
<xsl:with-param name="replace"
select="$replace" />
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
Try it out, I didn't get a chance to try it in larger and complex scenarios.
I'd welcome any bug reports, tweaks, comments, suggestions, etc. :)
~ Scott
-----Original Message-----
From: Scott Trenda [mailto:Scott(_dot_)Trenda(_at_)oati(_dot_)net]
Sent: Friday, March 02, 2012 7:36 AM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: RE: [xsl] What is the Maximum number of recurssions
I encountered this recently, when I had to retool a str:rtrim template to use
bisection because the previous step-through-the-whitespace method was bombing
out on large data sets.
Out of curiosity, has anyone thought about a bisection approach for str:replace
(XSLT 1.0)? I was able to use bisection for my str:rtrim template, but only
because the character I was looking for (' ') was only one character long. Is
it possible to use bisection when searching for a dynamic-length string in the
subject string? The basic problem theoretically is that you don't want to
bisect the subject string at a point that's in the middle of an occurrence of
the search string.
I suppose it's possible if you tweak the basic bisection algorithm - one idea I
had was to check for contains(substring($subject, string-length($subject) / 2 -
string-length($search), string-length($subject) / 2 - string-length($search)))
before splitting the string at string-length($subject) / 2; my assumption is
that you'll never split an occurrence of the search string if you check a
sufficiently wide string surrounding the search area. I also suppose that you'd
be able to optimize the conditions for where you'd stop searching, based on the
length of the search string.
Basically, it sounds like there are some potential improvements to be made, but
I wouldn't want to do them myself if someone else has already done them. :) I'm
also guessing that the fundamental pattern in str:replace is one that would be
optimized for tail recursion pretty quickly, so I don't know if there's even
any potential gain to be had here.
Any thoughts?
~ Scott
-----Original Message-----
From: Dimitre Novatchev [mailto:dnovatchev(_at_)gmail(_dot_)com]
Sent: Thursday, March 01, 2012 3:23 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] What is the Maximum number of recurssions
On Thu, Mar 1, 2012 at 12:36 PM, Karl Stubsjoen <kstubs(_at_)gmail(_dot_)com>
wrote:
I'm just curious, what is the maximum number of times a template can
recurse?
I guess you men the number of nested recursive calls (nestedness).
There is no fixed maximum and in some cases a correctly written
template/function can have nestedness measured in the millions.
Such are the cases with tail-recursive functions/templates -- and this requires
that the XSLT processor can recognize and optimize tail recursion.
For example, this transformation run with Saxon 6.5.4 successfully prints all
numbers from 1 to one million:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:call-template name="printToN"/>
</xsl:template>
<xsl:template name="printToN">
<xsl:param name="pUpTo" select="1000000"/>
<xsl:param name="pDoneWith" select="0"/>
<xsl:if test="$pUpTo > $pDoneWith">
<xsl:value-of select="$pDoneWith+1"/>
<xsl:text>
</xsl:text>
<xsl:call-template name="printToN">
<xsl:with-param name="pUpTo" select="$pUpTo"/>
<xsl:with-param name="pDoneWith" select="$pDoneWith+1"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
However, other XSLT processors (AltovaXML) produce this error: "XSLT
instruction stack overflow"
A DVC (Divide and Conquer) style recursion is also a practical solution and
processing a sequence of N items requires a maximum nestedness of only log(N).
This recursive style is remarkable in that it doesn't require (as required in
the case of tail recursion) any special abilities from the XSLT processor and
works correctly with any XSLT processor. Thus the maximum callstack depth when
processing a sequence of one million items with DVC-style recursion, is only 19.
I would imagine it would be different from engine to engine.
Is there any other engine intelligence at play here, for example the
engine has decided that you've written and endless looping recursion,
instead of one that knows you'll eventually reach the end of your
source xml?
This is generally impossible to do, because whether or not an algorithm
finishes often depends on the specific data that is fed to it -- typical
examples are many numerical methods that converge only if the data is limited
in a specific interval or by any other condition.
Cheers,
Dimitre.
Thanks,
Karl..
--
Karl Stubsjoen
MeetScoresOnline.com
(602) 845-0006
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the biggest mistake
of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what you're
doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--