RE: Parser implemented in XSL -- stack overflow


I am experiencing stack overflow with an XSL program and 
don't understand why. Following is a description of the 
program and some sample code.


The call:

    <xsl:apply-templates select="following-sibling::*[1]" mode="sn5">
      <xsl:with-param name="parents" select="concat($ID, ',',
$adjustedParents)" />
    </xsl:apply-templates>

will generate a new stack frame for each sibling that is processed.

A system that optimizes tail recursion can avoid this, but not many
systems do.

Saxon optimizes tail calls only for a self-recursive call-template, not
for apply-templates. There's no inherent reason for this, it's just the
way it is.

Michael Kay
Software AG
home: Michael(_dot_)H(_dot_)Kay(_at_)ntlworld(_dot_)com
work: Michael(_dot_)Kay(_at_)softwareag(_dot_)com


I have an application that generates XSL code that is 
intended to implement a "parser" (actually a Deterministic 
Finite State Transducer). This parser accepts a "flat" 
sequence of elements and transforms it into a nested 
structure according to a grammar (XSD) that is used by my app 
to generate the XSL. Obviously there are other ways to do 
this, one being simply to implement a DFST interpreter in 
Java, VB, etc. Being an XSL fan however I decided to try it in XSL.

The two problems to be solved were to process the input "left 
to right" and to implement "states" somehow. I avoided 
generating/using named templates due to past experience with 
stack overflow. So what I did was to use template modes to 
specify state and to use a select expression on my 
apply-templates calls so that only the "next" input element 
would be considered. Here's a "typical" template from my XSL program:

  <xsl:template match="node()[(_at_)Style = 'H-stage-level']" mode="sn1">
    <xsl:param name="parents" />
    <!--Push-->
    <xsl:variable name="adjustedParents" select="$parents" />
    <xsl:variable name="ID" select="generate-id()" />
    <MaturityStage Style="H-stage-level" 
ParaNumber="{(_at_)ParaNumber}" id="{$ID}" 
parent="{substring-before($adjustedParents, ',')}" />
    <xsl:apply-templates select="following-sibling::*[1]" mode="sn5">
      <xsl:with-param name="parents" select="concat($ID, ',', 
$adjustedParents)" />
    </xsl:apply-templates>
  </xsl:template>


So this template matches an input element that has the 
indicated Style attribute. In DFST terms this template 
corresponds to the action to take if you encounter an 
H-stage-level input while in state "sn1". That action is to 
emit a MaturityStage element and transition to state "sn5". 
Note the use of following-sibling::[1] to restrict the match 
to the next element in the input. The "parents" paramater is 
a csv string used like a stack to control creation of "parent 
pointers" in the emitted elements.

All of the templates are of this general form, although some 
of them do a "pop" on the csv; e.g., 

  <xsl:variable name="adjustedParents" 
select="substring-after($parents, ',')" />

Again, there are NO call-template elements in the XSL, only 
xsl:apply-templates calls of the general form above. It is 
also the case that all such apply-template calls are tail recursive.

I have experienced stack overflow with MSXML4, with the MS 
.Net XSL engine and with Saxon6.5.2 (which I believe 
implements proper tail recursion.) The stack growth appears 
to be proportional to the XML input size; i.e., for files up 
to a "certain size" there is no overflow and the 
transformation works as expected.

Can anyone explain to me why this "style" of XSL should cause 
stack growth proportional to input size? Obviously it is not 
generally the case that stack growth is proportional to input 
size -- or XSL wouldn't be very interesting!

Thanks in advance,
 Bill Cohagan


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list