Re: Re: How to output open/close tags independently?

> The "good" approach took hours; the "bad" approach took minutes.

There is another option. For grins, I ran your example, with 22,000 'x' elements, through DataPower's XA35 XML accelerator using apachebench. Here were the results I got.

For the "good" (XSLT-correct) approach: total time ~129ms
For the "bad" approach: ~139ms

If this piques your interest, please contact us at info(_at_)datapower(_dot_)com or see our website at http://www.datapower.com/products/xa35.html.

Note that these numbers include parsing time, network transport time, and actual XSLT processing time.

Since I couldn't find many actual concrete stylesheets, I wrote up two that seemed to follow the techniques being discussed. I've included them here for reference. I tested them with Xalan and they seemed to exhibit the same slowdown being discussed.


The "correct" version:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               version="1.0">


 <xsl:output method="xml"/>
 <xsl:template match="/">
   <xsl:for-each select="top/x[position() mod 3 = 1]">
     <xsl:text>&#xA;</xsl:text>
     <w>
       <xsl:copy-of select=".|following-sibling::x[position() &lt; 3]"/>
     </w>
   </xsl:for-each>
 </xsl:template>

</xsl:stylesheet>
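
To show what the stylesheet above does, here is a small hypothetical input in the shape its select expressions assume (a top element with x children; the seven numbered items are made up for illustration). The grouping it should produce is shown in the trailing comment. This is a sketch of the expected grouping, not output captured from any of the runs reported in this thread. Note that the result has several top-level w elements rather than a single root.

```xml
<!-- Hypothetical input: seven x elements under top. -->
<top><x>1</x><x>2</x><x>3</x><x>4</x><x>5</x><x>6</x><x>7</x></top>

<!-- Expected grouping (XML declaration and exact whitespace omitted):
     <w><x>1</x><x>2</x><x>3</x></w>
     <w><x>4</x><x>5</x><x>6</x></w>
     <w><x>7</x></w>
-->
```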

The "d-o-e" version:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               version="1.0">


 <xsl:output method="xml"/>

 <xsl:template match="top">
   <xsl:for-each select="x">
     <!-- position() is 1-based: open a group at items 1, 4, 7, ... -->
     <xsl:if test="position() mod 3 = 1">
       <xsl:value-of disable-output-escaping="yes" select="'&lt;w&gt;'"/>
     </xsl:if>
     <xsl:copy-of select="."/>
     <!-- close the group after every third item -->
     <xsl:if test="position() mod 3 = 0">
       <xsl:value-of disable-output-escaping="yes" select="'&lt;/w&gt;'"/>
     </xsl:if>
   </xsl:for-each>
   <!-- close a trailing group that has fewer than three items -->
   <xsl:if test="count(x) mod 3 != 0">
     <xsl:value-of disable-output-escaping="yes" select="'&lt;/w&gt;'"/>
   </xsl:if>

 </xsl:template>

</xsl:stylesheet>
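
As a sketch only, here is a variant of the "correct" stylesheet that picks the next two siblings explicitly with [1] and [2] instead of filtering the whole following-sibling axis with position() < 3. The two forms select the same nodes. Whether a given processor evaluates one faster than the other is an assumption that has not been measured here, so treat this as something to try rather than a known fix.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="xml"/>

  <xsl:template match="/">
    <!-- Start a group at items 1, 4, 7, ... (position() is 1-based). -->
    <xsl:for-each select="top/x[position() mod 3 = 1]">
      <xsl:text>&#xA;</xsl:text>
      <w>
        <!-- Same node-set as .|following-sibling::x[position() &lt; 3],
             written as two explicit single-node steps. -->
        <xsl:copy-of select=". | following-sibling::x[1] | following-sibling::x[2]"/>
      </w>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```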

Niko Matsakis
DataPower Technology

Edward L. Knoll wrote:

Not that I'm picking on you specifically Wendell, but your reply was the
most blatantly representative of a class of responses of a particularly
XSL purist/snobbish nature which I find extremely objectionable.  There
was a reply from a Mitch Amiano which actually supplied a suggested
approach which "appeared" entirely reasonable, so I tried it out.  I've
included the core XSL for both approaches below: the "bad" code which
had the 'disable-output-escaping' clause and the "good" code which
generated the Page element directly.  Following are my performance
numbers on a test input file which had 22,004 Row elements and was
13,425,501 bytes large (the time output is from the Unix time(1)
command):

For the "good" (XSLT-correct) approach:
 real  2:41:32.6
 user  2:31:57.0
 sys         1.9

For the "bad" (d-o-e) approach:
 real     1:38.4
 user     1:31.8
 sys         1.0

The "good" approach took hours; the "bad" approach took minutes.  For
those that will care, the test environment was a Sun Solaris platform
using the interim release of the Xalan C++ 1.4 XSLT processor.

I'm just curious, do those of you with this hard-line "purist" attitude
actually use XSL to do real work or are you mostly academics and tool
developers/vendors?  I understand staying true to a paradigm up to a
point, but sooner or later "the rubber has to hit the road".
Regards,
Ed Knoll

p.s. This is not all of the XSL, just the differences.


---- "Good" XSL ------------------------------

<xsl:variable name='PageFirstRows'
       select='/gnsl:Results/gnsl:Table/gnsl:Row[
                   (position() mod $RowsPerPage) = 1]' />

<xsl:template match="gnsl:Table">
  <xsl:copy>
     <xsl:copy-of select="@*" />
     <xsl:apply-templates select="gnsl:Columns" />

     <xsl:choose>
        <xsl:when test="gnsl:Row">
           <xsl:apply-templates select='$PageFirstRows' />
        </xsl:when>
        <xsl:otherwise>
           <xsl:element name='Page' />
        </xsl:otherwise>
     </xsl:choose>
  </xsl:copy>
</xsl:template>

<xsl:template match="gnsl:Row">
  <xsl:element name="Page">
     <xsl:for-each
          select='.|following-sibling::gnsl:Row[$RowsPerPage > position()]'>
        <xsl:call-template name='CopyAll' />
     </xsl:for-each>
  </xsl:element>
</xsl:template>


---- "Bad" XSL ------------------------------

<xsl:template match="gnsl:Table">
  <xsl:copy>
     <xsl:copy-of select="@*" />
     <xsl:apply-templates select="gnsl:Columns" />
     <xsl:choose>
        <xsl:when test="gnsl:Row">
           <xsl:apply-templates select="gnsl:Row" />
        </xsl:when>
        <xsl:otherwise>
           <xsl:value-of select="$LineBreak" />
           <xsl:text disable-output-escaping="yes">&lt;Page/&gt;</xsl:text>
           <xsl:value-of select="$LineBreak" />
        </xsl:otherwise>
     </xsl:choose>
  </xsl:copy>
</xsl:template>

<xsl:template match="gnsl:Row">
  <xsl:if test="(position() mod $RowsPerPage) = 1">
     <xsl:if test="position() != 1">
        <xsl:value-of select="$LineBreak" />
        <xsl:text disable-output-escaping="yes">&lt;/Page&gt;</xsl:text>
        <xsl:value-of select="$LineBreak" />
     </xsl:if>
     <xsl:value-of select="$LineBreak" />
     <xsl:text disable-output-escaping="yes">&lt;Page&gt;</xsl:text>
     <xsl:value-of select="$LineBreak" />
  </xsl:if>

  <xsl:call-template name="CopyAll" />

  <xsl:if test="position() = last()">
     <xsl:value-of select="$LineBreak" />
     <xsl:text disable-output-escaping="yes">&lt;/Page&gt;</xsl:text>
     <xsl:value-of select="$LineBreak" />
  </xsl:if>
</xsl:template>

Hey Mitch,

The horribleness of disable-output-escaping is not (to my mind) really an issue of the well-formedness constraint either in the stylesheet or in the output -- that's something of a red herring (though it is a risk and a sign of the deeper problem). Rather, it's the violation of XSLT's processing model, in which the transformation of the node tree and the post-transformation serialization are clearly distinguished and kept separate by design. *Any* solution that works by writing markup to output using d-o-e creates a dependency on the serialization step. While this may be acceptable in certain circumstances (e.g. writing SGML entity references to output that are not otherwise provided for, when you *know* you're going to write a file), it's horrible at other times, if only because the designer has created this dependency unwittingly, and thus doesn't understand why the transform breaks in a conformant architecture, like Mozilla or transformation chains in Cocoon, where no file is getting serialized.

The relevance of grouping is only that the "write markup" approach is usually resorted to by newer XSLT programmers who don't know how else to do grouping, and who fall back on their Perl or Javascript experience (or just sheer ingenuity) to suppose that writing markup is the best or only solution to the problem (it is neither).

I doubt that any experienced XSLTer would have a problem with either of the solutions you offered (or Dimitre's, or Tom's), since none of them introduce the dependency on serialization that is the problem with d-o-e-based techniques for "outputting open/close tags independently". There the distinctions are much more of coding style and performance; but none of them use a technique that is prone to break the minute you move your stylesheet into a different environment.

Cheers,
Wendell
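
To make the serialization dependency concrete, here is a minimal sketch of a hypothetical second stage in a transformation chain (the stylesheet and its role are assumptions for illustration, not part of the thread). It simply counts the w elements it receives. Fed the result tree of a stylesheet that builds w as real element nodes, it sees those elements. Fed the result tree of a d-o-e stylesheet directly, with no serialize-and-reparse step in between, the "tags" are only text nodes, so it would find no w elements at all.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="text"/>

  <!-- Hypothetical downstream stage: report how many w elements arrived. -->
  <xsl:template match="/">
    <xsl:value-of select="count(//w)"/>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Only when the first stage's output is written out and parsed again do the d-o-e strings become elements, which is the dependency on serialization described in the message above.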




XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list