
Re: [xsl] XSL performance question: running count of attributes using axes and sum()

2009-04-09 15:33:34
(retrying send due to problems; apologies if this is a duplicate)

At 2009-04-09 11:22 -0700, mark bordelon wrote:
In transforming the <syl> tags below into HTML table cells to display them, I need to format each cell with a green color when the running total of the @length attributes is a multiple of four. Ideally, having the ability to keep running totals in another variable would be great, but that is not the best XSL-esque solution, so I am using axes instead.

But I think your axis approach could be improved.

I have tried solutions with count and sum, but performance is slow: 756 lines like the ones below mean thousands of syllables to check, each with its own axis computation -- the complete xform takes more than an hour!

Can anyone point me to a solution that is more performant yet still elegant/simple?

An aside: it seems that ceiling() is an XPath 1.0 function, but oddly enough not floor()

No, you are writing an XPath 2.0 expression that violates XPath 1.0 syntax ... floor() is supported in XPath 1.0:

  http://www.w3.org/TR/1999/REC-xpath-19991116#function-floor

You have written:

   node-set-expression/floor(attribute)

which is acceptable in XPath 2.0 but not in XPath 1.0 where you need to write:

   floor(node-set-expression/attribute)
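
To make the difference concrete, here is a minimal sketch (not from the original exchange) that assumes the <syl> elements of the sample data further down. It sticks to the XPath 1.0 form, so it should run under either version:

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<!-- XPath 1.0 form: the function call wraps the whole path expression -->
<xsl:template match="syl">
  <xsl:value-of select="floor(@length)"/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

<!-- In a version="2.0" stylesheet the same value could also be written
     as select="@length/floor(.)", a function call used as a path step;
     an XPath 1.0 processor rejects that form as a syntax error. -->

<!-- suppress the whitespace text nodes between elements -->
<xsl:template match="text()"/>

</xsl:stylesheet>
```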

-- Altova SPY complains about floor until I change the stylesheet to version 2.0 (sigh).

Yes, because of your expression, not because of the function.

Now, you are using that inside of a sum() expression ... which means you cannot use XPath 1.0 because you are applying the floor() function to each argument within the sum().

I would love this to transform in XSL 1.0 if possible, and rounding down each length to the integer is essential to achieve the correct formatting result.

Then it is going to be an elaborate solution because the argument to sum() in XPath 1.0 can only be a node set, or a variable, not the result of applying a function to each member of the node set.
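
For anyone who does need XSLT 1.0, the recursion might be sketched roughly as follows. This is an illustration under assumptions, not a tested replacement: the template name sum-floors and its parameters are invented here, every selected <syl> is assumed to carry a numeric @length, and the node-set expression is the sibling-based one used later in this message. Note that in XSLT 1.0 the result of xsl:call-template can only be captured as a result tree fragment, so the variable has to be built from content here.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<!-- hypothetical helper: adds up floor(@length) over $nodes, one node
     per recursive call, carrying the partial sum in $total -->
<xsl:template name="sum-floors">
  <xsl:param name="nodes" select="/.."/>
  <xsl:param name="total" select="0"/>
  <xsl:choose>
    <xsl:when test="$nodes">
      <xsl:call-template name="sum-floors">
        <xsl:with-param name="nodes" select="$nodes[position() &gt; 1]"/>
        <xsl:with-param name="total"
                        select="$total + floor($nodes[1]/@length)"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$total"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<xsl:template match="syl">
  <!-- sum over the non-elided syllables earlier in the same line -->
  <xsl:variable name="current_quantity">
    <xsl:call-template name="sum-floors">
      <xsl:with-param name="nodes"
        select="( preceding-sibling::syl |
                  ../preceding-sibling::syl |
                  preceding-sibling::word/syl |
                  ../preceding-sibling::word/syl )
                [ not( @elide = 'true' ) ]"/>
    </xsl:call-template>
  </xsl:variable>

  <xsl:text>SYL: </xsl:text>
  <xsl:value-of select="."/>
  <xsl:text> SUM: </xsl:text>
  <xsl:value-of select="$current_quantity"/>
  <xsl:text> COLOR: </xsl:text>
  <xsl:if test="@length = 2 and $current_quantity mod 4 = 0">GREEN</xsl:if>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>
```

Each call peels off the first node of the set and passes the rest along, so the depth of the recursion equals the number of preceding syllables in the line, which stays small for verse.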

I'll continue with XPath 2.0.

Thanks in advance for any help on this.

The preceding axis has syllables all the way to the start of the document ... I suspect you need only work with siblings to get the performance you need.

XML:

<poem>
...
</poem>

XSL template:

<xsl:template match="syl">

<xsl:variable name="line_id"><xsl:value-of select="node()/ancestor::line/@id" /></xsl:variable>

It also wastes time to assign variables and use them, though the processor may optimize this.

<xsl:variable name="current_quantity">

Assigning text values to temporary trees and converting them later to numbers is very (very!) inefficient. You should just assign the value to the variable.

<xsl:value-of select="sum(preceding::syl[ancestor::line/@id = $line_id and (not(@elide) or @elide='false') ]/floor(@length))" /></xsl:variable>

<xsl:variable name="current_quantity" select="expression"/>

The preceding axis is notoriously slow because you will be going to *all* syllables all the way to the start of the document.

Did you mean to have the <syl> for "que" between words as in your first line? That looks out of place and it really adds a lot to the expressions.

If not:

sum( ( preceding-sibling::syl | ../preceding-sibling::word/syl )
     [ not( @elide = 'true' ) ]/
     floor(@length) )

If so:

sum( ( preceding-sibling::syl | ../preceding-sibling::syl |
       preceding-sibling::word/syl | ../preceding-sibling::word/syl )
     [ not( @elide = 'true' ) ]/
     floor(@length) )

Note that I've replaced " not(@elide) or @elide='false' " with "not( @elide='true' )" because @elide='true' will be false if there is no @elide attribute, so not(@elide='true') will be true if there is no @elide attribute or if @elide='false'. I am assuming there are no other values for @elide.
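
That equivalence can be checked against the sample data with a small sketch like the one below (an illustration added here, not one of the posted stylesheets). It prints both tests for every syllable; the two values should agree whenever @elide is absent, 'true' or 'false'. They would differ for any other value: with a hypothetical elide="yes", the long form is false while the short form is true.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="syl">
  <xsl:value-of select="."/>
  <!-- the original two-part test -->
  <xsl:text> LONG: </xsl:text>
  <xsl:value-of select="not(@elide) or @elide='false'"/>
  <!-- the shortened test; a comparison against a missing attribute
       is false, so not() of it is true -->
  <xsl:text> SHORT: </xsl:text>
  <xsl:value-of select="not(@elide='true')"/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>
```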

Since I'm using XSLT 2.0, that could also be done as:

sum( ( ancestor::line//syl[. << current()] )
     [ not( @elide = 'true' ) ]/
     floor(@length) )

... but I can't comment on performance and would be interested to hear what you experience with your large data set.
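
Another option, closer to the "running total in a variable" idea from the question, is to walk each line's syllables once and pass the total along as a parameter, so that nothing is re-summed per syllable. The following is only a sketch of that pattern and not something measured against the large data set: the template name walk-syllables and its parameters are made up for illustration, and every <syl> is assumed to have a numeric @length.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<!-- start a fresh running total for every line -->
<xsl:template match="line">
  <xsl:call-template name="walk-syllables">
    <xsl:with-param name="syls" select=".//syl"/>
  </xsl:call-template>
</xsl:template>

<!-- hypothetical helper: reports the first syllable of $syls, then
     recurses on the rest with $total advanced past that syllable -->
<xsl:template name="walk-syllables">
  <xsl:param name="syls" select="/.."/>
  <xsl:param name="total" select="0"/>
  <xsl:if test="$syls">
    <xsl:variable name="this" select="$syls[1]"/>
    <xsl:text>SYL: </xsl:text>
    <xsl:value-of select="$this"/>
    <xsl:text> SUM: </xsl:text>
    <xsl:value-of select="$total"/>
    <xsl:text> COLOR: </xsl:text>
    <xsl:if test="$this/@length = 2 and $total mod 4 = 0">GREEN</xsl:if>
    <xsl:text>&#10;</xsl:text>
    <xsl:call-template name="walk-syllables">
      <xsl:with-param name="syls" select="$syls[position() &gt; 1]"/>
      <!-- an elided syllable adds nothing: number(false()) is 0 -->
      <xsl:with-param name="total"
        select="$total + floor($this/@length) *
                         number(not($this/@elide = 'true'))"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>
```

The trade-off is that the output is driven from the <line> template rather than from a template matching <syl>, so the table-cell markup would have to be produced inside the walking template.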

<xsl:variable name="color"><xsl:choose><xsl:when test="@length=2 and ($current_quantity mod 4 = 0)">background-color:#EEFFEE;</xsl:when></xsl:choose></xsl:variable>

<td style="{$color}"><xsl:value-of select="text()" /></td>

I think it is a bad habit to address text nodes explicitly, and I think you should be using "." instead of "text()". Not everyone feels that way.

</xsl:template>

You didn't post a working stylesheet, so it took time to rewrite your code for illustration. I've run the rewritten one below and then suggested what I think would be faster performing.

I hope this helps. It would take too long for me to volunteer the XSLT 1.0 version, which needs a recursive template that applies floor() to each value before adding it to the sum.

. . . . . . . . . . . Ken


t:\ftemp>type bordelon.xml
<poem>
        <line id="1">
                <word id="1">
                        <syl length="2">Ar</syl>
                        <syl length="1">ma</syl>
                </word>
                <word id="2">
                        <syl length="1">vi</syl>
                        <syl length="2">rum</syl>
                </word>
                <syl length="1">que</syl>
                <word id="3">
                        <syl length="1">ca</syl>
                        <syl length="2">no</syl>
                </word> ,
                <word id="4">
                        <syl length="2">Tro</syl>
                        <syl length="2">iae</syl>
                </word>
                <word id="5">
                        <syl length="2">qui</syl>
                </word>
                <word id="6">
                        <syl length="2">pri</syl>
                        <syl length="1">mus</syl>
                </word>
                <word id="7">
                        <syl length="1">ab</syl>
                </word>
                <word id="8">
                        <syl length="2">o</syl>
                        <syl length="2">ris</syl>
                </word>
        </line>
        <line id="2">
                <word>
                        <syl length="2">li</syl>
                        <syl length="1.5">to</syl>
                        <syl length="1">ra</syl>
                </word> ,
                <word id="15">
                        <syl length="2">mul</syl>
                        <syl elide="true" length="1">tum</syl>
                </word>
                <word id="16">
                        <syl length="2">il</syl>
                        <syl elide="true" length="1">le</syl>
                </word>
                <word id="17">
                        <syl length="2">et</syl>
                </word>
                <word id="18">
                        <syl length="2">ter</syl>
                        <syl length="2">ris</syl>
                </word>
                <word id="19">
                        <syl length="2">iac</syl>
                        <syl length="2">ta</syl>
                        <syl length="1">tus</syl>
                </word>
                <word id="20">
                        <syl length="1">et</syl>
                </word>
                <word id="21">
                        <syl length="2">al</syl>
                        <syl length="2">to</syl>
                </word>
        </line>
</poem>


t:\ftemp>call xslt2 bordelon.xml bordelon.xsl

SYL: Ar SUM: 0 COLOR: GREEN
SYL: ma SUM: 2 COLOR:
SYL: vi SUM: 3 COLOR:
SYL: rum SUM: 4 COLOR: GREEN
SYL: que SUM: 6 COLOR:
SYL: ca SUM: 7 COLOR:
SYL: no SUM: 8 COLOR: GREEN
SYL: Tro SUM: 10 COLOR:
SYL: iae SUM: 12 COLOR: GREEN
SYL: qui SUM: 14 COLOR:
SYL: pri SUM: 16 COLOR: GREEN
SYL: mus SUM: 18 COLOR:
SYL: ab SUM: 19 COLOR:
SYL: o SUM: 20 COLOR: GREEN
SYL: ris SUM: 22 COLOR:
SYL: li SUM: 0 COLOR: GREEN
SYL: to SUM: 2 COLOR:
SYL: ra SUM: 3 COLOR:
SYL: mul SUM: 4 COLOR: GREEN
SYL: tum SUM: 6 COLOR:
SYL: il SUM: 6 COLOR:
SYL: le SUM: 8 COLOR:
SYL: et SUM: 8 COLOR: GREEN
SYL: ter SUM: 10 COLOR:
SYL: ris SUM: 12 COLOR: GREEN
SYL: iac SUM: 14 COLOR:
SYL: ta SUM: 16 COLOR: GREEN
SYL: tus SUM: 18 COLOR:
SYL: et SUM: 19 COLOR:
SYL: al SUM: 20 COLOR: GREEN
SYL: to SUM: 22 COLOR:
t:\ftemp>call xslt2 bordelon.xml bordelon-new.xsl

SYL: Ar SUM: 0 COLOR: GREEN
SYL: ma SUM: 2 COLOR:
SYL: vi SUM: 3 COLOR:
SYL: rum SUM: 4 COLOR: GREEN
SYL: que SUM: 6 COLOR:
SYL: ca SUM: 7 COLOR:
SYL: no SUM: 8 COLOR: GREEN
SYL: Tro SUM: 10 COLOR:
SYL: iae SUM: 12 COLOR: GREEN
SYL: qui SUM: 14 COLOR:
SYL: pri SUM: 16 COLOR: GREEN
SYL: mus SUM: 18 COLOR:
SYL: ab SUM: 19 COLOR:
SYL: o SUM: 20 COLOR: GREEN
SYL: ris SUM: 22 COLOR:
SYL: li SUM: 0 COLOR: GREEN
SYL: to SUM: 2 COLOR:
SYL: ra SUM: 3 COLOR:
SYL: mul SUM: 4 COLOR: GREEN
SYL: tum SUM: 6 COLOR:
SYL: il SUM: 6 COLOR:
SYL: le SUM: 8 COLOR:
SYL: et SUM: 8 COLOR: GREEN
SYL: ter SUM: 10 COLOR:
SYL: ris SUM: 12 COLOR: GREEN
SYL: iac SUM: 14 COLOR:
SYL: ta SUM: 16 COLOR: GREEN
SYL: tus SUM: 18 COLOR:
SYL: et SUM: 19 COLOR:
SYL: al SUM: 20 COLOR: GREEN
SYL: to SUM: 22 COLOR:
t:\ftemp>type bordelon.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

<xsl:output method="text"/>

<xsl:template match="syl">
<xsl:variable name="line_id"><xsl:value-of select="node()/ancestor::line/@id" /></xsl:variable>

<xsl:variable name="current_quantity"><xsl:value-of select="sum(preceding::syl[ancestor::line/@id = $line_id and (not(@elide) or @elide='false') ]/floor(@length))" /></xsl:variable>

<xsl:variable name="color"><xsl:choose><xsl:when test="@length=2 and ($current_quantity mod 4 = 0)">GREEN</xsl:when></xsl:choose></xsl:variable>

<xsl:text/>
SYL: <xsl:value-of select="."/> SUM: <xsl:value-of select="$current_quantity"/> COLOR: <xsl:value-of select="$color"/>

</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>

t:\ftemp>type bordelon-new.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

<xsl:output method="text"/>

<xsl:template match="syl">

<xsl:variable name="current_quantity"
              select="sum( ( preceding-sibling::syl |
                             ../preceding-sibling::syl |
                             preceding-sibling::word/syl |
                             ../preceding-sibling::word/syl )
                           [ not( @elide = 'true' ) ]/
                           floor(@length) )"/>

<xsl:variable name="color"
              select="if( @length=2 and
$current_quantity mod 4 = 0 ) then 'GREEN' else ''"/>

<xsl:text/>
SYL: <xsl:value-of select="."/> SUM: <xsl:value-of select="$current_quantity"/> COLOR: <xsl:value-of select="$color"/>

</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>

t:\ftemp>type bordelon-new2.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

<xsl:output method="text"/>

<xsl:template match="syl">

<xsl:variable name="current_quantity"
              select="sum( ( ancestor::line//syl[. &lt;&lt; current()]  )
                           [ not( @elide = 'true' ) ]/
                           floor(@length) )"/>

<xsl:variable name="color"
              select="if( @length=2 and
$current_quantity mod 4 = 0 ) then 'GREEN' else ''"/>

<xsl:text/>
SYL: <xsl:value-of select="."/> SUM: <xsl:value-of select="$current_quantity"/> COLOR: <xsl:value-of select="$color"/>

</xsl:template>

<xsl:template match="text()"/>

</xsl:stylesheet>

t:\ftemp>rem Done!


--
G. Ken Holman                 mailto:gkholman(_at_)CraneSoftwrights(_dot_)com
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/s/

