xsl-list
[Top] [All Lists]

Re: [xsl] Performance improvement for a recursive function?

2011-12-13 08:32:54
Hi Michael,

On 13/12/2011, Michael Kay <mike(_at_)saxonica(_dot_)com> wrote:
to process 41337 path commands it takes 861 sec! The first 5000 path
commands are processed in 14 sec (at 355.87 per sec) the last 5000
take 671 sec (only at 7.5 per sec).
I'm seeing an execution time of about 90 seconds under Saxon-EE
9.3.0.11, down to 23 seconds under Saxon-EE 9.4.0.1
What baffles me is that with each tail-recursion it takes longer to
execute the function! Do you see this also with 9.4? Any idea about
the possible reason?

(It would be a good idea to declare the type of the parameter, both for 
documentation and
for performance, though I have no idea whether this makes a significant 
contribution -
probably not.)
Done as="xs:decimal". Currently I use 400

concat($xy[$i-xy], ',', $xy[$i-xy+1])
(I do find that hyphen confusing...)
$i-c, $i-xy agreed. Would $i_c or $ic or an other name be better? I
only trying to make obvious the variable is used as an index to the
sequence $c or $xy.

Regards,
Manfred

On 13/12/2011, Michael Kay <mike(_at_)saxonica(_dot_)com> wrote:

I'm seeing an execution time of about 90 seconds under Saxon-EE
9.3.0.11, down to 23 seconds under Saxon-EE 9.4.0.1 (released on Friday
in case you missed it). I've no idea why both numbers are much better
than yours, but I've been investigating the difference between them.
Byte-code generation in 9.4 doesn't account for the difference, it only
shaves off a couple of seconds.

The Java profile for 9.3 and 9.4 is very different: in 9.4 it's
dominated by regex handling and string concatenation (which is why
bytecode generation doesn't make much difference - it's the same regex
library either way). In 9.3 it's dominated by evaluation of a filter
expression. It seems that the difference is in the optimization of the
expression

concat($xy[$i-xy], ',', $xy[$i-xy+1])

(I do find that hyphen confusing...)

where 9.4 has turned both filter expressions into integer subscript
operations (saxon:item-at), while 9.3 is evaluating the third argument
as a filter expression (testing each item in $xy in turn against a
boolean predicate). Replacing the subscript $i-xy+1 by an integer
variable would probably fix this. (Alternatively, move to 9.4)

Michael Kay
Saxonica


On 13/12/2011 00:07, Manfred Staudinger wrote:
Hi,

I need some help to improve the performance of a function, as I have
run out of ideas what to change!

In the process to convert some data to SVG I transform (or even break
up) @Data attribute of the Path element and then use the function
my:compose-path-data [1] to put the parts together into a string
again. The attribute string is a succession of path commands, where
each path command  (see the global variable below) consists of a one
byte character followed by zero, one, or two integers. Having about
2200 Path elements, most of them have a @Data attribute with between 5
an 50 path commands.

A few attribute strings are bigger (up to 1MB) and contain up to
100000 path commands. As the function is called for each path command
it is tail recursive, but the performance is still a big problem: to
process 41337 path commands it takes 861 sec! The first 5000 path
commands are processed in 14 sec (at 355.87 per sec) the last 5000
take 671 sec (only at 7.5 per sec).

What Saxon reports [2] is consistent with the above. In case you want
to try it, I have uploaded the specific test case here:
    http:///test.rudolphina.org/test01-perform-data.xsl
    http:///test.rudolphina.org/test01-perform-data-org.xml

Using Saxon-HE 9.3.0.5J and Java version 1.6.0_14 on Windows® XP Home.
Hardware ASUS Eee PC 1000H with Intel® Atom N270 (1.60 GHz)

Regards,
Manfred

[1]
<xsl:function name="my:compose-path-data" as="xs:string*">
    <xsl:param name="c" as="xs:integer*"/>
    <xsl:param name="i-c" as="xs:integer"/>
    <xsl:param name="xy" as="xs:integer*"/>
    <xsl:param name="i-xy" as="xs:integer"/>
    <xsl:param name="i-end" as="xs:integer"/>
    <xsl:param name="result" as="xs:string?"/>
    <xsl:variable name="cmd" select="key('path', $c[$i-c], $path-cmd)"
as="element()?"/>
    <xsl:sequence select="if ($i-c gt $i-end)
       then $result
       else my:compose-path-data($c, $i-c + 1,
          $xy, $i-xy + xs:integer($cmd/n),
          $i-end,
          concat($result, $cmd/@char, if ($cmd/n=2)
             then (: M, m, l, (blank) :) concat($xy[$i-xy], ',',
$xy[$i-xy+1])
             else  if ($cmd/n=1)
                then (: h, v :) string($xy[$i-xy])
                else (: z :) ())
          )"/>
</xsl:function>

and on the stylesheet level
<xsl:variable name="path-cmd">
    <path>
       <cmd char=" "><n>2</n><code>32</code></cmd>
       <cmd char="M"><n>2</n><code>77</code></cmd>
       <cmd char="l"><n>2</n><code>108</code></cmd>
       <cmd char="m"><n>2</n><code>109</code></cmd>
       <cmd char="h"><n>1</n><code>104</code></cmd>
       <cmd char="v"><n>1</n><code>118</code></cmd>
       <cmd char="z"><n>0</n><code>122</code></cmd>
    </path>
</xsl:variable>
<xsl:key name="path" match="cmd" use="xs:integer(code)"/><!-- $path-cmd
-->

[2]
Stylesheet compilation time: 4218 milliseconds
Using parser org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser
net.sf.saxon.tree.tiny.TinyBuilder
Tree built in 1375 milliseconds
Tree size: 4 nodes, 0 characters, 11 attributes
Execution time: 14m 31.579s (871579ms)
Memory used: 36559128
NamePool contents: 33 entries in 33 chains. 8 prefixes, 8 URIs

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or 
e-mail:<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--