Re: [xsl] improving performance in creating ids

2019-04-24 16:29:21
Pieter,

That is excellent.

However, I haven't given up yet on xsl:number/@from -- not saying I'll
explain it or make it work, but unless I miss something (not
impossible), it *should* work the way we want and if it doesn't, there
must be something about it, or the problem, we aren't seeing. (Or a
bug in the processor?)

After all, a use case such as you have described is what this syntax
is clearly meant to address.

The news that a counting-based solution is not much better with a key,
than without it, is interesting, but possibly due to Saxon
optimizations (processor?) ... which suggests that some processors
might *really* take their sweet time with a raw XPath counting-based
solution....

Cheers, Wendell

On Wed, Apr 24, 2019 at 10:23 AM Pieter Lamers
pieter(_dot_)lamers(_at_)benjamins(_dot_)nl 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com>
wrote:

Hi all,

In the end I found the solution for my original numbering plan in this
xsl:number expression:

<xsl:number level="any" count="*[. &gt;&gt; $ancestor-with-id][@rid]"/>

the '>>' operator performs well enough (total processing time for the
test book now 5 seconds) and was brought to my kind attention by Erik
Siegel. Thanks for all your help.
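
For readers who want to see that expression in context, here is a minimal sketch, not the stylesheet actually used for the book. It assumes that $ancestor-with-id is bound to the nearest ancestor carrying @id, that the elements to be numbered are exactly those with @rid, and that the generated id takes the form ancestor-id, hyphen, number. The variable binding, the match pattern and the id format are all assumptions made for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy everything that is not handled explicitly below -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumption: only elements with @rid inside an @id scope get numbered -->
  <xsl:template match="*[@rid][ancestor::*[@id]]">
    <!-- Assumption: the numbering scope is the nearest ancestor with @id -->
    <xsl:variable name="ancestor-with-id" select="ancestor::*[@id][1]"/>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="id">
        <xsl:value-of select="$ancestor-with-id/@id"/>
        <xsl:text>-</xsl:text>
        <!-- Count @rid elements up to and including this one that
             come after the scope element in document order -->
        <xsl:number level="any"
            count="*[. &gt;&gt; $ancestor-with-id][@rid]"/>
      </xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The count pattern does the scoping that from="*[@id]" was expected to do: a candidate is counted only if it follows the scope element in document order.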

Best,
Pieter

On 24/04/2019 07:46, Pieter Lamers 
pieter(_dot_)lamers(_at_)benjamins(_dot_)nl wrote:
Hi Wendell,

Had not seen your subsequent replies before I signed off last night.
Your solution below involves a count which brings back my original
performance problem. I think I will change my requirement for
"locally" numbered ids somewhat so I can profit most from xsl:number.
Still, it is sad that 'from' cannot serve my purpose (or so it seems).

Hi Liam,

You are probably right that indexing + keys should work in the xquery
solution. I'd have to dive a little further into that area before I
can put it to use; my initial efforts did not make a change.

Thanks and all the best,
Pieter

On 23/04/2019 23:47, Wendell Piez wapiez(_at_)wendellpiez(_dot_)com wrote:
Okay this is my next shot --

<xsl:value-of select="ancestor::*[exists(@id)][1]/@id || '-' ||
  local-name() ||
  count(key('elems-by-name', local-name(),
            ancestor::*[exists(@id)][1])[current() &gt;&gt; .]) + 1"/>

but after having done that I'd probably go back to xsl:number.

Partly since it's probably as fast, but mainly because declarative
syntax rules.

(Note: still untested. Use at your own risk!)
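
The expression above relies on a key named 'elems-by-name' that is never declared in the thread. A minimal sketch of how the pieces might fit together follows; it is equally untested. It assumes the key indexes every element by local-name(), and that the elements to label are those without an @id of their own. The match pattern and the variable names $scope and $earlier are illustrative, not from the original posts.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumed declaration: index every element by its local name -->
  <xsl:key name="elems-by-name" match="*" use="local-name()"/>

  <!-- Assumption: label elements lacking @id that sit inside an @id scope -->
  <xsl:template match="*[not(@id)][ancestor::*[@id]]">
    <xsl:variable name="scope" select="ancestor::*[@id][1]"/>
    <!-- Third argument of key() limits the lookup to the subtree of $scope;
         the predicate keeps only nodes before the current one -->
    <xsl:variable name="earlier"
        select="key('elems-by-name', local-name(), $scope)
                  [current() &gt;&gt; .]"/>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="id"
          select="$scope/@id || '-' || local-name()
                  || (count($earlier) + 1)"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Two caveats: the subtree lookup also returns the scope element itself when it shares the local name, and it returns same-named elements inside nested @id scopes, so the count may need a further predicate depending on the numbering wanted.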

Cheers, Wendell


On Tue, Apr 23, 2019 at 5:40 PM Wendell Piez 
wapiez(_at_)wendellpiez(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Oops, hit button too soon -- you'll see the error there.

I leave scoping the correct count as an exercise, but it's in there
somewhere! :-)

Cheers, Wendell

On Tue, Apr 23, 2019 at 5:39 PM Wendell Piez
<wapiez(_at_)wendellpiez(_dot_)com> wrote:
Hi again,

Also note if we had a key we would need no variable --

<xsl:value-of select="local-name() || '-'"/>
<xsl:number level="any" from="*[@id]"
count="key('elem-by-name',local-name())"/>

which suggests we could also use the third argument of key() ...

<xsl:value-of select="local-name() || '-' ||
count(key('elems-by-name',local-name(),ancestor::*[exists(@id)][1]))"/>


still not tested -- but ought to work, syntax errors aside --

Cheers, Wendell

On Tue, Apr 23, 2019 at 5:31 PM Wendell Piez
<wapiez(_at_)wendellpiez(_dot_)com> wrote:
Hey Pieter,

If performance were the issue, I might try factoring out the ID
labeling into a completely separate pass, in order (for example) to
implement it as a sibling traversal, passing parameters forward to
increment the ID values. (If your numbering is fancy, for example
scoping the increment to the element type as well as the ancestor, you
might have to pass a map forward.) I think that ought to be pretty
fast, plus it separates this logic from the other logic of the XSLT.
It's essentially like treating the XSLT engine like an overpowered SAX
parser. (Not that I would know how to make one of those.)
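
One way to picture the "pass a map forward" idea is the sketch below. It swaps in xsl:iterate (XSLT 3.0) for hand-written sibling recursion, and it is untested. It assumes each element with @id is a numbering scope whose element children are the items to label, that those children carry no @id of their own, and that the scope has element-only content, since text between children is not copied. The parameter name $seen and the id format are illustrative.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="xs map">

  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumption: every element with @id opens a fresh numbering scope -->
  <xsl:template match="*[@id]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Walk the element children once, left to right -->
      <xsl:iterate select="*">
        <!-- Running tally: local name -> how many seen so far -->
        <xsl:param name="seen" as="map(xs:string, xs:integer)"
            select="map{}"/>
        <xsl:variable name="ilk" select="local-name()"/>
        <xsl:variable name="n" select="($seen($ilk), 0)[1] + 1"/>
        <xsl:copy>
          <xsl:attribute name="id"
              select="../@id || '-' || $ilk || $n"/>
          <xsl:copy-of select="@* except @id"/>
          <xsl:apply-templates/>
        </xsl:copy>
        <!-- Pass the updated tally forward to the next sibling -->
        <xsl:next-iteration>
          <xsl:with-param name="seen" select="map:put($seen, $ilk, $n)"/>
        </xsl:next-iteration>
      </xsl:iterate>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Each child is visited once and the tally is looked up in the map, so no preceding nodes are recounted. Labeling deeper descendants would need the map threaded through nested levels as well.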

But this is only if xsl:number wasn't doing it, after I tried
something like what Martin H shows with plain old templates.

<xsl:variable name="ilk" select="local-name()"/>
<xsl:value-of select="$ilk || '-'"/>
<xsl:number level="any" from="*[@id]" count="*[local-name() eq $ilk]"/>

-- untested --

Cheers, Wendell

On Tue, Apr 23, 2019 at 10:57 AM Martin Honnen 
martin(_dot_)honnen(_at_)gmx(_dot_)de
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
On 23.04.2019 16:28, Pieter Lamers 
pieter(_dot_)lamers(_at_)benjamins(_dot_)nl wrote:

Thanks for your quick reply. The node identity comparison helped quite
a bit, although I am still around a minute for a full book of ids. I
am not sure how xsl:number would help here, and what kind of
performance win it would give over count(). I tried something with a
nested transformation, but what should I feed it?

      <xsl:number select="*[last()]"/>
works (given a set of preceding nodes) but it is slightly slower than a
count() in the xquery. Maybe I should be using xsl:number differently?

It is difficult for me to suggest that without knowing the XML input
structure and whether you want to generate that id based on a count or
numbering only for certain nodes or some particular element type. In
general if I wanted to delegate counting to xsl:number similar to your
function I would define a template in a mode for that e.g.

     <xsl:template match="*" mode="number">
        <xsl:number level="any" from="*[@id]"/>
     </xsl:template>

and then, where you need that number, you would use e.g.

     <xsl:apply-templates select="." mode="number"/>

Both the template and the select of the apply-templates can of course
be adapted to more particular needs.

As for being more efficient than using count, that then depends on the
implementation but I would think there is some optimization to be
expected in an XSLT processor for xsl:number.



--
...Wendell Piez... ...wendell -at- nist -dot- gov...
...wendellpiez.com... ...pellucidliterature.org...
...pausepress.org...
...github.com/wendellpiez... ...gitlab.coko.foundation/wendell...


--
Pieter Lamers
John Benjamins Publishing Company
Postal Address: P.O. Box 36224, 1020 ME AMSTERDAM, The Netherlands
Visiting Address: Klaprozenweg 75G, 1033 NN AMSTERDAM, The Netherlands
Warehouse: Kelvinstraat 11-13, 1446 TK PURMEREND, The Netherlands
tel: +31 20 630 4747
web: www.benjamins.com



