Just to follow up on the several ideas proposed for
my sequential navigation issue.
To reduce the processing time (from 3 hours for 15MB
when using the catch-all expressions below), I have
modified my chunker stylesheet in the following way:
- I call a "preprocessing" template that simply scans
the source document and creates another document
(prevnext_index.xml) containing a nodeset of all
nodes I am interested in, *in*document*order*.
This gives me the following document:
<bv:nodes>
<bv:node ID="ACTR.00.B36A178001BE2A74" GI="PART">Part A -
Classification and Surveys</bv:node>
<bv:node ID="ACTR.01.2B02E5C001BE13AE" GI="CHAP">Chapter 1 - Principles
of classification and class notations</bv:node>
<bv:node ID="ACTM.03.9FCE36C001BDFCD4" GI="SECT">Section 1 - General
Principles of Classification</bv:node>
[ ... and another 1800 lines or so ... ]
</bv:nodes>
- Later on, in the main part of the stylesheet, when
chunking a fragment, I retrieve the previous and next
information recorded in the index file using this bit:
...
<xsl:variable name='index.document'
select='document(concat("file:/", $base.dir, "/",
$index.file.name))'/>
<xsl:param name="previous.fragment"
select="$index.document/bv:nodes/bv:node[@ID =
current()/@ID]/preceding-sibling::bv:node[1]"/>
<xsl:param name="next.fragment"
select="$index.document/bv:nodes/bv:node[@ID =
current()/@ID]/following-sibling::bv:node[1]"/>
...
- This cuts processing time to only 25 minutes, plus a
minute and a half for creating the index document.
Not too bad for my needs. But is it "elegant" ;-)?
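
One possibly tidier variant of the lookup above, offered only as an untested sketch: declare an xsl:key on the index nodes so that each ID lookup need not scan all the bv:node elements. The namespace URI for bv:, the key name index.by.id and the named template previous.fragment.id are assumptions made for illustration; the parameter names are the ones used above.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:bv="urn:example:bv"> <!-- assumed namespace URI -->

  <!-- Index every bv:node by its ID attribute. -->
  <xsl:key name="index.by.id" match="bv:node" use="@ID"/>

  <xsl:param name="base.dir"/>
  <xsl:param name="index.file.name" select="'prevnext_index.xml'"/>

  <!-- Load the index document once, at the top level. -->
  <xsl:variable name="index.document"
      select="document(concat('file:/', $base.dir, '/', $index.file.name))"/>

  <!-- Hypothetical named template: writes the ID of the fragment that
       precedes the current element in the index. -->
  <xsl:template name="previous.fragment.id">
    <xsl:variable name="id" select="@ID"/>
    <!-- key() searches the document that holds the context node,
         so change context to the index document first. -->
    <xsl:for-each select="$index.document">
      <xsl:value-of
          select="key('index.by.id', $id)/preceding-sibling::bv:node[1]/@ID"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The next fragment would be found the same way with following-sibling::bv:node[1]. Whether this beats the plain predicate depends on how the processor builds its keys, so it would need timing on the real data.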
Thanks to everybody for your help!
Cheers,
Jakob.
"Michael Kay"
<michael(_dot_)h(_dot_)kay(_at_)ntlworld(_dot_)com>@lists.mulberrytech.com on
12/06/2002 10:39:55 AM
Please reply to xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Sent by: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
To: <xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com>
cc:
Subject: RE: [xsl] sequential navigation problem (long)
Ref. Message:
Jeni, David,
Thanks for your responses. Thanks for pointing out the error
in my logic.
I am currently using the catch-all expressions, which work OK:
(ancestor::* | preceding::*)
[self::PART or self::CHAP or self::SECT or self::ART or
self::SYMBOLS or self::APPENDIX or self::SART][@ID][last()] and
(descendant::* | following::*)
[self::PART or self::CHAP or self::SECT or self::ART or
self::SYMBOLS or self::APPENDIX or self::SART][@ID][1]
Most decent XSLT processors are likely to optimize the [1] predicate by
stopping the search when the first node has been found; but optimizing
[last()] is much more difficult.
My first thought was to rewrite this as:
(ancestor::*[self::part.....][@ID][1]
or
preceding::*[self::part....][@ID][1])
and
(descendant::*[.....][@ID][1]
or
following::*[.....][@ID][1])
and see if this speeds it up.
But on reflection, the [1] and [last()] predicates are redundant in a
boolean context: if there are any nodes selected, there will be a first
node and a last node. So it might be enough simply to get rid of the [1]
and [last()] predicates in your expression.
Rearranging the two branches of the "or" might also help: the biggest
cost will come when the first branch does a long search and finds
nothing. The above is probably best, but I don't know your data.
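
To illustrate the point about boolean context, here is a minimal sketch, assuming the expression is used in a test attribute. The element names come from the expressions quoted earlier, but the list is shortened and the output text is invented purely for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="SECT[@ID]">
    <!-- test= converts each node-set to a boolean: true if it is
         non-empty. A trailing [1] or [last()] cannot change that
         answer, so both predicates are left out here. -->
    <xsl:if test="ancestor::*[self::PART or self::CHAP][@ID]
                  or preceding::*[self::PART or self::CHAP or self::SECT][@ID]">
      <xsl:value-of select="@ID"/>
      <xsl:text> has an earlier fragment&#10;</xsl:text>
    </xsl:if>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Suppress the built-in copying of text nodes. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

This only applies where a yes/no answer is wanted. When the node itself is needed, as when selecting the previous or next fragment, the [1] or [last()] predicate still matters.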
Now I'll go away and tweak the Saxon optimizer so it gets rid of a
trailing [last()] predicate on a path expression used in a boolean
context...
Michael Kay
Software AG
home: Michael(_dot_)H(_dot_)Kay(_at_)ntlworld(_dot_)com
work: Michael(_dot_)Kay(_at_)softwareag(_dot_)com
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list