As for performance, I compared the execution times of the two solutions
(the index-of one vs. the fold-left / intersect / if-then-else one).
The XML document was: "<t><a/><b/><c/></t>".
The $nodes sequence contained 45 nodes:
$nodes := ($xml/*/a, $xml/*/c, $xml/*/b, $xml/*/a, $xml/*/b, $xml/*/a,
$xml/*/c, $xml/*/b, $xml/*/a, $xml/*/b, $xml/*/a, $xml/*/c, $xml/*/b,
$xml/*/a, $xml/*/b, $xml/*/a, $xml/*/c, $xml/*/b, $xml/*/a, $xml/*/b,
$xml/*/a, $xml/*/c, $xml/*/b, $xml/*/a, $xml/*/b, $xml/*/a, $xml/*/c,
$xml/*/b, $xml/*/a, $xml/*/b,$xml/*/a, $xml/*/c, $xml/*/b, $xml/*/a,
$xml/*/b, $xml/*/a, $xml/*/c, $xml/*/b, $xml/*/a, $xml/*/b, $xml/*/a,
$xml/*/c, $xml/*/b, $xml/*/a, $xml/*/b )
Separately, I timed only the execution of parse-xml() and the construction
of the node sequence. All measurements were done with BaseX.
Results:
Parsing the XML document and constructing the sequence: 0.10ms
Evaluating the "short" expression: 0.41ms
Evaluating the "long" expression: 0.44ms
"short" vs. "long" with the parsing time subtracted: 0.31ms vs. 0.34ms
Thus both expressions have approximately the same efficiency. In this
particular measurement the "short" one was about 10% faster than the
"long" one, but I suspect this difference is not statistically significant.
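For anyone who wants to redo the subtraction, here is a small Python sketch
of the arithmetic. It uses only the figures reported above; the variable
names are mine and nothing here was part of the actual BaseX measurement.

```python
# Timings reported above, in milliseconds.
setup_ms = 0.10   # parse-xml() plus building $nodes
short_ms = 0.41   # "short" expression, setup included
long_ms = 0.44    # "long" expression, setup included

# Subtract the shared setup cost to compare the expressions themselves.
short_net = round(short_ms - setup_ms, 2)
long_net = round(long_ms - setup_ms, 2)

# Relative saving of "short" with respect to "long".
saving = (long_net - short_net) / long_net

print(short_net, long_net, f"{saving:.0%}")   # 0.31 0.34 9%
```

A 0.03ms gap on a single run is well within what timer noise alone could
produce, which is why I would not read much into it.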
Cheers,
Dimitre
On Tue, Dec 28, 2021 at 4:23 PM Dimitre Novatchev
dnovatchev(_at_)gmail(_dot_)com <
xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
On Tue, Dec 28, 2021 at 4:10 PM Michael Kay mike(_at_)saxonica(_dot_)com <
xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
On 28 Dec 2021, at 23:54, Dimitre Novatchev
dnovatchev(_at_)gmail(_dot_)com <
xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
$nodes[index-of($nodes ! generate-id(.), generate-id(.))[1]]
This seems a candidate for "the shortest solution" and it shouldn't be
inefficient, given a good optimizer:
It probably also gets a prize for the first practical use case of a
filter expression where the predicate is numeric and has different values
for different nodes in the input sequence.
It's going to be O(n*m) unless index-of() is optimized to use some kind
of index or hash lookup rather than a sequential search. That's assuming
that the expression $nodes ! generate-id(.) gets loop-lifted; if it isn't,
then it becomes O(n*n*m).
It seems BaseX is good enough to do this. I increased the number of nodes in
$nodes threefold and there was no increase in the evaluation time.
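To make the complexity remarks concrete, here is a minimal Python model of
the expression. It is only a sketch under assumptions: Python's id() stands
in for generate-id(), the names Node, dedup_scan and dedup_hashed are made
up for illustration, and nothing here claims to show what BaseX or Saxon
actually do internally. The first function mirrors the index-of() filter
with a sequential search; the second shows the kind of hash lookup an
optimizer could substitute.

```python
class Node:
    """Hypothetical stand-in for an XML node; identity is what matters."""
    def __init__(self, name):
        self.name = name


def dedup_scan(nodes):
    """Model of $nodes[index-of($nodes ! generate-id(.), generate-id(.))[1]].

    The item at position p is kept only when p is the first position that
    holds its identity. list.index() searches sequentially, so the whole
    thing can be quadratic in len(nodes).
    """
    ids = [id(n) for n in nodes]   # "$nodes ! generate-id(.)", computed once
    return [n for p, n in enumerate(nodes) if ids.index(id(n)) == p]


def dedup_hashed(nodes):
    """Same result using a hash set, so each lookup is O(1) on average."""
    seen = set()
    result = []
    for n in nodes:
        key = id(n)
        if key not in seen:
            seen.add(key)
            result.append(n)
    return result


a, b, c = Node("a"), Node("b"), Node("c")
nodes = [a, c, b, a, b, a]
print([n.name for n in dedup_scan(nodes)])     # ['a', 'c', 'b']
print([n.name for n in dedup_hashed(nodes)])   # ['a', 'c', 'b']
```

Note that ids is built once, outside the filter. Recomputing it for every
item is the model's equivalent of the sub-expression not being loop-lifted.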
Aesthetically, I find generate-id() ugly and it would be nice to avoid it.
Its name is ugly, yes. A shorter and more meaningful name, like id() or
key(), would be much better. Maybe we need a mechanism in XPath 4.0 to
specify global aliases (like a using file... )
Cheers,
Dimitre
Michael Kay
Saxonica
--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
To achieve the impossible dream, try going to sleep.
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs. Will they
write all patents, too? :)
-------------------------------------
Sanity is madness put to good use.
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.
--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--