Re: [xsl] XQuery/XPath 3.1: Node List to Node Set ("distinct nodes")

2021-12-29 12:27:16
On 29.12.2021 17:36, Dimitre Novatchev dnovatchev(_at_)gmail(_dot_)com wrote:


On Wed, Dec 29, 2021 at 12:21 AM Martin Honnen martin(_dot_)honnen(_at_)gmx(_dot_)de
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:


    On 29.12.2021 at 00:32, Dimitre Novatchev
    dnovatchev(_at_)gmail(_dot_)com wrote:

    Hit Send too early:

    Do notice: this seems to be the only solution presented so far
    that preserves the original sequence order (not document order) of
    the nodes.

    Why is the original sequence order preserved?
    https://www.w3.org/TR/xpath-functions/#func-distinct-values
    clearly says

    "The function returns the sequence that results from removing
    from $arg all but one of a set of values that are considered equal
    to one another. [...]

    The order in which the sequence of values is returned is
    implementation-dependent
    <https://www.w3.org/TR/xpath-functions/#implementation-dependent>.

    Which value of a set of values that compare equal is returned is
    implementation-dependent."


    So while

        $nodes ! generate-id(.)

    gives you the generated ids in the order of the nodes in $nodes,
    after the call to distinct-values no order is defined; it is
    implementation-dependent.


@Martin Honnen Could you please give us an example of an existing
XPath engine whose implementation of `distinct-values()` produces its
results in any order other than their original order in the input
sequence?

I don't have to know one; I just pointed out that the spec doesn't
guarantee the order. Thus I don't see why, given the spec, one should
expect any implementation to preserve the order.

Imagine you implement distinct-values in .NET with e.g.
https://docs.microsoft.com/en-us/dotnet/api/system.linq.enumerable.distinct?view=net-6.0
It would probably pass all tests, but it would also only give a "result
sequence" that "is unordered".

Aren't there also implementations of XQuery or XPath that exploit
parallel processing? I could easily imagine such an implementation
not caring about ordering if the spec allows that for distinct-values.
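
If the sequence order matters, it can be made explicit instead of being
left to distinct-values. The following is only a sketch, assuming an
XQuery 3.1 processor that supports fold-left and inline functions; the
names $seen, $n and $s are purely illustrative and not taken from any
solution posted in this thread. It compares nodes by identity with "is"
and keeps the first occurrence of each node:

```xquery
(: Sketch: remove duplicate nodes by identity, keeping the order of
   first occurrence in the input sequence. Does not call
   distinct-values, so it does not depend on its result order. :)
let $nodes := (1 to 10) ! parse-xml-fragment('<node>' || . || '</node>')/node(),
    (: repeat the same ten nodes five times :)
    $nodes := (1 to 5) ! $nodes
return
  fold-left($nodes, (), function($seen, $n) {
    (: append $n only if no node already kept is identical to it :)
    if (some $s in $seen satisfies $s is $n)
    then $seen
    else ($seen, $n)
  })
```

This is quadratic in the number of distinct nodes, so it is meant to
show the ordering guarantee rather than to be fast.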

It seems, on the other hand, eXide of eXist-db in the online version
doesn't even grok some of the generate-id based attempts:

let $nodes := (1 to 10) ! parse-xml-fragment('<node>' || . ||
'</node>')/node(),
    $nodes := (1 to 5) ! $nodes,
    $ids := distinct-values($nodes ! generate-id(.))
return  $ids ! (function($id) {$nodes[generate-id(.) eq $id][1]})(.)

gives <node>1</node>



Any eXist-db users reading here? Is there a known issue with generate-id?