xsl-list
[Top] [All Lists]

Re: [xsl] Function for determining one XPath as subset of another

2016-01-26 14:37:54
On 01/26/2016 01:24 PM, Michael Kay mike(_at_)saxonica(_dot_)com wrote:
I've always been a little frustrated that I rely heavily on equivalences like

preceding::x === ancestor-or-self::*/preceding-sibling::*/descendant-or-self::*

without having what I would consider a formal proof.

That should be a provable equivalence... er, assuming you meant to end the right side with descendant-or-self::x, or that the left side is meant to be preceding::*. I’ll make the latter assumption:

preceding::* === ancestor-or-self::*/preceding-sibling::*/descendant-or-self::*

The preceding axis is defined (in XPath 3.0) as containing “all nodes that are descendants of the root of the tree in which the context node is found, are not ancestors of the context node, and occur before the context node in document order.”

Document order, in turn, is defined as “a total ordering” “among all the nodes” satisfying, within a tree, 6 conditions:

<blockquote>
1) The root node is the first node.

2) Every node occurs before all of its children and descendants.

3) Namespace nodes immediately follow the element node with which they are associated. The relative order of namespace nodes is stable but implementation-dependent.

4) Attribute nodes immediately follow the namespace nodes of the element node with which they are associated. The relative order of attribute nodes is stable but implementation-dependent.

5) The relative order of siblings is the order in which they occur in the children property of their parent node.

6) Children and descendants occur before following siblings.
</blockquote>

First, let us note that as the preceding axis is defined as containing “descendants of the root of the tree,” and descendants are defined as children, recursively, and that “[t]he children of a document node or element node may be element, processing instruction, comment, or text nodes; [a]ttribute, namespace, and document nodes can never appear as children,” so we need not consider clauses 3 and 4 here.

Given that document order is a total ordering, all nodes either precede the context node, are the context node, or follow the context node.

Per the data model, we can observe that every “descendant of the root tree” is either an ancestor of the context node (call this Group 1); a sibling of nodes in Group 1 (call this Group 2); a descendant of nodes in Group 2 (call this Group 3); a sibling of the context node itself (call this Group 4), a descendant of Group 4 (call this Group 5), the context node itself, or a descendant of the context node (call this Group 6).

(This is itself provable: if the context node is the root tree, then every descendant of the root tree is a descendant of the context node. If the context node is a non-identical descendant of the root tree, then the root tree is its ancestor, and every descendant of the root tree is a descendant of an ancestor of the context node, which can be decomposed into direct ancestry and descendants of siblings of ancestors.)

By clause 2, all members of Group 1 precede the context node in document order.

By clause 5, all preceding siblings of Group 1 precede the context node. (Call this subset of Group 2, Group 2p.)

Also by clause 5, all preceding siblings of the context node precede it. (Call this subset of Group 4, Group 4p.)

By clause 6, all descendants of Groups 2p and 4p precede the context node. (These are subsets of Groups 3 and 5; call them Group 3p and Group 5p.)[*]

Of the nodes which precede the context node in document order (Groups 1, 2p, 3p, 4p, and 5p), the definition of the preceding axis excludes ancestors (Group 1), leaving Groups 2p, 3p, 4p, and 5p as members of the axis.

By exclusion, this leaves out the context node itself, all of Groups 1 and 6, and the rest of Groups 2, 3, 4, and 5 (call them 2f, 3f, 4f, and 5f).

We must now show that the right-hand expression selects an identical set of nodes (i.e., Groups 2p, 3p, 4p, and 5p).

The first step, ancestor-or-self::*, selects all of Group 1 and the context node itself.

The next step, preceding-sibling::*, selects all preceding siblings of Group 1 (i.e., Group 2p) and preceding siblings of the context node (i.e., Group 4p).

The next step, descendant-or-self::*, selects the members of the previous step (Groups 2p and 4p), as well as their descendants (Groups 3p and 5p).

That leaves us with the same set as selected by preceding::*.

Q.E.D. (I think).

~Chris

[*] At this point, “Group” no longer looks like a real word to me.
--
Chris Maden, text nerd  <URL: http://crism.maden.org/ >
“If you’ve been a man o’ action, though you’re lying there in traction,
 You will gain some satisfaction thinkin’, ‘Jesus, at least I tried.’”
  — Andy M. Stewart (1952–2015), “Ramblin’ Rover”
--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--

<Prev in Thread] Current Thread [Next in Thread>