xsl-list

Re: difference between Result Tree Fragment (RTF) and a nodeset

2003-04-25 19:10:53
Leena Kulkarni wrote:
> Hi all,
> What is the difference between Result Tree Fragment
> (RTF) and a nodeset?

A node-set is a set of nodes. Being a set, it has no duplicates and no
inherent order, although nodes that originate from the same document do have
a well-defined position relative to one another (document order).
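
To make the "no repetition" point concrete, here is a minimal sketch. The
input document and its element names (list, item, id) are assumptions made up
for the illustration:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input (hypothetical):
       <list><item id="a"/><item id="b"/><item id="c"/></list> -->
  <xsl:template match="/">
    <!-- Plain count of the item elements. -->
    <xsl:value-of select="count(/list/item)"/>
    <xsl:text> </xsl:text>
    <!-- The union operator merges two node-sets. A node selected by
         both operands is still only in the result once. -->
    <xsl:value-of select="count(/list/item | /list/item)"/>
    <xsl:text> </xsl:text>
    <!-- The set itself has no order, but a predicate applied to it
         numbers the nodes in document order, so [1] picks the item
         that comes first in the document, not the one named first. -->
    <xsl:value-of select="(/list/item[3] | /list/item[1])[1]/@id"/>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, this should print "3 3 a".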

A node is an abstract "thing" that may have a relationship to other nodes. In
the case of XPath/XSLT, nodes have types (element, attribute, text, etc.), and
the relationships form a tree: there's one "root" node, and all other nodes
are children, grandchildren, etc. of that node. Certain types of nodes have a
special relationship instead. Element nodes can have attribute and namespace
nodes associated with them; the element is the attribute or namespace node's
parent, but those nodes aren't the element's children. All of these nodes
represent the logical structures you find in a parsed XML document.
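
The parent-but-not-child relationship between an element and its attributes
can be checked directly. This is only a sketch; the input document (user,
name, id) is a hypothetical example:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input (hypothetical), with no whitespace between tags:
       <user id="u1"><name>Ann</name></user> -->
  <xsl:template match="/user">
    <!-- node() walks the child axis, so it sees only the name element. -->
    <xsl:text>children: </xsl:text>
    <xsl:value-of select="count(node())"/>
    <!-- Attributes are reached through the attribute axis instead. -->
    <xsl:text>&#10;attributes: </xsl:text>
    <xsl:value-of select="count(@*)"/>
    <!-- Going from the attribute to its parent lands back on this element. -->
    <xsl:text>&#10;parent of @id is this element: </xsl:text>
    <xsl:value-of select="generate-id(@id/..) = generate-id(.)"/>
  </xsl:template>
</xsl:stylesheet>
```

For the assumed input, the three lines should report 1, 1, and true.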

XPath gives you a way to use an "expression" to represent an object. An
expression might be a location path like /data/users/user, or it might be
something like 5+3 or someFunction() or someFunction() != 'someResultString'.

XPath has 4 built-in types of objects: boolean, number, string, node-set.
XPath 1.0's spec doesn't make it clear, but it also supports unknown/external
types. XSLT adds one such type: result tree fragment.
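
As a rough illustration of the four built-in types, each variable below is
bound with a select expression that evaluates to a different one. The variable
names and the input document are assumptions for the example:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input (hypothetical):
       <data><users><user>ann</user><user>bob</user></users></data> -->
  <xsl:template match="/">
    <!-- node-set: a location path -->
    <xsl:variable name="users" select="/data/users/user"/>
    <!-- number: an arithmetic expression -->
    <xsl:variable name="sum" select="5 + 3"/>
    <!-- string: a quoted literal -->
    <xsl:variable name="greeting" select="'hello'"/>
    <!-- boolean: a comparison -->
    <xsl:variable name="hasUsers" select="count($users) != 0"/>

    <xsl:value-of select="count($users)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="$sum"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="$greeting"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="$hasUsers"/>
  </xsl:template>
</xsl:stylesheet>
```

With the assumed input, this should print "2 8 hello true".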

A result tree fragment is treated much like a special kind of node-set. It is
created by the xsl:variable or xsl:param instruction (or by xsl:with-param),
but only when that element has content. Thus,

  <xsl:variable name="a">foo</xsl:variable>
  <xsl:variable name="b">
    <asdf/>
  </xsl:variable>

both create RTFs, while

  <xsl:variable name="c" select="1"/>
  <xsl:variable name="d" select="'hello world'"/>
  <xsl:param name="e"/>

do not. ($c is a number, $d is a string, and $e defaults to an empty string.)

A result tree fragment is actually not the set of all nodes created in the
variable binding element's content. It's just one root node. But remember how
I said nodes are related to each other... I can point to one node in a
document and my node-set consists of just that node, but I can still access
that node's parent, children, siblings, etc., now that I have a way of
referring to the one node. It works the same with RTFs. For example:

  <xsl:variable name="myRtf">
    <elem/>
    <elem>hello</elem>
  </xsl:variable>

The code above creates a result tree fragment and binds it to the variable
myRtf, so XPath expressions can now refer to $myRtf when they want to use the
fragment. The fragment consists of 1 root node. That node has 2 'elem' element
nodes as children, and one of those has a 'hello' text node as its child.

XSLT 1.0 imposes a fundamental limitation on result tree fragments: they can
only be used in XPath expressions where strings can be used. Thus you can't
say $myRtf/elem[2]/text() to identify the 'hello' text node in the example
above. You can say <xsl:copy-of select="$myRtf"/>, though. And
<xsl:value-of select="$myRtf"/> will work as you would expect it to.
This will also work as it should, but not how you would expect:
<xsl:if test="$myRtf">. If you think of boolean($someNodeSet), you'll see
that it returns true when there is a node in the node-set. Well, a result
tree fragment always has 1 node (its root), so the test is always true.
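
Here is a minimal sketch of the permitted operations, plus a common way around
the always-true test. The $maybe variable and the output element names are
made up for the example:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:variable name="myRtf">
    <elem/>
    <elem>hello</elem>
  </xsl:variable>

  <!-- Hypothetical variable: its content produces no nodes at all,
       yet the fragment still has its root node. -->
  <xsl:variable name="maybe">
    <xsl:if test="false()">x</xsl:if>
  </xsl:variable>

  <xsl:template match="/">
    <out>
      <!-- Copies both elem elements into the result. -->
      <copy><xsl:copy-of select="$myRtf"/></copy>
      <!-- String value of the fragment: hello -->
      <value><xsl:value-of select="$myRtf"/></value>
      <!-- True even though $maybe has no content. -->
      <xsl:if test="$maybe"><always-true/></xsl:if>
      <!-- Testing the string value instead: false here, so nothing is output. -->
      <xsl:if test="string($maybe)"><has-text/></xsl:if>
      <!-- Not permitted under the XSLT 1.0 rule described above:
           <xsl:value-of select="$myRtf/elem[2]"/> -->
    </out>
  </xsl:template>
</xsl:stylesheet>
```

If what you really want to know is whether the variable produced any text,
converting to a string first, as in the last test, is one way to ask that.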

It was thought that imposing this limitation would allow XSLT processors to
optimize better, since they wouldn't have to worry about source trees that
were created at runtime. But that optimization has turned out to be something
that would be nice to have if only 9 million other impediments to optimization
were out of the way first. Also, the ability to promote an RTF to a node-set
is in such high demand that nearly every XSLT processor provides it as an
extension function, and it was a major impetus to the development of the
EXSLT specification. The abandoned XSLT 1.1 proposal and the forthcoming
XSLT 2.0 both eliminate result tree fragments, just saying that xsl:variable
and xsl:param create node-sets.
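
As a sketch of the extension-function route, the stylesheet below uses the
EXSLT Common module's exsl:node-set() function. It assumes your processor
implements EXSLT; some processors offer an equivalent function under their
own namespace instead.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="text"/>

  <xsl:variable name="myRtf">
    <elem/>
    <elem>hello</elem>
  </xsl:variable>

  <xsl:template match="/">
    <!-- exsl:node-set() converts the fragment to a real node-set,
         so path steps and predicates are allowed on the result. -->
    <xsl:value-of select="exsl:node-set($myRtf)/elem[2]/text()"/>
  </xsl:template>
</xsl:stylesheet>
```

On a processor with EXSLT support, this should print "hello" for any input
document.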

> My perception is RTF is a set of nodes in user defined
> tags like -
>
> <mytag>
> <value/>
> <value/>
> </mytag>
>
> and nodeset is the set of nodes got from the input doc
> itself.

In XSLT 1.0, that's typically the way it works out: RTFs are created at
runtime, and node-sets are typically source tree nodes identified by XPath
expressions in the stylesheet. But there's no reason an XPath expression
couldn't return an RTF; the RTF could be produced by an extension function
or could have been passed in as a top-level param. And a node-set could just
as easily be fabricated by an extension function at runtime, without any
relation to a source doc.

Mike

-- 
  Mike J. Brown   |  http://skew.org/~mike/resume/
  Denver, CO, USA |  http://skew.org/xml/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list