xsl-list
[Top] [All Lists]

RE: Normalize / Simplify HTML-Tables with row-span / col-span

2004-02-18 10:36:48
The FragmentValue class in Saxon 6 is an implementation of the data
model that tries to optimize for the usage pattern of XSLT 1.0 result
tree fragments, including use as a "node-set" where that is required. It
does this by building the tree lazily, delaying its construction until a
tree is actually needed. The initial structure is essentially a list of
SAX events. If the user asks for the string value, this is available
directly. If the user does xsl:copy-of, the SAX events are played back
to append nodes to the new tree. Neither of these cause the tree to be
built. On certain other constructs, one of which is the node-set()
function, a tree is built locally. Once the tree has been built, there
is a bit to indicate whether generalized usage of the tree is permitted.
The node-set() function is not the only thing that forces construction
of a tree, though I don't recall in detail what the other conditions are
(Saxon 6 was a long time ago...). There are messy tests scattered all
around the code trying to avoid triggering the building of the tree,
e.g. when doing an equality comparison and when calling the document()
function (which is special because it needs to know the base URI), but
they don't cover all cases.

The performance bug that you refer to occurs during the initial
construction of the data structure that's there to avoid the costs of
building a generalized tree. If there are any lessons to be drawn from
it, it is the lesson (which I've certainly learnt over the years with
Saxon) that every optimization risks introducing bugs, and some of these
bugs aren't found for a long time. It's not quite true that optimization
is the root of all evil, but it's certainly something you have to
balance against reliability.

In Saxon 7 I scrapped all of this and use the same tree model as is used
for source trees. It's much simpler, faster, and more reliable.

Michael Kay

-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com 
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com] On Behalf Of 
David Tolpin
Sent: 18 February 2004 15:34
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: RE: [xsl] Normalize / Simplify HTML-Tables with 
row-span / col-span


No, you've got it completely the wrong way around! Saxon uses a 
different data structure for RTFs from its normal tree 
model, and it's 
this different data structure that has a bug in it!

Michael,

1) it was your words that the only thing node-set does is 
flipping a bit. 

2) i provided the test that shows that the same bug occurs 
whether node-set is used or not on an RTF.

3) the code to build RTF creates a subclass of SingletonNodeSet, 
FragmentValue; which is where the bug is. 

4) exslt/Common.java contains the code which I think is used 
to implement node-set, namely, method Common.nodeSet, which 
only calls allowGeneralUse for instances of SingletonNodeSet; 
the full text of this method is here:

    public void allowGeneralUse() {
        generalUseAllowed = true; 
    }

Where SingletonNodeSet, used to build RTF and represent 
node-set after the call to nodeSet is converted to a 
different structure? 


But you can't prove anything from bugs, especially 
performance bugs. 
They can happen anywhere at any time. This one happened 
because RTFs 
are usually small, which meant that a scalability problem 
in this area 
didn't show up in all the standard performance tests; the 
same bug in 
the tree model used for input documents would have shown up very 
quickly.

I am not proving, I am illustrating. And no, it didn't happen 
because of incomplete or insufficient testing. It happened 
because there is not a single word about required 
computational complexity of XSLT constructs. If the spec said 
that adding a node to RTF must not be more complex than O(n), 
it would not happen.

David Tolpin
http://davidashen.net/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list