xsl-list
[Top] [All Lists]

Re: [xsl] A compelling use case for employing binary trees in XML processing?

2012-12-12 08:40:11
Another good use of a balanced binary search tree -- with a little
additional work a BST can be used to produce in O(log(N)) the rank of
an item in a collection (regarded as sorted), or to access an item in
a collection (regarded as sorted) by index.

So if a balanced BST contains: 1, 3, 8, 12, 17, 23, 37  -- we can
determine that the rank of 17 is 5 and that the 4th item in this
sorted sequence is 12.

To do so, we need to store with each tree node, the count of nodes in
its left subtree.

Let's remember that accessing an item by its position in a sequence is
generally O(N) operation (some XSLT processors, like Saxon optimize
sequences by using vectors, but such optimizations can't be relied on
accross different XSLT processors). And of course, sorting is
O(N*log(N)). One can't have efficient binary search in a sequence,
because a sequence isn't an array, and expressions like $seq[$mid]
are O(N) -- not O(1).


Cheers,
Dimitre

On Sun, Dec 9, 2012 at 4:49 PM, Dimitre Novatchev 
<dnovatchev(_at_)gmail(_dot_)com> wrote:
I understand that binary trees can be used to sort data. But XSLT already 
has <xsl:sort>, so using binary trees for
sorting isn't a particularly compelling use case.

I guess you mean "binary search trees* -- other kinds of binary trees
such as the heap data structure do not have the complete set of their
items sorted.

A binary *search* tree (and I mean *balanced* binary serch tree) has
more useful properties than just that it one of its serializations
presents the values in sorted order.

A binary search tree  allows searches and insertions with efficiency
of  O(log(N)). In contrast, an insertion in an array or in a sequence
is O(N) -- the whole array needs to be copied (even the non-functional
array) and the initial part of a sequence (the sub-sequence before the
insertion point) of a persistent (functional sequence) needs also to
be copied -- this is on average N / 2 -- so again we have O(N) here.

The deletion in a balanced binary serch tree  is also O(log(N)).

This makes the balanced binary search tree quite good even for
implementing dictionaries and sets. A disadvantage for such binary
search tree -based  implementations is that the value-space must have
total ordering (while other implementations of a dictionary or set
require only an equality (equivalence) operation).


Cheers,

Dimitre.

On Sun, Dec 9, 2012 at 2:49 PM, Costello, Roger L. 
<costello(_at_)mitre(_dot_)org> wrote:
Hi Folks,

Dimitre has created a beautiful set of functions for building binary trees 
[1].

I understand that binary trees can be used to sort data. But XSLT already 
has <xsl:sort>, so using binary trees for sorting isn't a particularly 
compelling use case.

I am seeking a compelling use case for binary trees -- given XML document 
foo.xml and processing requirement P, a binary tree is ideally suited.

Would you provide a simple, compelling use case please?

/Roger

[1] 
http://dnovatchev.wordpress.com/2012/01/09/the-binary-search-tree-data-structurehaving-fun-with-xpath-3-0/

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




-- 
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs.Will they
write all patents, too? :)
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--