You may be hoping for too much. Graph algorithms such as looking for cycles
often have complexity of O(n^2) or worse, whatever language they are
implemented in.
Michael Kay
-----Original Message-----
From: Phil Endecott [mailto:spam_from_xslt_list(_at_)chezphil(_dot_)org]
Sent: 28 July 2004 22:19
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Spotting "cousin marriages" in a tree
Dear XSLT experts,
I have a flat input like this:
<things>
<thing id="1"> <child idref="2"/> <child idref="3"/> </thing>
<thing id="2"/>
<thing id="3"/>
</things>
I want to match up the ids with the idrefs to build a tree that looks
like this:
<tree id="1">
<tree id="2"/>
<tree id="3"/>
</tree>
I could do it like this:
<xsl:template match="thing">
<tree>
<xsl:for-each select="child">
<xsl:apply-templates
select="/things/thing[(_at_)id=current()/@idref]"/>
</xsl:for-each>
</tree>
</xsl:template>
But that is O(n^2), so I'd instead do it like this:
<xsl:key name="things-by-id" match="thing" use="@id"/>
<xsl:template match="thing">
<tree>
<xsl:for-each select="child">
<xsl:apply-templates select="key('things-by-id',@idref)"/>
</xsl:for-each>
</tree>
</xsl:template>
which is O(n log n).
This is all fine. But very rarely my input may not describe a pure
tree; it may contain "cousin marriages" or more "incestuous"
relationships like this:
<things>
<thing id="1"> <child idref="2"/> <child idref="3"/> </thing>
<thing id="2"> <child idref="4"/> </thing>
<thing id="3"> <child idref="4"/> </thing>
<thing id="4"/>
</things>
If you apply either of the above templates to this, thing 4
appears in
the output twice:
<tree id="1">
<tree id="2">
<tree id="4"/>
</tree>
<tree id="3">
<tree id="4">
</tree>
</tree>
which is very bad in my application. What I need is
something like this:
<tree id="1">
<tree id="2">
<tree id="4"/>
</tree>
<tree id="3">
<error/>
</tree>
</tree>
I can't afford for this to become an O(n^2) operation.
The obvious thought is that I need to check at the point of recursion
whether the thing I am about to process has already been output. But
there is no "preceding-in-the-output" axis or other way I can
think of
to note which things have already been output.
An alternative is to apply a test to the other input elements
to see if
other things have the same thing as this one as a child, but
that fails
because my input may legitimately contain redundant nodes like this:
<things>
<thing id="1"> <child idref="2"/> <child idref="3"/> </thing>
<thing id="2"/>
<thing id="3"/>
<thing id="9"> <child idref="2"/> </thing>
</things>
In this case I want the same output as the very first example, i.e.
thing 9 doesn't appear in the output at all.
I can do a multi-pass process using exsl:node-set(),
generating the tree
with the duplicates in it and then deleting them if they are
duplicates,
i.e. something like this:
<xsl:template match="/">
<xsl:variable name="n">
<xsl:apply-templates select="things/thing[1]"/> (as above)
</xsl:variable>
<xsl:apply-templates select="exsl:node-set($n)"
mode="remove-dupes"/>
</xsl:template>
<xsl:template match="tree" mode="remove-dupes">
<xsl:choose>
<xsl:when test="preceding::tree[(_at_)id=current/@id]">
<error/>
</xsl:when>
<xsl:otherwise>
<tree id="{(_at_)id}">
<xsl:apply-templates mode="remove-dupes"/>
</tree>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
There are two problems with this. First, generating the intermediate
result could run forever if the input is seriously malformed (e.g. a
loop, child refers to parent); checking as it is generated
would avoid
this. Second, the test "preceding::tree[(_at_)id=current/@id]" is O(n^2).
Can I use a key to avoid this? How can a key be applied to an
exsl:node-set() result?
XSLT often needs some lateral thinking to come up with a good
solution.
I hope that someone who's read this far has a trick to
solve this for me.
Thanks in advance!
--Phil.
--+------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--+--