Re: [xsl] Re: Checking for Cycles in a Graph

For an XSLT 1.0 solution see also:

   http://lists.xml.org/archives/xml-dev/200401/msg00444.html

My recent blog entry on implementing a generic closure() function in
FXSL also has an example with the "reachable" relation on the set of
nodes of a graph:

  http://dnovatchev.spaces.live.com/Blog/cns!44B0A32C2CCF7488!384.entry


-- 
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play



On Wed, Apr 30, 2008 at 6:17 PM, Michael Kay <mike(_at_)saxonica(_dot_)com> 
wrote:


I think I found a bug in some example code from XSLT 2.0
Programmer's Reference by Michael Kay and would like to
review it with Michael and/or other XSLT programmers.


You are right, this code published in the 3rd edition is embarrassingly
wrong. There's a corrected version in the 4th edition, which I believe is
now shipping. Sorry about the inconvenience.

The corrected algorithm passes an extra parameter on each recursive call,
which is the list of nodes marking the route from the starting node to the
node currently being visited. If the nto currently being visited contains a
reference to any of the nodes in this list, there is a cycle in the data,
which can be reported.in

Most of the textbook algorithms for detecting cycles in graphs are written
in a procedural way, and the error arose because I tried to rethink the
problem in functional terms (and failed to get it right).

Michael Kay
http://www.saxonica.com/


p.199 Checking for Cycles in a Graph:

<xsl:function name="graph:refers" as="xs:boolean">
   <xsl:param name="links" as="node()"/>
   <!-- $links is a node that represents the template to be called -->
   <xsl:param name="A" as="node()"/>
   <xsl:param name="B" as="node()"/>

   <!-- find the directly-connected nodes -->
   <xsl:variable name="direct" as="node()*">
      <xsl:apply-templates select="$links">
         <xsl:with-param name="from" select="$A"/>
      </xsl:apply-templates>
   </xsl:variable>

   <!-- return true if B is directly or indirectly connected
from A -->
   <xsl:sequence select="if (exists($direct intersect $B)) then true()
                         else if (some $C in $direct
                                  satisfies graph:refers($links, $C,
$B)) then true()
                         else false()"/> </xsl:function>

I have defined a template for $links as required.  (But it is
not necessary to see the bug because I think the bug is in
the higher order function noted above.)

Typically cycles are checked for by evaluating:
graph:refers($something, $something).  If there is connection
through $something's direct links and their direct links and
so on, then there is a cycle.

The problem is that "graph:refers" may enter an infinite loop
if $direct contains a cycle.  Consider the case where $A does
not refer to $B but one of $A's direct links contain a cycle.
 In the XPath for xsl:sequence, the function recursively
calls itself with "graph:refers($links, $C, $B)".  $C becomes
$A in the next iteration, and its direct links are found in
order to test if one of them refers back to $B.  But notice
that there is no check if $C contains a cycle!
 So it is possible that $C's direct links could be navigated
forever....without reaching $B first.  (I am actually
experiencing this problem with a large data set and haven't
had time to make a simpler use-case.  But I am hoping that
people can follow the problem generically with the thought
exercise I just described.)

So it seems like $C the algorithm should check if $C contains
a cycle before testing "graph:refers($links,$C,$B)"... but
after some thinking, I am realizing that just because $C
contains a cycle, doesn't mean that $A doesn't refers to $B.
Perhaps the solution is to test if $C contains a cycle, and
if it does, remove the cycle(s) -> creating $D.  And then
test "graph:refers(l$inks,$D,$B)".

This seems complicated though... and I thought I should
discuss this issue with others before I continue to hack at
this algorithm.  It seems like this type of algorithm/problem
would be well-known for graphs?

Has anyone explored this problem?  Any suggestions?

Thanks
Darcy

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--