xsl-list
[Top] [All Lists]

Re: [xsl] Word Ladders as an example of a "Find shortest path between two nodes in a graph" problem

2012-12-06 14:57:16
On Thu, Dec 6, 2012 at 10:29 AM, Hermann Stamm-Wilbrandt
<STAMMW(_at_)de(_dot_)ibm(_dot_)com> wrote:
Hi Dimitre,

you are right with O(|E|), but not with O(N^2).
Implementation of "5.c" was a bit sloppy, initEdges() is indeed O(N^2).

But if you have N real language words of length k, each single word can
have at most  k*26 = O(1) neighbors.

Testing whether a word "c_1 c_2 ... c_k" is among the "words" can be done
in
constant time, too, if your language allows for constant time array access
(just use an array of size 26^k = O(1)).

Sure, if you would allocate 8GB for this array (for k = 7).

Also, calculating the index in the array requires k steps (and
probably multiplication and addition at each such step).


Now with each node having constant number of neighbors O(|E|) = O(|V|) = O
(N).


I was speaking about graph traversal in general. One may argue that
the big O notation isn't too useful when the value of N is limited and
not too-big.

For example 26*N in this particular case is  bigger than log2(N) * N
(unless we have more than 3 million words). So, this "O(N)"
implementation  may take more actual time than an "O(N*log(N)))

To put it in other words, the constants do matter when N is limited.


Cheers,

Dimitre



Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/#!/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


|------------>
| From:      |
|------------>
  
------------------------------------------------------------------------------------------------------------------------------------------|
  |Dimitre Novatchev <dnovatchev(_at_)gmail(_dot_)com>                        
                                                                          |
  
------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| To:        |
|------------>
  
------------------------------------------------------------------------------------------------------------------------------------------|
  |xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com,                          
                                                                              
  |
  
------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Date:      |
|------------>
  
------------------------------------------------------------------------------------------------------------------------------------------|
  |12/06/2012 07:11 PM                                                        
                                                               |
  
------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Subject:   |
|------------>
  
------------------------------------------------------------------------------------------------------------------------------------------|
  |Re: [xsl] Word Ladders as an example of a "Find shortest path between two 
nodes in a graph" problem                                       |
  
------------------------------------------------------------------------------------------------------------------------------------------|





Hi Herman,

I think that in a near-worst case the time-complexity is O(N^2),
because every arc needs to be inspected and in a well-connected graph
with N vertices there can be N*(N-1) vertices.

This is also confirmed by Wikipedia and the 2-3 books on algoritms I have:

"The time complexity can be expressed as O(|E|) since every vertex and
every edge will be explored in the worst case. Note: O(|E|) may vary
between O(|V|) and O(|V|^2), depending on how sparse the input graph
is (assuming that the graph is connected)".


To summarize, this is a property of the graph and not of the
programming language that is used.


Cheers,
Dimitre



On Thu, Dec 6, 2012 at 10:01 AM, Hermann Stamm-Wilbrandt
<STAMMW(_at_)de(_dot_)ibm(_dot_)com> wrote:
Hi Dimitre,

That I have initially used a not probably the most efficient algorithm
/ implementation, shouldn't be used to make general conclusions about
the appropriateness of using XSLT in solving a particular class of
problems.

I agree with you -- and your solution is nice.

But breadth-first-search algorithm can be implemented as linear time
algorithm in C or C++ -- I doubt that you can do linear time
implementation in XSLT since constant time array access is missing ...


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/#!/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


|------------>
| From:      |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|

  |Dimitre Novatchev <dnovatchev(_at_)gmail(_dot_)com>
|

------------------------------------------------------------------------------------------------------------------------------------------|

|------------>
| To:        |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|

  |xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com,
|

------------------------------------------------------------------------------------------------------------------------------------------|

|------------>
| Date:      |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|

  |12/06/2012 05:50 PM
|

------------------------------------------------------------------------------------------------------------------------------------------|

|------------>
| Subject:   |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|

  |Re: [xsl] Word Ladders as an example of a "Find shortest path between
two nodes in a graph" problem                                       |

------------------------------------------------------------------------------------------------------------------------------------------|






Herman,

That I have initially used a not probably the most efficient algorithm
/ implementation, shouldn't be used to make general conclusions about
the appropriateness of using XSLT in solving a particular class of
problems.

Cheers,

Dimitre

On Thu, Dec 6, 2012 at 8:15 AM, Hermann Stamm-Wilbrandt
<STAMMW(_at_)de(_dot_)ibm(_dot_)com> wrote:
... I think that this
isn't something that should be solved in XSLT at all, except as an
academic exercise. ...

Agreed, nice XSLT solution, but not fast.

This simple C program does find the longest path (35) to angry in a
second (on a W520 Thinkpad) based on Dimitie's word list of length 5:
http://www.stamm-wilbrandt.de/en/xsl-list/5.c

$ time ./5 angry
yasht
yacht
pacht
pecht
wecht
wicht
wight
dight
digit
dimit
demit
remit
refit
befit
besit
beset
besee
belee
belve
beeve
breve
brave
brace
braca
araca
arara
amara
amala
alala
alula
aluta
abuta
abura
anura
anury
angry
35

real    0m1.046s
user    0m1.039s
sys     0m0.004s
$


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/#!/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


|------------>
| From:      |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|


  |Wolfgang Laun <wolfgang(_dot_)laun(_at_)gmail(_dot_)com>
|

------------------------------------------------------------------------------------------------------------------------------------------|


|------------>
| To:        |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|


  |xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com,
|

------------------------------------------------------------------------------------------------------------------------------------------|


|------------>
| Date:      |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|


  |11/28/2012 07:15 PM
|

------------------------------------------------------------------------------------------------------------------------------------------|


|------------>
| Subject:   |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|


  |Re: [xsl] Word Ladders as an example of a "Find shortest path between
two nodes in a graph" problem                                       |

------------------------------------------------------------------------------------------------------------------------------------------|







The fact that one XSLT program runs three times faster on one XSLT
implementation
than on another one is strange, *very* strange. But is Saxon 6.4 the
"dernier cri"?
I'd very much like to hear Michael Kay's opinion on this.

With Saxon HE 9.2.0 running with the -t option, I compare execution
times:
   1209 ms with 40065592 bytes for WL's solution
to
   2768 ms with 81184768 bytes for DN's solution.

Note: DN's solution being the one *without* the optimizations!

Not that this is conclusive. Algorithms like this one must be judged
by more than a single run:
they may behave well for small word lengths and small ladder sizes,
and scale badly, or
the other way round. (Dimitre and I aren't even using the same word
data, AFAIK.)

As an aside, I'd like to say that neither DN's nor WL's solution is
something that should
be used if this problem (i.e., shortest path) should ever need a
solution. I think that this
isn't something that should be solved in XSLT at all, except as an
academic exercise.
(Feel free to disagree - I'll not reply to anything contradicting me.)

Cheers


Ok, I was running it with Saxon 6.4

Now, the times are:

With Saxon:

Wolfgang's transformation: 25sec.

Dimitre's :                            39sec.


However, with XQSharp:

Wolfgang's transformation: 23sec.

Dimitre's :                            14sec.


Therefore, one can't say wich transformation is faster -- it depends
on the XSLT processor being used.


Cheers,
Dimitre



On Wed, Nov 28, 2012 at 7:27 AM, Dimitre Novatchev
<dnovatchev(_at_)gmail(_dot_)com>
wrote:
I get this error, trying to run your code:

SAXON 6.5.4 from Michael Kay
Java version 1.6.0_31
Loading my:my
Preparation time: 250 milliseconds
Processing file:/C:\XSLT Projects\WordLadders\Ver 0.2\dictGraph4.xml
Building tree for file:/C:\XSLT Projects\WordLadders\Ver
0.2\dictGraph4.xml using class com.icl.saxon.tinytree.TinyBuilder
Tree built in 351 milliseconds
Error at xsl:variable on line 23 of file:/(Untitled):
  Error in expression key('kFindWord', $pStartWord, $vDictGraph)
                    [count(../*)  lt  count(key('kFindWord',
$pTargetWord, $vDictGraph)/../* )]                        |
            key('kFindWord', $pTargetWord, $vDictGraph)
           [count(../*) le count(key('kFindWord',  $pStartWord,
$vDictGraph)/../*)]: expected "]", found "<name>"


Cheers,
Dimitre

On Wed, Nov 28, 2012 at 5:40 AM, Wolfgang Laun
<wolfgang(_dot_)laun(_at_)gmail(_dot_)com>
wrote:
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
   xmlns:my="my:my"
   xmlns:xs="http://www.w3.org/2001/XMLSchema";
   exclude-result-prefixes="my xs">

  <xsl:output method="text"/>

  <xsl:variable name="vDictGraph" select="/"/>
  <xsl:key name="kFindWord" match="w" use="."/>

  <xsl:param name="pStartWord"  select="'nice'" as="xs:string"/>
  <xsl:param name="pTargetWord" select="'evil'" as="xs:string"/>

  <xsl:variable name="vStartWord" as="xs:string"
                select="key('kFindWord', $pStartWord, $vDictGraph)
                         [count(../*)  lt  count(key('kFindWord',
$pTargetWord, $vDictGraph)/../* )]
                      |
                      key('kFindWord', $pTargetWord, $vDictGraph)
                         [count(../*) le count(key('kFindWord',
$pStartWord, $vDictGraph)/../*)]"/>

  <xsl:variable name="vTargetWord" as="xs:string"
                select="($pStartWord, $pTargetWord)[not(. eq
$vStartWord)]"/>

  <!-- This function iterates over the temporary tree



<result><arc level=".." from=".." to=".."/>...</result>
    to find

the
ladder. It starts at a node matching @to with
$vTargetWord


and
proceeds with decreasing @level. -->
  <xsl:function name="my:find-path" as="xs:string*">
    <xsl:param name="root"   as="node()"/>
    <xsl:param name="level"  as="xs:integer"/>
    <xsl:param name="start"  as="xs:string"/>
    <xsl:param name="target" as="xs:string"/>
    <xsl:param name="path"   as="xs:string"/>

    <xsl:for-each select="$root/result/arc[@level = $level and @to =
$target]">
      <xsl:variable name="from" select="./@from"/>
      <xsl:choose>
        <xsl:when test="$start eq $from">
          <xsl:value-of select="concat($from,'+',$path)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="my:find-path($root,$level
-1,$start,$from,concat($from,'+',$path))"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:function>

  <xsl:template match="/">
    <xsl:variable name='arcs'>
      <result>
      <xsl:call-template name="look-at-starts">
        <xsl:with-param name="level"  select="1"/>
        <xsl:with-param name="starts" select="$vStartWord"/>
        <xsl:with-param name="target" select="$vTargetWord"/>
        <xsl:with-param name="toskip" select="()"/>
      </xsl:call-template>
      </result>
    </xsl:variable>

    <xsl:variable name="finalArcs" select="$arcs/result/arc[@to =
$vTargetWord]"/>
    <xsl:value-of select="my:find-path($arcs, $finalArcs[1]/@level,
$vStartWord, $vTargetWord, $vTargetWord)"/>
  </xsl:template>

  <!-- Look at $starters nodes obtained from the current set of
words

     ending all incomplete ladders. Generate result/arc for
each hop
to

     the next step. Recurse if none of the arc
destinations is the

overall
     target word, otherwise
return the last hop. -->
  <xsl:template name="look-at-starts">
    <xsl:param name="level"  as="xs:integer"/>
    <xsl:param name="starts" as="xs:string*"/>
    <xsl:param name="target" as="xs:string"/>
    <xsl:param name="toskip" as="node()*"/>

    <xsl:variable name="starters" as="node()*"
                  select="key('kFindWord', $starts, $vDictGraph)/..
except $toskip"/>

    <xsl:for-each select="$starters">
      <xsl:variable name="w" select="./w"/>
      <xsl:for-each select="./nb">
        <arc level="{$level}" from="{$w}" to="{.}"/>
      </xsl:for-each>
    </xsl:for-each>

    <xsl:variable name="nbs" select="$starters/nb"/>

    <xsl:choose>
      <xsl:when test="$target = $nbs">
        <!--xsl:message select="'found a ladder'"/-->
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="look-at-starts">
          <xsl:with-param name="level"  select="$level + 1"/>
          <xsl:with-param name="starts"
select="distinct-values($nbs)"/>
          <xsl:with-param name="target" select="$target"/>
          <xsl:with-param name="toskip" select="$toskip union
$starters"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>




--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--

<Prev in Thread] Current Thread [Next in Thread>