xsl-list
[Top] [All Lists]

Re: [xsl] Word Ladders as an example of a "Find shortest path between two nodes in a graph" problem

2012-12-07 02:21:54
Hello Hermann,

nevertheless comparisons should be made on an equal basis. The C program
doesn't search for a ladder from A to B (which might be implemented easily),
and it uses a compiled-in dictionary, which saves time.

When I compare 5.c "as is" to my (!) XSLT version of finding the ladder from
"yasht" to "angry", it's 1.3sec (/usr/bin/time) to 1.9sec (saxon9he
-t). (Dimitre's version #1 takes 26.3sec.)

And if I /usr/bin/time my Java version that finds *all* ladders
between "yasht" to "angry", it's 0.6sec...

But that's not the key issue for me. Let me put it this way: if you
would plan for a widely portable SW product "Word Ladders" (relying on
OS SW only) with the joint capabilities of finding all ladders between
words or all ladders of a given length, with a user interface for end
users and another one for administrators (for updating the
dictionary), what would you use?

Cheers
Wolfgang

On 06/12/2012, Hermann Stamm-Wilbrandt <STAMMW(_at_)de(_dot_)ibm(_dot_)com> 
wrote:
Hi Dimitre,

That I have initially used a not probably the most efficient algorithm
/ implementation, shouldn't be used to make general conclusions about
the appropriateness of using XSLT in solving a particular class of
problems.

I agree with you -- and your solution is nice.

But breadth-first-search algorithm can be implemented as linear time
algorithm in C or C++ -- I doubt that you can do linear time
implementation in XSLT since constant time array access is missing ...


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/#!/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


|------------>
| From:      |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|
  |Dimitre Novatchev <dnovatchev(_at_)gmail(_dot_)com>
                                                                |

------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| To:        |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|
  |xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com,
                                                                |

------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Date:      |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|
  |12/06/2012 05:50 PM
                                                                |

------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Subject:   |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|
  |Re: [xsl] Word Ladders as an example of a "Find shortest path between two
nodes in a graph" problem                                       |

------------------------------------------------------------------------------------------------------------------------------------------|





Herman,

That I have initially used a not probably the most efficient algorithm
/ implementation, shouldn't be used to make general conclusions about
the appropriateness of using XSLT in solving a particular class of
problems.

Cheers,

Dimitre

On Thu, Dec 6, 2012 at 8:15 AM, Hermann Stamm-Wilbrandt
<STAMMW(_at_)de(_dot_)ibm(_dot_)com> wrote:
... I think that this
isn't something that should be solved in XSLT at all, except as an
academic exercise. ...

Agreed, nice XSLT solution, but not fast.

This simple C program does find the longest path (35) to angry in a
second (on a W520 Thinkpad) based on Dimitie's word list of length 5:
http://www.stamm-wilbrandt.de/en/xsl-list/5.c

$ time ./5 angry
yasht
yacht
pacht
pecht
wecht
wicht
wight
dight
digit
dimit
demit
remit
refit
befit
besit
beset
besee
belee
belve
beeve
breve
brave
brace
braca
araca
arara
amara
amala
alala
alula
aluta
abuta
abura
anura
anury
angry
35

real    0m1.046s
user    0m1.039s
sys     0m0.004s
$


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/#!/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


|------------>
| From:      |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|

  |Wolfgang Laun <wolfgang(_dot_)laun(_at_)gmail(_dot_)com>
|

------------------------------------------------------------------------------------------------------------------------------------------|

|------------>
| To:        |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|

  |xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com,
|

------------------------------------------------------------------------------------------------------------------------------------------|

|------------>
| Date:      |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|

  |11/28/2012 07:15 PM
|

------------------------------------------------------------------------------------------------------------------------------------------|

|------------>
| Subject:   |
|------------>

------------------------------------------------------------------------------------------------------------------------------------------|

  |Re: [xsl] Word Ladders as an example of a "Find shortest path between
two nodes in a graph" problem                                       |

------------------------------------------------------------------------------------------------------------------------------------------|






The fact that one XSLT program runs three times faster on one XSLT
implementation
than on another one is strange, *very* strange. But is Saxon 6.4 the
"dernier cri"?
I'd very much like to hear Michael Kay's opinion on this.

With Saxon HE 9.2.0 running with the -t option, I compare execution
times:
   1209 ms with 40065592 bytes for WL's solution
to
   2768 ms with 81184768 bytes for DN's solution.

Note: DN's solution being the one *without* the optimizations!

Not that this is conclusive. Algorithms like this one must be judged
by more than a single run:
they may behave well for small word lengths and small ladder sizes,
and scale badly, or
the other way round. (Dimitre and I aren't even using the same word
data, AFAIK.)

As an aside, I'd like to say that neither DN's nor WL's solution is
something that should
be used if this problem (i.e., shortest path) should ever need a
solution. I think that this
isn't something that should be solved in XSLT at all, except as an
academic exercise.
(Feel free to disagree - I'll not reply to anything contradicting me.)

Cheers


Ok, I was running it with Saxon 6.4

Now, the times are:

With Saxon:

Wolfgang's transformation: 25sec.

Dimitre's :                            39sec.


However, with XQSharp:

Wolfgang's transformation: 23sec.

Dimitre's :                            14sec.


Therefore, one can't say wich transformation is faster -- it depends
on the XSLT processor being used.


Cheers,
Dimitre



On Wed, Nov 28, 2012 at 7:27 AM, Dimitre Novatchev
<dnovatchev(_at_)gmail(_dot_)com>
wrote:
I get this error, trying to run your code:

SAXON 6.5.4 from Michael Kay
Java version 1.6.0_31
Loading my:my
Preparation time: 250 milliseconds
Processing file:/C:\XSLT Projects\WordLadders\Ver 0.2\dictGraph4.xml
Building tree for file:/C:\XSLT Projects\WordLadders\Ver
0.2\dictGraph4.xml using class com.icl.saxon.tinytree.TinyBuilder
Tree built in 351 milliseconds
Error at xsl:variable on line 23 of file:/(Untitled):
  Error in expression key('kFindWord', $pStartWord, $vDictGraph)
                    [count(../*)  lt  count(key('kFindWord',
$pTargetWord, $vDictGraph)/../* )]                        |
            key('kFindWord', $pTargetWord, $vDictGraph)
           [count(../*) le count(key('kFindWord',  $pStartWord,
$vDictGraph)/../*)]: expected "]", found "<name>"


Cheers,
Dimitre

On Wed, Nov 28, 2012 at 5:40 AM, Wolfgang Laun
<wolfgang(_dot_)laun(_at_)gmail(_dot_)com>
wrote:
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
   xmlns:my="my:my"
   xmlns:xs="http://www.w3.org/2001/XMLSchema";
   exclude-result-prefixes="my xs">

  <xsl:output method="text"/>

  <xsl:variable name="vDictGraph" select="/"/>
  <xsl:key name="kFindWord" match="w" use="."/>

  <xsl:param name="pStartWord"  select="'nice'" as="xs:string"/>
  <xsl:param name="pTargetWord" select="'evil'" as="xs:string"/>

  <xsl:variable name="vStartWord" as="xs:string"
                select="key('kFindWord', $pStartWord, $vDictGraph)
                         [count(../*)  lt  count(key('kFindWord',
$pTargetWord, $vDictGraph)/../* )]
                      |
                      key('kFindWord', $pTargetWord, $vDictGraph)
                         [count(../*) le count(key('kFindWord',
$pStartWord, $vDictGraph)/../*)]"/>

  <xsl:variable name="vTargetWord" as="xs:string"
                select="($pStartWord, $pTargetWord)[not(. eq
$vStartWord)]"/>

  <!-- This function iterates over the temporary tree


<result><arc level=".." from=".." to=".."/>...</result>
    to find
the
ladder. It starts at a node matching @to with
$vTargetWord

and
proceeds with decreasing @level. -->
  <xsl:function name="my:find-path" as="xs:string*">
    <xsl:param name="root"   as="node()"/>
    <xsl:param name="level"  as="xs:integer"/>
    <xsl:param name="start"  as="xs:string"/>
    <xsl:param name="target" as="xs:string"/>
    <xsl:param name="path"   as="xs:string"/>

    <xsl:for-each select="$root/result/arc[@level = $level and @to =
$target]">
      <xsl:variable name="from" select="./@from"/>
      <xsl:choose>
        <xsl:when test="$start eq $from">
          <xsl:value-of select="concat($from,'+',$path)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="my:find-path($root,$level
-1,$start,$from,concat($from,'+',$path))"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:function>

  <xsl:template match="/">
    <xsl:variable name='arcs'>
      <result>
      <xsl:call-template name="look-at-starts">
        <xsl:with-param name="level"  select="1"/>
        <xsl:with-param name="starts" select="$vStartWord"/>
        <xsl:with-param name="target" select="$vTargetWord"/>
        <xsl:with-param name="toskip" select="()"/>
      </xsl:call-template>
      </result>
    </xsl:variable>

    <xsl:variable name="finalArcs" select="$arcs/result/arc[@to =
$vTargetWord]"/>
    <xsl:value-of select="my:find-path($arcs, $finalArcs[1]/@level,
$vStartWord, $vTargetWord, $vTargetWord)"/>
  </xsl:template>

  <!-- Look at $starters nodes obtained from the current set of words

     ending all incomplete ladders. Generate result/arc for each hop
to

     the next step. Recurse if none of the arc destinations is the

overall
     target word, otherwise return the last hop. -->
  <xsl:template name="look-at-starts">
    <xsl:param name="level"  as="xs:integer"/>
    <xsl:param name="starts" as="xs:string*"/>
    <xsl:param name="target" as="xs:string"/>
    <xsl:param name="toskip" as="node()*"/>

    <xsl:variable name="starters" as="node()*"
                  select="key('kFindWord', $starts, $vDictGraph)/..
except $toskip"/>

    <xsl:for-each select="$starters">
      <xsl:variable name="w" select="./w"/>
      <xsl:for-each select="./nb">
        <arc level="{$level}" from="{$w}" to="{.}"/>
      </xsl:for-each>
    </xsl:for-each>

    <xsl:variable name="nbs" select="$starters/nb"/>

    <xsl:choose>
      <xsl:when test="$target = $nbs">
        <!--xsl:message select="'found a ladder'"/-->
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="look-at-starts">
          <xsl:with-param name="level"  select="$level + 1"/>
          <xsl:with-param name="starts"
select="distinct-values($nbs)"/>
          <xsl:with-param name="target" select="$target"/>
          <xsl:with-param name="toskip" select="$toskip union
$starters"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>



--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant
intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs.Will they
write all patents, too? :)
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.



--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs.Will they
write all patents, too? :)
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--





--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs.Will they
write all patents, too? :)
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--





--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--

<Prev in Thread] Current Thread [Next in Thread>