xsl-list

Re: [xsl] Processing milestoned XML leads to many preceding:: calls and horrible performance

2012-02-21 03:59:02
This is probably not the answer to the question, but looking at the
source, it seems the <vers> elements are numbered, starting at one at
each new <kap> (I looked at all 50 <kap> elements in
http://mcepl.fedorapeople.org/tmp/zdroje/11_Gn.xml).

So, wouldn't this work:

<xsl:variable name="curPos" select="@n"/>
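For context, here is a minimal sketch of how the quoted templates might look with that one change. It assumes that @n always restarts at 1 right after each <kap/> milestone, which is what the Genesis file suggests but which has not been checked for the other books. The genRef and endVerse templates are copied from the original message so that the stylesheet stands on its own; only the curPos variable differs, and the literal <verse/> result element is just a shorter spelling of the original xsl:element/xsl:attribute calls.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Unchanged from the original: book name plus nearest preceding chapter -->
    <xsl:template name="genRef">
        <xsl:variable name="refKniha" select="//kniha[1]/@jmeno"/>
        <xsl:variable name="refKapitola" select="preceding::kap[1]/@n"/>
        <xsl:value-of select="concat($refKniha,'.',$refKapitola,'.')"/>
    </xsl:template>

    <!-- Unchanged in behaviour: closes the previous verse milestone -->
    <xsl:template name="endVerse">
        <xsl:param name="rBase"/>
        <xsl:variable name="prevVerseID" select="preceding::vers[1]/@n"/>
        <verse eID="{concat($rBase,$prevVerseID)}"/>
    </xsl:template>

    <xsl:template match="vers">
        <xsl:variable name="refBase">
            <xsl:call-template name="genRef"/>
        </xsl:variable>
        <xsl:variable name="refID" select="concat($refBase,@n)"/>
        <!-- ASSUMPTION: @n restarts at 1 after every <kap/>, so the
             verse's own number replaces the expensive count() -->
        <xsl:variable name="curPos" select="@n"/>
        <!-- Every verse except the first in a chapter closes its predecessor -->
        <xsl:if test="not($curPos=1)">
            <xsl:call-template name="endVerse">
                <xsl:with-param name="rBase" select="$refBase"/>
            </xsl:call-template>
        </xsl:if>
        <verse sID="{$refID}" osisID="{$refID}"/>
    </xsl:template>

</xsl:stylesheet>
```

This still uses preceding::kap[1] and preceding::vers[1] once per verse, so it only removes the nested count() over following::*; whether that is enough on its own would need to be measured.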

Regards,
EB


2012/2/21 Matěj Cepl <mcepl(_at_)redhat(_dot_)com>:
Hi,

I am again working on an XSLT stylesheet to convert a Czech Bible translation
from a home-brew schema to OSIS, and I have run into some performance problems.

The whole stylesheet is at
https://gitorious.org/sword/czekms-csp_bible/blobs/master/CEP2OSIS.xsl (and
the git repo can be cloned from ...), but I believe the relevant parts are:

   <xsl:template name="genRef">
       <xsl:variable name="refKniha" select="//kniha[1]/@jmeno"/>
       <xsl:variable name="refKapitola" select="preceding::kap[1]/@n"/>
       <xsl:value-of select="concat($refKniha,'.',$refKapitola,'.')"/>
   </xsl:template>

   <xsl:template name="endVerse">
       <xsl:param name="rBase" />
       <xsl:element name="verse">
           <xsl:variable name="prevVerseID">
               <xsl:value-of select="./preceding::vers[1]/@n" />
           </xsl:variable>
           <xsl:attribute name="eID">
               <xsl:value-of select="concat($rBase,$prevVerseID)" />
           </xsl:attribute>
       </xsl:element>
   </xsl:template>

   <!-- ... -->

   <xsl:template match="vers">
       <xsl:variable name="refBase">
           <xsl:call-template name="genRef" />
       </xsl:variable>
       <xsl:variable name="refID" select="concat($refBase,./@n)" />
       <!-- Find out whether this is a first verse in a chapter; notice that
            <kap/> element is milestoned as well, so we have to count a
            distance in <verse/> elements from it, rather than use plain
            count() -->
       <xsl:variable name="curPos"
           select="count(./preceding::kap[1]/following::*[not(count(preceding-sibling::vers|current()) = count(preceding-sibling::vers))])" />
       <xsl:if test="not($curPos=1)">
           <xsl:call-template name="endVerse">
               <xsl:with-param name="rBase">
                   <xsl:value-of select="$refBase" />
               </xsl:with-param>
           </xsl:call-template>
       </xsl:if>
       <xsl:element name="verse">
           <xsl:attribute name="sID">
                   <xsl:value-of select="$refID" />
               </xsl:attribute>
           <xsl:attribute name="osisID">
                   <xsl:value-of select="$refID" />
               </xsl:attribute>
       </xsl:element>
   </xsl:template>

This works (at least as far as I was able to test it, given the
circumstances), but the performance is absolutely dreadful. Just the book of
Genesis took almost an hour to process (with one core of my
dual-core CPU constantly at 100%).

Obviously the problem is that <xsl:variable name="curPos"/>, and I have read
all over the Internet about how horribly inefficient the preceding* axes are,
but unfortunately I haven't figured out any other way to do what I am
doing, and most laments about the preceding* axes don't provide many hints
either.

The problem is (I think) that both <vers/> (that's "verse" in Czech) and
<kap/> (that's an abbreviation of "chapter") are just milestones, so I have
to go through all the verses in the whole book every time (yes, this is
http://www.joelonsoftware.com/articles/fog0000000319.html all over again).

Any ideas? Would some XSLT processor other than the xsltproc I am using
(libxml 20706, libxslt 10126 and libexslt 815) be able to optimize this
somehow?

Thanks a lot,

Matěj

--
http://www.ceplovi.cz/matej/, Jabber: mcepl<at>ceplovi.cz
GPG Finger: 89EF 4BC6 288A BF43 1BAB  25C3 E09F EF25 D964 84AC

в чужой монастырь со своим уставом не ходят.
   -- Russian proverb (this time actually checked by a native
      Russian)


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--

