Re: [xsl] Stylesheet Optimization -- How to Make It Faster

2006-11-28 05:56:05
Sorry for the messy sample files; my mail client removed the tabs.

I'm using Saxon 8.8J.

I have already switched to keys as you suggested, but I did not notice a change in processing time. I'll test more files just to be sure.

Here is my new XSL:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ati="http://www.asiatype.com/xslt-functions" exclude-result-prefixes="xs ati">
 <xsl:output method="xml" version="1.0" encoding="UTF-8"/>
<xsl:variable name="abbreviations" as="element()+" select="document('publishers_data.xml')/root/publisher/abbrev"/>
 <xsl:key name="abbrev" match="expanded" use="preceding-sibling::abbrev"/>
 <xsl:template match="/">
    <xsl:apply-templates/>
 </xsl:template>
<xsl:template match="text()[ancestor::ab and not(ancestor::note[@id and @n and @lang])][exists($abbreviations[matches(current(),concat('(^|\W)(',ati:escape(.),')($|\W)'))])]">
     <xsl:variable name="str" as="xs:string" select="."/>
<xsl:variable name="search-str" as="xs:string+" select="$abbreviations[matches($str,concat('(^|\W)(',ati:escape(.),')($|\W)'))]"/>
     <xsl:variable name="replace" as="element()*">
          <xsl:for-each select="$search-str">
              <xsl:variable name="abbr" as="xs:string" select="."/>
<abbr type="title" expand="{$abbreviations/key('abbrev', $abbr)}">
                  <xsl:value-of select="$abbr"/>
              </abbr>
          </xsl:for-each>
      </xsl:variable>
<xsl:sequence select="ati:replace-with-nodes($str, $search-str, $replace)"/>
   </xsl:template>
<xsl:template match="@*|element()|comment()|processing-instruction()" mode="#all">
       <xsl:copy>
           <xsl:apply-templates select="@*|node()"/>
       </xsl:copy>
   </xsl:template>
   <xsl:function name="ati:replace-with-nodes" as="node()+">
        <xsl:param name="input" as="xs:string"/>
        <xsl:param name="words-to-replace" as="xs:string*"/>
        <xsl:param name="replacement" as="node()*"/>
<xsl:variable name="regex" select="string-join(for $w in $words-to-replace return concat('(', ati:escape($w), ')'),'|')"/>
        <xsl:analyze-string select="$input" regex="{$regex}">
             <xsl:matching-substring>
                  <xsl:variable name="i" as="xs:integer" select="(1 to count($words-to-replace))[regex-group(.)]"/>
                  <xsl:sequence select="$replacement[$i]"/>
             </xsl:matching-substring>
         <xsl:non-matching-substring>
             <xsl:value-of select="."/>
         </xsl:non-matching-substring>
       </xsl:analyze-string>
    </xsl:function>
    <xsl:function name="ati:escape">
       <xsl:param name="s" as="xs:string"/>
<xsl:sequence select="replace($s,'[\\\|\.\-\^\?\*\+\(\)\{\}\[\]\$]','\\$0')"/>
    </xsl:function>
</xsl:stylesheet>
Here's a short version of publishers_data.xml:

<root>
  <publisher>
    <abbrev>Inschriften von Priene</abbrev>
    <expanded>Inschriften von Priene</expanded>
  </publisher>
  <publisher>
    <abbrev>P. Mil. Congr. XVIII</abbrev>
    <expanded>Papiri documentari dell'Università Cattolica di Milano</expanded>
  </publisher>
  <publisher>
    <abbrev>P. Jud. Des. Misc.</abbrev>
    <expanded>Discoveries in the Judean Desert XXXVIII</expanded>
  </publisher>
<!-- more publishers here -->
</root>

Here's a snippet of the source XML:

<!-- preceding::node() of ab -->
<ab lang="grk" n="1">
<foreign lang="grk">· γέγονε κατὰ τοὺς Δαρείου</foreign> <note place="margin">a c</note> <lb n="5"/> <foreign lang="grk">χρόνους τοῦ μετὰ Καμβύσην βασιλεύσαντος, ὅτε καὶ Διονύσιος ἦν ὁ Μιλήσιος</foreign> <lb/>(III), <foreign lang="grk">ἐπὶ τῆς ξ¯ε¯ ὀλυμπιάδος</foreign> (520/16)<foreign lang="grk">· ἱστοριογράφος. ῾Ηρόδοτος δὲ ὁ ῾Αλι-</foreign> <note place="margin">v</note> <lb/> <foreign lang="grk">καρνασεὺς ὠφέληται τούτου, νεώτερος ὤν. καὶ ἦν ἀκουστὴς Πρωταγόρου</foreign> <note id="n7" n="7" lang="ger"> <foreign lang="grk">ὤν· γέγονε γὰρ μετ᾽ αὐτόν</foreign> A</note> <lb/> <foreign lang="grk">ὁ ῾Εκαταῖος. πρῶτος δὲ ἱστορίαν πεζῶς ἐξήνεγκε, συγγραφὴν δὲ Φερεκύδης</foreign> <note id="n8—9" n="8—9" lang="ger"> <foreign lang="grk">πρῶτος—νοθεύεται</foreign> wiederholt s. <foreign lang="grk">ὶστορῆσαι</foreign>, s. <foreign lang="grk">συγγραφεῖς</foreign>.</note> <lb/>(I 3). <foreign lang="grk">τὰ γὰρ ᾽Ακουσιλάου</foreign> (<link type="boj" targets="a002" n="BOJTEXT002_T_7">2 T 7</link>) <foreign lang="grk">νοθεύεται.</foreign> <note id="n9" n="9" lang="ger"> <foreign lang="grk">᾽Ακουσιλάου</foreign> Vossius <foreign lang="grk">᾽Αγησιλάου</foreign> Suid</note> </ab>
<!-- following::node() of ab -->

All ab nodes appear at the same level (same depth) throughout.
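
For reference, a minimal sketch of a direct lookup using the three-argument form of key() in XSLT 2.0, which searches a given document instead of evaluating key() once per node in $abbreviations. It assumes the publishers_data.xml layout shown above. The key name publisher-by-abbrev, the $publishers variable and the hard-coded abbreviation are made up for illustration, and whether this is any faster is untested:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Lookup document, loaded once (hypothetical variable name) -->
  <xsl:variable name="publishers" as="document-node()"
      select="document('publishers_data.xml')"/>

  <!-- Hypothetical key: index each publisher element by its abbrev child -->
  <xsl:key name="publisher-by-abbrev" match="publisher" use="abbrev"/>

  <!-- Demo entry point: expand a single hard-coded abbreviation -->
  <xsl:template match="/">
    <xsl:variable name="abbr" as="xs:string" select="'P. Jud. Des. Misc.'"/>
    <!-- The third argument restricts the key search to the publishers document -->
    <abbr type="title"
        expand="{key('publisher-by-abbrev', $abbr, $publishers)/expanded}">
      <xsl:value-of select="$abbr"/>
    </abbr>
  </xsl:template>
</xsl:stylesheet>
```

With a key like this, the expand attribute in the main template would no longer need the $abbreviations/ path step in front of key().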

Any suggestions are welcome.

Thanks,
--
Jeff

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


