
Re: [xsl] Processing two documents, which order?

2011-04-07 09:47:42
You'll probably need to run some tests to verify performance of
various approaches, but my hunch would be to combine the list of words
into a single regex, let the regex implementation optimize it and do a
single pass over the document.

Based on Michael's comments, you probably want to build the regex once
in a global variable. If Saxon doesn't recognize that the regex is then
the same on every call, you could instead dynamically build the final
transform that will actually process document (2), and run it as a
separate step.
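
A minimal sketch of the global-variable approach, in case it helps. The
file name words.xml, its <words>/<word> structure and the variable names
are assumptions for illustration, not something from Dave's documents.
It also assumes the words contain only letters and digits; anything with
regex metacharacters would need escaping first.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical location and shape of document (1):
       <words><word>...</word></words> -->
  <xsl:variable name="words" select="document('words.xml')/words/word"/>

  <!-- Built once per transformation: all words joined into one alternation.
       Assumes the words contain no regex metacharacters. -->
  <xsl:variable name="word-regex" as="xs:string"
      select="string-join($words, '|')"/>

  <!-- Identity copy for everything that is not a text node -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Single pass over each text node of document (2) -->
  <xsl:template match="text()" priority="1">
    <xsl:analyze-string select="."
        regex="({$word-regex})([\s\p{{P}}])">
      <xsl:matching-substring>
        <property><xsl:value-of select="regex-group(1)"/></property>
        <!-- Put back the whitespace or punctuation that followed the word -->
        <xsl:value-of select="regex-group(2)"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Like the original template, this has no check on what precedes a word,
so a listed word can also match at the end of a longer word. Whether the
processor compiles the regex only once is still something to measure.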

-Brandon :)


On Thu, Apr 7, 2011 at 9:25 AM, Dave Pawson 
<davep(_at_)dpawson(_dot_)co(_dot_)uk> wrote:


I have two xml documents.
The first is a list of marked up words (1),
the second a 'normal' xml document (2)

For each occurrence in 2 of a word from 1
I need to mark up the word with <property> </property>

Which order is anywhere near optimum?
Document 1 has about 300 words,
Document 2 is 33,000 lines.

This is the template to do the work

<xsl:template match="*">
   <xsl:param name="property" as="xs:string"/>
   <!-- Group 1 is the word; group 2 is the whitespace or punctuation after it -->
   <xsl:analyze-string select="." regex="({$property})([\s\p{{P}}])">
     <xsl:matching-substring>
<!--    <xsl:message>match on [<xsl:value-of select='regex-group(1)'/>]</xsl:message> -->
       <property><xsl:value-of select="regex-group(1)"/></property>
       <!-- Re-emit the trailing delimiter so it is not dropped from the output -->
       <xsl:value-of select="regex-group(2)"/>
     </xsl:matching-substring>
     <xsl:non-matching-substring>
       <xsl:value-of select="."/>
     </xsl:non-matching-substring>
   </xsl:analyze-string>
 </xsl:template>

but I'm hesitating as to which loop sequence will work best?


--

regards

--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
