a kusa wrote:
Hello
There is a requirement to search for a particular pattern in XML
documents and replace each occurrence by reading another XML file and
copying over the replacement text corresponding to the original text. I
have been trying to use <xsl:analyze-string> in XSLT 2.0, but I am not sure
how to read another XML file using this tag.
As an example, if I have some text tagged within <para> tags:
<para> this is a simple text</para>
I have an external xml file of the form:
<matchtext>simple</matchtext>
<replacetext>hard</replacetext>
In my <xsl:matching-substring>, can I use doc() to read the external
XML file and replace the text?
Yes. Build a regular expression from the matchtext values in the external
file and use it in the regex attribute. Here is a sample:
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xsd"
  version="2.0">

  <xsl:param name="rep-file" as="xsd:string"
    select="'test2011032302.xml'"/>

  <xsl:variable name="rep-doc" as="document-node()"
    select="doc($rep-file)"/>

  <xsl:variable name="rep-pattern" as="xsd:string"
    select="string-join($rep-doc/replacements/replacement/matchtext,
            '|')"/>

  <xsl:key name="rep-key" match="replacement" use="matchtext"/>

  <xsl:template match="para">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="para//text()">
    <xsl:analyze-string select="." regex="{$rep-pattern}">
      <xsl:matching-substring>
        <xsl:value-of select="key('rep-key', ., $rep-doc)/replacetext"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
This assumes you have a file test2011032302.xml with the following content:
<replacements>
  <replacement>
    <matchtext>simple</matchtext>
    <replacetext>hard</replacetext>
  </replacement>
</replacements>
There are some shortcomings, namely that word boundaries like \b are not
supported by the XSLT/XPath regular expression language, so it is
difficult to prevent e.g. "simple" in "simpleminds" from being
replaced. If your XSLT 2.0 processor is AltovaXML Tools then I think it
supports \b, however.
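One possible workaround for the missing \b, sketched here under the assumption that every matchtext is a single word without punctuation or spaces: instead of building an alternation from the matchtext values, match every run of word characters with \w+ and look the whole word up with the key. This is not the approach of the sample above, only a variation on it; it reuses the same rep-file parameter, rep-doc variable, and rep-key key.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xsd"
  version="2.0">

  <xsl:param name="rep-file" as="xsd:string"
    select="'test2011032302.xml'"/>

  <xsl:variable name="rep-doc" as="document-node()"
    select="doc($rep-file)"/>

  <xsl:key name="rep-key" match="replacement" use="matchtext"/>

  <xsl:template match="para">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="para//text()">
    <!-- Split the text into whole words; everything else passes through. -->
    <xsl:analyze-string select="." regex="\w+">
      <xsl:matching-substring>
        <!-- Look up the complete word, so "simpleminds" never hits "simple". -->
        <xsl:variable name="rep"
          select="key('rep-key', ., $rep-doc)/replacetext"/>
        <!-- Use the first replacement if one exists, else keep the word. -->
        <xsl:value-of select="if ($rep) then $rep[1] else ."/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With the replacement file shown above, "simple" becomes "hard" while "simpleminds" is left alone, because the key lookup is done on the complete word. The trade-off is that a matchtext spanning several words, or containing punctuation, would never be found this way.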
Another problem occurs if the matchtext contains characters that are
metacharacters in regular expressions, like '?' or ')'; you would first
need to escape them with a function like
http://www.xsltfunctions.com/xsl/functx_escape-for-regex.html.
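As a rough sketch of how that escaping could be wired into the sample, the stylesheet below defines a small escaping function and applies it to each matchtext before joining them with '|'. The my prefix, its namespace URI, and the function name my:escape-for-regex are made up for this illustration; the function body is modelled on the FunctX function linked above, not copied from any library you need to install.

```xml
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:my="http://example.com/my-functions"
  exclude-result-prefixes="xsd my"
  version="2.0">

  <!-- Hypothetical helper: puts a backslash before each regex metacharacter. -->
  <xsl:function name="my:escape-for-regex" as="xsd:string">
    <xsl:param name="arg" as="xsd:string?"/>
    <xsl:sequence select="replace($arg,
      '(\.|\[|\]|\\|\||\-|\^|\$|\?|\*|\+|\{|\}|\(|\))', '\\$1')"/>
  </xsl:function>

  <xsl:param name="rep-file" as="xsd:string"
    select="'test2011032302.xml'"/>

  <xsl:variable name="rep-doc" as="document-node()"
    select="doc($rep-file)"/>

  <!-- Escape every matchtext first, then join the results with '|'. -->
  <xsl:variable name="rep-pattern" as="xsd:string"
    select="string-join(
              for $m in $rep-doc/replacements/replacement/matchtext
              return my:escape-for-regex($m),
              '|')"/>

  <xsl:key name="rep-key" match="replacement" use="matchtext"/>

  <xsl:template match="para">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="para//text()">
    <xsl:analyze-string select="." regex="{$rep-pattern}">
      <xsl:matching-substring>
        <!-- The matched substring is the literal matchtext, so the key still applies. -->
        <xsl:value-of select="key('rep-key', ., $rep-doc)/replacetext"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Only the pattern is escaped. The key is still built from the raw matchtext values, which is what the matched substring will contain.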
--
Martin Honnen
http://msmvps.com/blogs/martin_honnen/