I'm coming late to this thread, and I think pretty much everything has been
said, but I have a question.
I notice that most XSLT 1.0 solutions use recursive named templates, and I
wondered whether there is any benefit in a solution that instead re-applies
templates to text nodes repeatedly, rather than explicitly calling a
recursive template. For moderately big files it seems to perform quite well,
though I haven't tested it much, so there may be hidden problems. I guess this
approach is essentially the same as recursion, but you don't have to work out
which of the search strings occurs first in the text (though you do need a
node-set() extension function; I use MSXML's here).
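For comparison, here is a minimal sketch of the explicit recursive style I mean. It is not taken from any earlier post in the thread; the template name "mark-word" and its "word" parameter are made up for illustration, and it only handles a single search word. Supporting several words this way is where you would have to work out which one occurs first.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hand each text node to the recursive named template. -->
  <xsl:template match="text()" priority="2">
    <xsl:call-template name="mark-word">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: wraps every occurrence of $word in <special>. -->
  <xsl:template name="mark-word">
    <xsl:param name="text"/>
    <xsl:param name="word" select="'markup'"/>
    <xsl:choose>
      <xsl:when test="contains($text, $word)">
        <!-- Text before the first occurrence contains no match; emit it as is. -->
        <xsl:value-of select="substring-before($text, $word)"/>
        <special><xsl:value-of select="$word"/></special>
        <!-- Recurse explicitly on the remainder of the string. -->
        <xsl:call-template name="mark-word">
          <xsl:with-param name="text" select="substring-after($text, $word)"/>
          <xsl:with-param name="word" select="$word"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

This version needs no extension function, because the recursion works on strings rather than on nodes.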
Mark_up_text.xml:
<node>
<para>
This is a sample document that deals with markup of <emph>text</emph>.
</para>
<para> When one applies <emph>markup</emph> to a large document, one
is faced with
a <def>time-consuming</def> effort.
</para>
<para att="document markup">lkj markup kjlkj document lkj document
;lkj markup lkj;slakfj markup document</para>
</node>
Mark_up_text.xsl:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxml="urn:schemas-microsoft-com:xslt">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <!-- priority is needed to win over the node() match below -->
  <xsl:template match="text()" priority="2">
    <xsl:call-template name="markup">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>
  <!-- identity template: copy everything else unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
  <xsl:template name="markup">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, 'document')">
        <!-- node-set() turns each substring into a text node, so the
             text() template above is applied to it again -->
        <xsl:apply-templates
            select="msxml:node-set(substring-before($text,'document'))"/>
        <xsl:element name="special">document</xsl:element>
        <xsl:apply-templates
            select="msxml:node-set(substring-after($text,'document'))"/>
      </xsl:when>
      <xsl:when test="contains($text, 'markup')">
        <xsl:apply-templates
            select="msxml:node-set(substring-before($text,'markup'))"/>
        <xsl:element name="special">markup</xsl:element>
        <xsl:apply-templates
            select="msxml:node-set(substring-after($text,'markup'))"/>
      </xsl:when>
      <!-- no search word left: output the text itself -->
      <xsl:otherwise><xsl:value-of select="$text"/></xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
And my output is:
<node>
<para>
This is a sample <special>document</special> that deals with
<special>markup</special> of <emph>text</emph>. </para>
<para> When one applies <emph><special>markup</special></emph> to a large
<special>document</special>, one is faced with
a <def>time-consuming</def> effort.
</para>
<para att="document markup">lkj <special>markup</special> kjlkj
<special>document</special> lkj <special>document</special> ;lkj
<special>markup</special> lkj;slakfj <special>markup</special>
<special>document</special></para>
</node>
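Since the stylesheet above relies on MSXML's node-set() accepting a string, here is a sketch of how the same idea might look with EXSLT's exsl:node-set() for other processors. This is an assumption on my part and untested: it presumes the processor supports the http://exslt.org/common namespace, and the template name "split-on" is made up. Each substring is first placed in a variable (a result tree fragment), which is then converted to a node-set so that its text node can be selected and have templates applied to it again.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pick any search word the text node contains; order does not matter,
       because both halves are processed again. -->
  <xsl:template match="text()" priority="2">
    <xsl:choose>
      <xsl:when test="contains(., 'document')">
        <xsl:call-template name="split-on">
          <xsl:with-param name="word" select="'document'"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:when test="contains(., 'markup')">
        <xsl:call-template name="split-on">
          <xsl:with-param name="word" select="'markup'"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Hypothetical helper: splits the current text node at the first $word
       and re-applies templates to the text on either side. -->
  <xsl:template name="split-on">
    <xsl:param name="word"/>
    <xsl:variable name="before">
      <xsl:value-of select="substring-before(., $word)"/>
    </xsl:variable>
    <xsl:variable name="after">
      <xsl:value-of select="substring-after(., $word)"/>
    </xsl:variable>
    <!-- An empty substring yields no text node, so nothing is selected. -->
    <xsl:apply-templates select="exsl:node-set($before)/text()"/>
    <special><xsl:value-of select="$word"/></special>
    <xsl:apply-templates select="exsl:node-set($after)/text()"/>
  </xsl:template>

</xsl:stylesheet>
```

Going through a variable avoids depending on how a given processor treats a string passed directly to node-set().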
Thanks,
David.
--
David McNally Moody's Investors Service
Software Engineer 99 Church St, NY NY 10007
David(_dot_)McNally(_at_)Moodys(_dot_)com (212) 553-7475
-----Original Message-----
From: Jim Melton [mailto:jim(_dot_)melton(_at_)acm(_dot_)org]
Sent: Thursday, July 03, 2003 4:28 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Cc: jim(_dot_)melton(_at_)acm(_dot_)org
Subject: [xsl] Using XSLT to add markup to a document
Gentlepeople,
I'm struggling with a problem that I fear isn't easily solved with XSLT,
but there are many experts on this list who might be able to help. The
brief summary of my problem is that I want to find certain words that
appear in paragraphs throughout a very large (XML) document and mark up
those words without making any other changes to my document.
For example, consider a document with the following fragment:
<para>
This is a sample document that deals with markup of <emph>text</emph>.
</para>
<para>
When one applies <emph>markup</emph> to a large document, one is faced with
a <def>time-consuming</def> effort.
</para>
If one of the words to which I wish to apply markup is "markup" and another
is "document", then I would want the result to be something like this:
<para>
This is a sample <special>document</special> that deals with
<special>markup</special> of <emph>text</emph>.
</para>
<para>
When one applies <emph><special>markup</special></emph> to a large
<special>document</special>, one is faced with a
<def>time-consuming</def> effort.
</para>
As you see from this example, I want to *add* markup to the words I have
found where they appear in my result tree, but copy everything else in my
document to the output tree unchanged.
I tend to use Saxon (currently using 6.5.2) as my primary XSLT engine, but
I also have Microsoft's MSXML 4.0 (and could undoubtedly find others if
required to do so).
Any guidance or advice?
Many thanks,
Jim
========================================================================
Jim Melton --- Editor of ISO/IEC 9075-* (SQL)
Oracle Corporation
1930 Viscounti Drive
Sandy, UT 84093-1063
USA
Phone: +1.801.942.0144
Fax: +1.801.942.3345
Oracle email: mailto:jim(_dot_)melton(_at_)oracle(_dot_)com
Standards email: mailto:jim(_dot_)melton(_at_)acm(_dot_)org
Personal email: mailto:jim(_at_)melton(_dot_)name
========================================================================
= Facts are facts. However, any opinions expressed are the opinions =
= only of myself and may or may not reflect the opinions of anybody =
= else with whom I may or may not have discussed the issues at hand. =
========================================================================
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list