Hello,
I have 2 questions:
1) I have a specific requirement and I am a bit stuck on what would be
the best way to handle it. In a nutshell, I need to modify the source
<p>
text text ‘ LINK-1 TEXT ’ TEXT TEXT <URL
weburl="XXX">XXX</URL> TEXT
<SOmething>TEXT</SOmething>
AND again
<INSIDE>SOME TEXT text ‘ LINK-2 TEXT ’ TEXT
<URL weburl="YYY">YYY</URL></INSIDE>
And can be more text with or without URL and TEXT like ‘ LINK-3
TEXT’
</p>
to (THE REQUIREMENT)
<p>
text text <a href="XXX"> LINK-1 TEXT </a> TEXT TEXT TEXT
<SOmething>TEXT</SOmething>
AND again <INSIDE>SOME TEXT text <a href="YYY"> LINK-2 TEXT </a> TEXT
</INSIDE>
And can be more text with or without URL and TEXT like ‘ LINK-3
TEXT’
</p>
What is required is this: for each <URL>, if the PRECEDING part of the
string contains text enclosed within ‘ and ’, then that text must
be converted to an <a href> link. After narrowing down to
p[URL], I am not sure what the best pattern would be to achieve the desired
result. Could you please suggest something? In the above sample, NOTE that
the last ‘ LINK-3 TEXT’ was left as it is
because it has no matching URL. Although I am using XSLT 1.0, if XSLT 2.0
can solve it more easily, please suggest that as well.
[SAMPLE Skeleton XML and XSL]
XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<something>
<blah-blah>Can have many child</blah-blah>
<nodeGroup>
<note id="does-not-matter-1">
<p>
<something><sup>1</sup></something>
some text here. <bidItem id="95522-1" vol="1"> Title Name,
Other details,
‘The
arms trade and corruption’, <i>Prospect</i>
Aug.2005</bidItem>.
<!-- NOTE: NO URL IN THIS CASE, WHICH IS FINE -->
</p>
</note>
<note id="does-not-matter-2">
<p> some text
‘Ex-Pentagon procurement executive gets jail time’, text text
&lt;
<url
webUrl="http://www.aaa.xx/bbb/ddd.htm">http://www.aaa.xx/bbb/ddd.htm</url>&gt;;
‘Former Air Force acquisition official released from
jail’, Government in 2005, &lt;
<url webUrl="http://www.aaa.xx/bbb/uuu.htm">SAME AS
@webUrl</url>&gt;; and
<bidItem id="95522-2">Author name., ‘Cashing in for
profit? Who cost taxpayers
billions in biggest Pentagon scandal in years?’, <i>60 Minutes</i>,
CBS, 5 Jan. 2005
</bidItem>, &lt; <url
webUrl="http://www.cbsnews.com/stories/2005/01/04/60II/main664652.shtml">SAME
AS @webUrl</url>&gt;.
<!-- HERE EACH URL HAS MATCHING ‘contents’
WHICH IS FINE -->
</p>
</note>
<note id="does-not-matter-3">
<p><something><sup>68</sup></something> This figure is
comprised of a fine of
£500 000 ($900 000) for ‘irregular
accounting practices’
in a Tanzanian deal for an inappropriate and overpriced air
radar system that was
tainted by allegations of high-level corruption, with
...($405 000)
costs..
£29.275 million ($52.695 million) going to Tanzania in
reparations. <bidItem
id="996522-31" title="BAE deal with Tanzania...">Evans, R. and
Lewis, P., ‘BAE deal with Tanzania:
military air traffic control—for country with no
airforce’, <i>The
Guardian</i>, 6 Feb. 2010</bidItem>; ‘Military
radar probe: the key suspects … and
the case against them’, <i>This Day</i> (Dar es
Salaam), 15 Feb. 2010; &lt;
<url
webUrl="http://www.judiciary.gov.uk/Resources/JCO/Documents/Judgments/r-v-bae-sentencing-remarks.pdf">SAME
AS @webUrl</url>&gt;.
<!--
ONLY ONE URL, BUT MANY ‘ in-between texts
’
So, the URL belongs only to the "‘
in-between texts ’" immediately preceding it
-->
</p>
</note>
</nodeGroup>
</something>
</root>
XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<xsl:apply-templates select="*"/>
</xsl:template>
<xsl:template match="*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:variable name="href-start">&lt;a href="</xsl:variable>
<xsl:variable name="href-mid">"&gt;</xsl:variable>
<xsl:variable name="href-finish">&lt;/a&gt;</xsl:variable>
<xsl:template match="note">
<xsl:copy>
<xsl:apply-templates
select="@*"/>
<xsl:apply-templates mode="url"/>
</xsl:copy>
</xsl:template>
<xsl:template match="p[url]" mode="url">
<!-- HERE, FOR EACH URL, IT SHOULD FORM AN HREF LINK, COVERING ANY
PRECEDING TEXT THAT APPEARS
IN-BETWEEN ‘ AND ’
Ref: MAIL
DESCRIPTION.
-->
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="p[not(url)]" mode="url">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="@*|text()|comment()|processing-instruction()">
<xsl:copy-of select="."/>
</xsl:template>
<!-- COMMENTED... SOME TRY ALONG THIS LINE
<xsl:template .... mode="url">
<xsl:copy>
<xsl:... test="contains(., '‘')">
<!-\-<xsl:apply-templates>
<xsl:sort select="substring-before(., '‘')"/>
</xsl:apply-templates>-\->
<xsl:value-of select="substring-before(., '‘')"/>
<xsl:value-of select="$href-start"
disable-output-escaping="yes"/>[@<xsl:value-of
select="following-sibling::url"/>]<xsl:value-of select="$href-mid"
disable-output-escaping="yes"/>
<xsl:value-of select="substring-after(., '‘')"/>
</xsl:...>
<xsl:... test="contains(., '’')">
<xsl:value-of select="substring-before(.,
'’')"/>
<xsl:value-of select="$href-finish"
disable-output-escaping="yes"/>
<xsl:value-of select="substring-after(., '’')"/>
</xsl:..>
<xsl:apply-templates .... mode="url"/>
</xsl:copy>
</xsl:template>
-->
</xsl:stylesheet>
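To make the requirement concrete, here is a rough, untested XSLT 2.0 sketch of the rule as I currently understand it. It is not meant as the answer, only as an illustration. It assumes that each ‘...’ phrase sits inside a single text node, that the address is taken from @webUrl, and that only the last ‘...’ phrase before each url gets linked. The variable names ($p, $u, $later) are just my own choices, and the literal angle brackets around each url are not cleaned up here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop each url element; its address moves into the generated link. -->
  <xsl:template match="p//url"/>

  <!-- Text nodes inside p that contain at least one ‘...’ phrase. -->
  <xsl:template match="p//text()[matches(., '‘[^‘’]*’')]">
    <xsl:variable name="p" select="ancestor::p[1]"/>
    <!-- Nearest following url that belongs to the same p (may be empty). -->
    <xsl:variable name="u" select="following::url[ancestor::p[1] is $p][1]"/>
    <!-- Quote-bearing text nodes that lie between this node and that url. -->
    <xsl:variable name="later"
        select="following::text()[. &lt;&lt; $u][matches(., '‘[^‘’]*’')]"/>
    <xsl:choose>
      <!-- Link only when this node holds the last phrase before the url. -->
      <xsl:when test="$u and empty($later)">
        <!-- The greedy first group makes group 2 the LAST phrase in the node. -->
        <xsl:analyze-string select="." regex="^(.*)‘([^‘’]*)’(.*)$" flags="s">
          <xsl:matching-substring>
            <xsl:value-of select="regex-group(1)"/>
            <a href="{$u/@webUrl}">
              <xsl:value-of select="regex-group(2)"/>
            </a>
            <xsl:value-of select="regex-group(3)"/>
          </xsl:matching-substring>
          <xsl:non-matching-substring>
            <xsl:value-of select="."/>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

This needs an XSLT 2.0 processor (Saxon, for example). A phrase that is split across child elements, such as one containing an <i>, would not be caught by this sketch, which is part of why I am asking for a better pattern.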
2) Additionally, when dealing with
such mixed content (I mean content containing both text and child elements),
what is the best way to split and handle elements and text separately?
Thanks, and I look forward to your suggestions,
Karl