On Sat, Aug 15, 2020 at 7:46 AM Wolfhart Totschnig
wolfhart(_dot_)totschnig(_at_)mail(_dot_)udp(_dot_)cl
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com>
wrote:
Dear list,
I would like to ask for your help with the following mixed-content
problem. I am receiving, from an external source, data in the following
form:
<title>THE TITLE OF THE BOOK WITH SOME <i>ITALICS</i> AND SOME MORE
WORDS: THE SUBTITLE OF THE BOOK WITH SOME <i>ITALICS</i></title>
What I would like to do is
1) separate the title from the subtitle (i.e., divide the data at the
colon) and put each in a separate element node;
2) all the while maintaining the <i> markup;
3) and perform certain string manipulations on all of the text nodes;
for the purposes of this post, I will use the example of converting
upper-case to lower-case.
So the desired output is the following:
<title>the title of the book with some <i>italics</i> and some more
words</title>
<subtitle>the subtitle of the book with some <i>italics</i></subtitle>
How can this be done?
I know that I can perform string manipulations while maintaining the <i>
markup with templates, i.e., <xsl:template match="text()"/> and
<xsl:template match="i"/>. But in this case I do not know how to divide
the data at the colon. And I know that I can divide the data at the
colon with <xsl:value-of select="substring-before(.,': ')"/>, but then I
lose the <i> markup. So I am at a loss.
I've come up with the following XSLT transform, which seems to work for this
use case:
<xsl:stylesheet version="3.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="xs">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="title">
<result>
<xsl:variable name="result_pass1" as="xs:string*">
<xsl:apply-templates select="node()" mode="pass1"/>
</xsl:variable>
<title>
<xsl:for-each
select="tokenize(normalize-space(substring-before(string-join($result_pass1,
''), ':')), '##')">
<xsl:call-template name="process_tokenize_result_item">
<xsl:with-param name="inpStr" select="."/>
</xsl:call-template>
</xsl:for-each>
</title>
<subtitle>
<xsl:for-each
select="tokenize(normalize-space(substring-after(string-join($result_pass1,
''), ':')), '##')">
<xsl:call-template name="process_tokenize_result_item">
<xsl:with-param name="inpStr" select="."/>
</xsl:call-template>
</xsl:for-each>
</subtitle>
</result>
</xsl:template>
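<!-- Called from inside xsl:for-each, so position() still refers to the
     token's position: even-numbered tokens were inside <i> in the input. -->
<xsl:template name="process_tokenize_result_item">
<xsl:param name="inpStr" as="xs:string"/>
<xsl:choose>
<xsl:when test="position() mod 2 = 0">
<i>
<xsl:value-of select="$inpStr"/>
</i>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$inpStr"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>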
<xsl:template match="node()" mode="pass1">
<xsl:choose>
<xsl:when test="self::i">
<xsl:value-of select="concat('##', lower-case(.), '##')"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="lower-case(.)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
The above XSLT transform, when given the following XML input document,
<title>THE TITLE OF THE BOOK WITH SOME <i>ITALICS</i> AND SOME MORE
WORDS: THE SUBTITLE OF THE BOOK WITH SOME <i>ITALICS</i></title>
produces the following result:
<result>
<title>the title of the book with some <i>italics</i> and some more
words</title>
<subtitle>the subtitle of the book with some <i>italics</i>
</subtitle>
</result>
This solution follows a two-pass approach. In the first pass, all text is
lower-cased and each <i>text</i> element is rewritten as ##text## (assuming
that the delimiter ## does not occur in the input text). In the second pass,
the resulting string is split at the first colon, each half is tokenized
on ##, and the even-numbered tokens are wrapped in <i> again.
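To make the intermediate form concrete, here is a minimal sketch that runs only the first pass and shows the string handed to the second pass. It is an illustration rather than part of the stylesheet above: the wrapper element pass1-output and the xsl:message guard are assumptions added here, and the pass1 logic is written as two separate templates instead of one xsl:choose.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml"/>

  <xsl:template match="title">
    <!-- Assumed guard: stop if the input already contains the delimiter,
         because pass 2 could not tell it apart from the <i> markers. -->
    <xsl:if test="contains(string(.), '##')">
      <xsl:message terminate="yes">Input contains '##'; choose another delimiter.</xsl:message>
    </xsl:if>
    <!-- Hypothetical wrapper element, used only to display the pass 1 result. -->
    <pass1-output>
      <xsl:apply-templates select="node()" mode="pass1"/>
    </pass1-output>
  </xsl:template>

  <!-- <i>text</i> becomes ##text##, lower-cased. -->
  <xsl:template match="i" mode="pass1">
    <xsl:value-of select="concat('##', lower-case(.), '##')"/>
  </xsl:template>

  <!-- Plain text is only lower-cased. -->
  <xsl:template match="text()" mode="pass1">
    <xsl:value-of select="lower-case(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample input, this should yield a pass1-output element whose text content is:

the title of the book with some ##italics## and some more
words: the subtitle of the book with some ##italics##

Splitting that string at the colon and tokenizing each half on ## is what puts the italic text at the even positions.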
--
Regards,
Mukul Gandhi