Re: [xsl] problem with transforming mixed content
2020-08-15 12:46:59
The real difficulties emerge when someone, maybe a word processor, puts
the colon in an i element (or other arbitrarily deeply nested markup)
while it is still supposed to serve as a separator. Then my "upward
projection" solution will still work. And it's easily configurable,
thanks to xsl:evaluate.
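To make the nested case concrete, here is a minimal sketch. It is not the "upward projection" code itself and does not use xsl:evaluate. It assumes an earlier pass has already replaced the colon with an empty title-separator element (the marker name used in the stylesheet quoted below), possibly deep inside an i. The mode names "before" and "after" are invented for this illustration.

```xml
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Both modes copy whatever the filtering templates below let through. -->
  <xsl:mode name="before" on-no-match="shallow-copy" />
  <xsl:mode name="after" on-no-match="shallow-copy" />

  <!-- Assumption: the marker is already in place somewhere inside title. -->
  <xsl:template match="title[descendant::title-separator]">
    <xsl:variable name="sep" as="element()"
      select="(descendant::title-separator)[1]" />
    <title>
      <xsl:apply-templates mode="before">
        <xsl:with-param name="sep" select="$sep" tunnel="yes" />
      </xsl:apply-templates>
    </title>
    <subtitle>
      <xsl:apply-templates mode="after">
        <xsl:with-param name="sep" select="$sep" tunnel="yes" />
      </xsl:apply-templates>
    </subtitle>
  </xsl:template>

  <!-- Keep nodes that start before the marker; that includes its ancestors. -->
  <xsl:template match="node()" mode="before">
    <xsl:param name="sep" as="element()" tunnel="yes" />
    <xsl:if test=". &lt;&lt; $sep">
      <xsl:next-match />
    </xsl:if>
  </xsl:template>

  <!-- Keep nodes that follow the marker, plus the marker's ancestors. -->
  <xsl:template match="node()" mode="after">
    <xsl:param name="sep" as="element()" tunnel="yes" />
    <xsl:if test=". >> $sep or exists($sep/ancestor::* intersect .)">
      <xsl:next-match />
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

For an input such as <title>A <i>B<title-separator/>C</i> D</title>, this should produce <title>A <i>B</i></title> followed by <subtitle><i>C</i> D</subtitle>: the i that contains the marker is reopened on both sides of the split.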
But I agree with you and Dimitre. A one-off solution that isn't
unnecessarily complex is often more appropriate and instructive.
Maybe Tommie can include in the xsl-list instructions something along
these lines:
"If you ask for a solution, indicate whether you prefer just a sketch or
actual code. And if you ask for code, indicate whether it should only
narrowly cover the presented use case or whether it should be robust
against complexities that are present in the real-life input. If the
input that you give is simplified (which it usually is), try to indicate
in which regards the actual input is different."
Gerrit
On 15.08.2020 18:43, Graydon graydon(_at_)marost(_dot_)ca wrote:
On Sat, Aug 15, 2020 at 04:03:26PM -0000, Wolfhart Totschnig
wolfhart(_dot_)totschnig(_at_)mail(_dot_)udp(_dot_)cl scripsit:
And thank you, Michael, for the detailed explanation of possible approaches
to the problem. Graydon's solution will work very well in my case, I think,
since I can test for most error-producing conditions before applying the
code and the probability of further errors seems sufficiently low in my
context and for my purposes. But I am still curious: What would an approach
of type (a) look like in my case? It seems to me that implementing this
approach would again face the original problem: "turning the punctuation
into markup" sounds like a description of the original problem.
I tend to make a distinction between conversion code -- I'm going to do
this once, for this exact data set -- and production code -- the code
has to go deal with the world indefinitely. Approach a) is definitely
more like production code than approach b).
Dr. Kay is completely correct that the approach b) solution I provided
has a lot of ways to fail. In a one-time data conversion context, when
I've already used XQuery to find out exactly what's in there, I wouldn't
worry about that. I'm trying to minimize the effort required to write
the one-use conversion code.
In a production context -- the code must go fend for itself -- approach
a) wins.
For your example, approach a) would look like:
<xsl:stylesheet exclude-result-prefixes="xs math xd" version="3.0"
  xmlns:math="http://www.w3.org/2005/xpath-functions/math"
  xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xd:doc scope="stylesheet">
    <xd:desc>
      <xd:p><xd:b>Created on:</xd:b> Aug 15, 2020</xd:p>
      <xd:p><xd:b>Author:</xd:b> graydon</xd:p>
      <xd:p />
    </xd:desc>
  </xd:doc>
  <xsl:variable as="element(title)+" name="test">
    <title>THE TITLE OF THE BOOK WITH SOME <i>ITALICS</i> AND SOME MORE WORDS: THE SUBTITLE OF THE BOOK WITH SOME <i>ITALICS</i></title>
  </xsl:variable>
  <xd:doc>
    <xd:desc>test</xd:desc>
  </xd:doc>
  <xsl:template name="xsl:initial-template">
    <bucket>
      <xsl:for-each select="$test">
        <xsl:variable as="element()" name="temp1">
          <xsl:apply-templates mode="marker" select="." />
        </xsl:variable>
        <xsl:variable as="element()+" name="temp2">
          <xsl:apply-templates mode="split" select="$temp1" />
        </xsl:variable>
        <xsl:sequence select="$temp2" />
      </xsl:for-each>
    </bucket>
  </xsl:template>
  <xsl:mode name="marker" on-no-match="shallow-copy" />
  <xsl:mode name="split" on-no-match="shallow-copy" />
  <xsl:mode name="recase" on-no-match="shallow-copy" />
  <xd:doc>
    <xd:desc>place the separator marker element; this can get much more involved if
      you aren't certain that each title has exactly one colon, in a single text
      node</xd:desc>
  </xd:doc>
  <xsl:template match="text()[contains(., ':')]" mode="marker">
    <xsl:value-of select="substring-before(., ':')" />
    <title-separator />
    <xsl:value-of select="substring-after(., ':') => replace('^\p{Zs}+', '')" />
  </xsl:template>
  <xd:doc>
    <xd:desc>divide title into title and subtitle; assumes title-separator is a
      direct child of title. Children are selected rather than descendants so
      that text inside i is not processed twice (once through the shallow copy
      of i and once directly)</xd:desc>
  </xd:doc>
  <xsl:template match="title" mode="split">
    <title>
      <xsl:apply-templates mode="recase"
        select="node()[following-sibling::title-separator]" />
    </title>
    <subtitle>
      <xsl:apply-templates mode="recase"
        select="node()[preceding-sibling::title-separator]" />
    </subtitle>
  </xsl:template>
  <xd:doc>
    <xd:desc>lower case all the text nodes in title</xd:desc>
  </xd:doc>
  <xsl:template match="text()" mode="recase">
    <xsl:sequence select="lower-case(.)" />
  </xsl:template>
</xsl:stylesheet>
You could add more <title/> elements to $test and test all of them as
you find problematic title elements in the content set.
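As a sketch of what that might look like, the $test variable could be extended as follows. The second and third titles are invented for illustration and are not taken from the actual content set.

```xml
<xsl:variable as="element(title)+" name="test"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- the original test case -->
  <title>THE TITLE OF THE BOOK WITH SOME <i>ITALICS</i> AND SOME MORE WORDS: THE SUBTITLE OF THE BOOK WITH SOME <i>ITALICS</i></title>
  <!-- hypothetical: no inline markup at all -->
  <title>A SHORT TITLE: A SHORT SUBTITLE</title>
  <!-- hypothetical: italics only before the colon -->
  <title>A TITLE WITH <i>ITALICS</i>: A PLAIN SUBTITLE</title>
</xsl:variable>
```

The xsl namespace is redeclared here only so that the fragment is well-formed on its own; inside the stylesheet the declaration on xsl:stylesheet already covers it. Since the initial template iterates over $test, each added title yields its own title and subtitle pair in the bucket element.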