Hello Chris,
Chris Loschen wrote:

> I have some mixed content paragraphs which I need to break up into
> multiple paragraphs, each one beginning with a specified string. See my
> sample input and desired output below. (I'll clean up some of the other
> cruft like the extra br elements later.) I tried to set up some
> templates (which I've included below) for this based on the example in
> Michael Kay's XSLT 2nd ed., p. 550. However, Mr. Kay's script was
> dealing with a pure string of #PCDATA and I've got nodes mixed in, so
> I'm getting an error that I cannot "convert #STRING to a NodeList!" (If
> I've misunderstood any of that, please let me know.) What am I doing
> wrong? Or do I need to totally rethink my approach, because
> "substring-before" and "substring-after" are meant to work on strings,
> not mixed content?

Yes! It's XML: a tree of nodes and elements, not simply a text string.
substring-before() and substring-after() only see the string value of the
paragraph, so the markup inside it is lost.

> If that's it, is there a way to accomplish what I need to do?

It's a grouping problem.
Using this XSLT:
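<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every child of the root p by the id of the node that starts its group -->
  <xsl:key name="nodes" match="/p/node()"
           use="generate-id((../node()[1] |
                (preceding-sibling::span | self::span)
                  [starts-with(., '&(!!char1!!);')])[last()])"/>

  <!-- One group per starting node: the first child plus every 'char1' span -->
  <xsl:template match="/p">
    <root>
      <xsl:apply-templates select="node()[1] |
          span[starts-with(., '&(!!char1!!);')]" mode="start-group"/>
    </root>
  </xsl:template>

  <!-- Copy all nodes keyed to this starting node into a new paragraph -->
  <xsl:template match="node()" mode="start-group">
    <p class="extract-9">
      <xsl:copy-of select="key('nodes', generate-id())"/>
    </p>
  </xsl:template>

</xsl:stylesheet>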
I get output that looks at least similar to your desired output.
Each node is grouped under its nearest preceding-sibling span that
starts with 'char1'. If the current node is itself a 'char1' span, it
must use itself rather than a preceding sibling. And because the first
node may not be a 'char1' span, you have to add it by hand as the start
of the first group. A union is always in document order, so [last()]
picks the nearest of these candidates. This results in
(../node()[1] |
(preceding-sibling::span | self::span)
[starts-with(.,'&(!!char1!!);')])
[last()]
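
For comparison only: if an XSLT 2.0 processor happens to be available
(an assumption, since the stylesheet above is plain XSLT 1.0), the same
grouping could probably be sketched with xsl:for-each-group and
group-starting-with instead of a key. This is a different technique, not
the keyed approach shown above. The '&(!!char1!!);' marker is copied
verbatim from the thread and stands in for whatever character the real
data contains.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/p">
    <root>
      <!-- A new group begins at each span starting with the marker.
           The nodes before the first such span form the first group. -->
      <xsl:for-each-group select="node()"
          group-starting-with="span[starts-with(., '&(!!char1!!);')]">
        <p class="extract-9">
          <xsl:copy-of select="current-group()"/>
        </p>
      </xsl:for-each-group>
    </root>
  </xsl:template>

</xsl:stylesheet>
```

Here the processor tracks the group boundaries itself, so the
(...)[last()] lookup is not needed. With an XSLT 1.0 processor the keyed
version remains the way to go.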
Everything clear?
Regards,
Joerg
Thanks again for all your previous assistance, and thank you in advance
for helping me with this one.
SAMPLE INPUT:
<p class="extract-9"><span class="extract-7"><b>Did You
Know?</b></span>How to read a car ad:<span>&(!!char1!!);</span>
<i>Low mileage</i> means <i>the odometer doesn’t work<br
/></i><span>&(!!char1!!);</span> <i>All original</i> means <i>needs
new everything<br /></i><span>&(!!char1!!);</span> <i>Health forces
sale</i> means <i>I’m sick of this car<br
/></i><span>&(!!char1!!);</span> <i>Must see</i> means <i>I
won’t put anything in writing<br
/></i><span>&(!!char1!!);</span> <i>Runs like a top</i> means
<i>wobbles when driven slowly<br /></i><span>&(!!char1!!);</span>
<i>Mint</i> means <i>there’s an old roll of Lifesavers under the
seat<br /></i><span>&(!!char1!!);</span> <i>Rare</i> means <i>most
examples of this model fell apart long ago</i></p>
DESIRED OUTPUT:
<p class="extract-9"><span class="extract-7"><b>Did You
Know?</b></span>How to read a car ad:</p>
<p class="extract-9"><span>&(!!char1!!);</span> <i>Low mileage</i>
means <i>the odometer doesn’t work<br /></i></p>
<p class="extract-9"><span>&(!!char1!!);</span> <i>All original</i>
means <i>needs new everything<br /></i></p>
<p class="extract-9"><span>&(!!char1!!);</span> <i>Health forces
sale</i> means <i>I’m sick of this car<br /></i></p>
<p class="extract-9"><span>&(!!char1!!);</span> <i>Must see</i>
means <i>I won’t put anything in writing<br /></i></p>
<p class="extract-9"><span>&(!!char1!!);</span> <i>Runs like a
top</i> means <i>wobbles when driven slowly<br /></i></p>
<p class="extract-9"><span>&(!!char1!!);</span> <i>Mint</i> means
<i>there’s an old roll of Lifesavers under the seat<br /></i></p>
<p class="extract-9"><span>&(!!char1!!);</span> <i>Rare</i> means
<i>most examples of this model fell apart long ago</i></p>
<xsl-code snipped="true"/>
--Chris
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list