Re: Dealing with breaking out mixed content

2002-12-07 08:03:26
Hello Chris,

Chris Loschen wrote:
I have some mixed content paragraphs which I need to break up into multiple paragraphs, each one beginning with a specified string. See my sample input and desired output below. (I'll clean up some of the other cruft like the extra br elements later.) I tried to set up some templates (which I've included below) for this based on the example in Michael Kay's XSLT 2nd ed., p. 550. However, Mr. Kay's script was dealing with a pure string of #PCDATA and I've got nodes mixed in, so I'm getting an error that I cannot "convert #STRING to a NodeList!" (If I've misunderstood any of that, please let me know.) What am I doing wrong? Or do I need to totally rethink my approach, because "substring-before" and "substring-after" are meant to work on strings, not mixed content?

Yes! substring-before() and substring-after() work on strings, but what you have is XML: a tree of element and text nodes, not simply a text string.

If that's it, is there a way to accomplish what I need to do?

It's a grouping problem.

Using this XSLT:

<xsl:key name="nodes" match="/p/node()" use="generate-id((../node()[1] |
    (preceding-sibling::span|self::span)[starts-with(.,
        '&amp;(!!char1!!);')])[last()])"/>

<xsl:template match="/p">
    <root>
        <xsl:apply-templates
            select="node()[1] | span[starts-with(., '&amp;(!!char1!!);')]"
            mode="start-group"/>
    </root>
</xsl:template>

<xsl:template match="node()" mode="start-group">
    <p class="extract-9">
        <xsl:copy-of select="key('nodes', generate-id())"/>
    </p>
</xsl:template>

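I get output that looks at least similar to your desired output. Each node is grouped under the nearest preceding-sibling span that starts with the 'char1' marker. If the current node is itself such a span, it must not be grouped under a preceding sibling but under itself, hence the self::span in the union. And because the first node apparently need not be a 'char1' span, you must add it by hand with ../node()[1], so that the leading nodes form a group of their own. The union is in document order, so [last()] picks the closest group start. This results in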

(../node()[1] |
    (preceding-sibling::span | self::span)
        [starts-with(.,'&amp;(!!char1!!);')])
[last()]
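
To try this out, the key and the two templates can be wrapped into a complete stylesheet, roughly like the sketch below. The xsl:stylesheet wrapper, the xsl:output settings and the comments are assumptions added for illustration; the key and the templates are the ones shown above. It also assumes that p is the document element, as the /p patterns do.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Assumption: plain XML output, no re-indentation, so that the
         whitespace inside the mixed content is left alone. -->
    <xsl:output method="xml" indent="no"/>

    <!-- Index every child node of the root p under the id of its group
         start: the nearest marker span at or before it, or the first
         child of p if no marker span precedes it. The marker string is
         repeated literally because XSLT 1.0 does not allow variable
         references in the match or use attribute of xsl:key. -->
    <xsl:key name="nodes" match="/p/node()"
        use="generate-id((../node()[1] |
            (preceding-sibling::span | self::span)
                [starts-with(., '&amp;(!!char1!!);')])[last()])"/>

    <!-- One output paragraph per group start. -->
    <xsl:template match="/p">
        <root>
            <xsl:apply-templates
                select="node()[1] | span[starts-with(., '&amp;(!!char1!!);')]"
                mode="start-group"/>
        </root>
    </xsl:template>

    <!-- Copy all nodes that belong to the current group start. -->
    <xsl:template match="node()" mode="start-group">
        <p class="extract-9">
            <xsl:copy-of select="key('nodes', generate-id())"/>
        </p>
    </xsl:template>

</xsl:stylesheet>
```

Note that the sample input only parses if the &rsquo; entity is declared in a DTD or replaced by a character reference such as &#8217;, because XML predefines only five entities. The class value extract-9 is hard-coded here, as in the template above.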

Everything clear?

Regards,

Joerg

Thanks again for all your previous assistance, and thank you in advance for helping me with this one.


SAMPLE INPUT:

<p class="extract-9"><span class="extract-7"><b>Did You Know?</b></span>How to read a car ad:<span>&amp;(!!char1!!);</span> <i>Low mileage</i> means <i>the odometer doesn&rsquo;t work<br /></i><span>&amp;(!!char1!!);</span> <i>All original</i> means <i>needs new everything<br /></i><span>&amp;(!!char1!!);</span> <i>Health forces sale</i> means <i>I&rsquo;m sick of this car<br /></i><span>&amp;(!!char1!!);</span> <i>Must see</i> means <i>I won&rsquo;t put anything in writing<br /></i><span>&amp;(!!char1!!);</span> <i>Runs like a top</i> means <i>wobbles when driven slowly<br /></i><span>&amp;(!!char1!!);</span> <i>Mint</i> means <i>there&rsquo;s an old roll of Lifesavers under the seat<br /></i><span>&amp;(!!char1!!);</span> <i>Rare</i> means <i>most examples of this model fell apart long ago</i></p>

DESIRED OUTPUT:

<p class="extract-9"><span class="extract-7"><b>Did You Know?</b></span>How to read a car ad:</p> <p class="extract-9"><span>&amp;(!!char1!!);</span> <i>Low mileage</i> means <i>the odometer doesn&rsquo;t work<br /></i></p> <p class="extract-9"><span>&amp;(!!char1!!);</span> <i>All original</i> means <i>needs new everything<br /></i></p> <p class="extract-9"><span>&amp;(!!char1!!);</span> <i>Health forces sale</i> means <i>I&rsquo;m sick of this car<br /></i></p> <p class="extract-9"><span>&amp;(!!char1!!);</span> <i>Must see</i> means <i>I won&rsquo;t put anything in writing<br /></i></p> <p class="extract-9"><span>&amp;(!!char1!!);</span> <i>Runs like a top</i> means <i>wobbles when driven slowly<br /></i></p> <p class="extract-9"><span>&amp;(!!char1!!);</span> <i>Mint</i> means <i>there&rsquo;s an old roll of Lifesavers under the seat<br /></i></p> <p class="extract-9"><span>&amp;(!!char1!!);</span> <i>Rare</i> means <i>most examples of this model fell apart long ago</i></p>

<xsl-code snipped="true"/>

--Chris


XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


