
Re: Transforming XML Blockquotes - Mixed Content - XSLT 2.0 Solution

2005-04-14 09:47:10
Here's an XSLT 2.0 solution to the problem. It uses two stylesheets 
(though they could be combined with a bit more effort). The algorithm goes 
like this:
First, chunk up all the bits, tagging each chunk with the number of 
blockquotes that precede it. That turns this into a grouping problem.
Second, solve the grouping problem.

Given the following XML file:

<doc>
  <paragraph num="1">Yadda Yadda Yadda <italic>Italic Yadda</italic> 
Yadda: <blockquote>Blah Blah Blah Blah</blockquote> Yackity Yack 
Yack</paragraph>
</doc>

Use this XSL file:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <x>
      <xsl:apply-templates/>
    </x>
  </xsl:template>

  <xsl:template match="paragraph">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="paragraph/text()">
    <p num="{../@num}" group="{count(preceding-sibling::blockquote)}">
      <xsl:value-of select="."/>
    </p>
  </xsl:template>

  <xsl:template match="blockquote">
    <blockquote><xsl:apply-templates/></blockquote>
  </xsl:template>

  <xsl:template match="italic">
    <p num="{../@num}" group="{count(preceding-sibling::blockquote)}">
      <span style="font-style:italic"><xsl:apply-templates/></span>
    </p>
  </xsl:template>

</xsl:stylesheet>

That creates the following chunks:

<?xml version="1.0" encoding="UTF-8"?>
<x>
  <p num="1" group="0">Yadda Yadda Yadda </p>
  <p num="1" group="0"><span style="font-style:italic">Italic Yadda</span></p>
  <p num="1" group="0"> Yadda: </p>
  <blockquote>Blah Blah Blah Blah</blockquote>
  <p num="1" group="1"> Yackity Yack Yack</p>
</x>

Now it's a grouping problem, which can be solved in XSLT 2.0 with this 
stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="mixed" match="p" use="@group"/>

  <xsl:template match="x">
    <html>
      <head>
        <title>Paragraph Chunking Test</title>
      </head>
      <body>
        <xsl:for-each-group select="p" group-by="@group">
          <p>
            <xsl:for-each select="../p[@group=current-grouping-key()]">
              <xsl:apply-templates/>
            </xsl:for-each>
          </p>
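          <!-- Emits every blockquote that follows any member of the group,
               which is fine for the sample (one blockquote) but would repeat
               blockquotes if a paragraph held several. -->
          <xsl:apply-templates select="current-group()/following-sibling::blockquote"/>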
        </xsl:for-each-group>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="p"/>

  <xsl:template match="blockquote">
    <xsl:copy-of select="."/>
  </xsl:template>

  <xsl:template match="span">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>

which yields:

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
 
      <title>Paragraph Chunking Test</title>
   </head>
   <body>
      <p>Yadda Yadda Yadda <span style="font-style:italic">Italic 
Yadda</span> Yadda: 
      </p>
      <blockquote>Blah Blah Blah Blah</blockquote>
      <p> Yackity Yack Yack</p>
   </body>
</html>
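
As for combining the two stylesheets: one possible way is to skip the 
chunking pass and let xsl:for-each-group split each paragraph's child nodes 
into adjacent runs of "blockquote" and "everything else". The following is 
only a sketch and has not been run through Saxon 8.4. The choice of 
group-adjacent with a boolean key, the match="doc" template, and the 
xsl:output declaration are assumptions of this sketch, not part of the two 
stylesheets above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="doc">
    <html>
      <head>
        <title>Paragraph Chunking Test</title>
      </head>
      <body>
        <xsl:apply-templates select="paragraph"/>
      </body>
    </html>
  </xsl:template>

  <!-- Group the paragraph's children into adjacent runs. The grouping key
       is true for a run of blockquotes and false for a run of anything
       else (text, italic, and so on). -->
  <xsl:template match="paragraph">
    <xsl:for-each-group select="node()"
                        group-adjacent="boolean(self::blockquote)">
      <xsl:choose>
        <!-- A run of blockquotes: emit them outside any <p>. -->
        <xsl:when test="current-grouping-key()">
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <!-- A run of inline content: wrap the whole run in one <p>. -->
        <xsl:otherwise>
          <p>
            <xsl:apply-templates select="current-group()"/>
          </p>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="blockquote">
    <blockquote><xsl:apply-templates/></blockquote>
  </xsl:template>

  <xsl:template match="italic">
    <span style="font-style:italic"><xsl:apply-templates/></span>
  </xsl:template>

</xsl:stylesheet>
```

Text nodes fall through to the built-in template, so no num or group 
attributes are needed. One caveat: whitespace-only text between two 
blockquotes would come out as a <p> containing only whitespace, so real 
input may need an extra test in the xsl:otherwise branch.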

I have not yet worked out an XSLT 1.0 grouping solution for this problem (I 
don't have much time to spend on it). I am sending along the XSLT 
2.0 solution (which took me perhaps 10 minutes to write; I love 
xsl:for-each-group) just to show one workable (IMHO) way to solve the 
problem. James Fuller has shown us another, but I think there's value in 
multiple approaches.

I tested it all with Saxon 8.4, by the way.

Jay Bryant
Bryant Communication Services
(presently consulting at Synergistic Solution Technologies)
