xsl-list

Re: [xsl] Converting milestone tags

2010-10-14 02:45:45
This class of problems is quite tricky. The most general approach is to flatten the first hierarchy, so everything is reduced to milestones, and then use positional grouping to construct the new hierarchy from the flat structure.

If you have access to a good library, try looking for Michael Jackson's 1970s books on Jackson Structured Programming, where he tackles this class of problem under the heading of "boundary conflict". The vocabulary is different - it's all about sequential processing of hierarchic files on magnetic tape - but the logic is the same, and it's the most systematic treatment I've seen. Essentially he shows that if the hierarchic structure of the input and output are in some sense congruent, then a single tree walk over the input can handle the problem, but if they aren't then you can devise a new intermediate hierarchy - perhaps very flat - that is congruent with both the input and the output, so one tree walk will get you from the input to the intermediate tree, and a second tree walk will get you from the intermediate tree to the output. (This is assuming of course that you don't have an ordering conflict, which is true in your case).
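A minimal sketch of the two-walk idea in XSLT 2.0, under assumptions that go beyond the posted example: the milestones are still direct children of p, but a start and its matching end may sit in different paragraphs. The p-start element is a hypothetical marker invented for the intermediate tree, not part of the original data. The first walk flattens the paragraphs into one sequence; the second rebuilds the paragraphs by positional grouping and wraps every run of nodes that lies between a start and an end milestone.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Walk 1: each p becomes a (hypothetical) p-start milestone
       followed by a copy of its children. -->
  <xsl:template match="p" mode="flatten">
    <p-start/>
    <xsl:copy-of select="node()"/>
  </xsl:template>

  <xsl:template match="/root">
    <!-- The flat intermediate tree: p-start, text, span milestones, ... -->
    <xsl:variable name="flat">
      <xsl:apply-templates select="p" mode="flatten"/>
    </xsl:variable>
    <root>
      <!-- Walk 2: a new paragraph begins at every p-start. -->
      <xsl:for-each-group select="$flat/node()"
          group-starting-with="p-start">
        <p>
          <!-- Drop the milestones and group the remaining nodes.
               Key: 0 outside any span, otherwise the number of start
               milestones seen so far, so adjacent spans stay separate. -->
          <xsl:for-each-group
              select="current-group()[not(self::p-start or self::span[@order])]"
              group-adjacent="for $s in count(preceding-sibling::span[@order='start'])
                              return if ($s gt count(preceding-sibling::span[@order='end']))
                                     then $s else 0">
            <xsl:choose>
              <xsl:when test="current-grouping-key() gt 0">
                <span><xsl:copy-of select="current-group()"/></span>
              </xsl:when>
              <xsl:otherwise>
                <xsl:copy-of select="current-group()"/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each-group>
        </p>
      </xsl:for-each-group>
    </root>
  </xsl:template>
</xsl:stylesheet>
```

Counting preceding siblings for every node is quadratic, which is acceptable for a sketch but probably not for large documents. Milestones buried inside nested markup would need a deeper flattening step than this one.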

Your example doesn't need the full generality of this approach, because the start/end milestones are always siblings and are always matched in the same paragraph, but your discussion indicates that you might want to tackle things that go beyond this example.
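For the simple case, a single pass with nested positional grouping may be enough. The following is a sketch only, assuming XSLT 2.0 and that every start milestone has a matching end milestone among the children of the same p:

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="p[span/@order]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- First split: a new group begins at each start milestone. -->
      <xsl:for-each-group select="node()"
          group-starting-with="span[@order='start']">
        <xsl:choose>
          <xsl:when test="self::span[@order='start']">
            <!-- Second split: cut the group just after the end milestone. -->
            <xsl:for-each-group select="current-group()"
                group-ending-with="span[@order='end']">
              <xsl:choose>
                <!-- The part that ends with an end milestone is the span body. -->
                <xsl:when test="current-group()[last()][self::span[@order='end']]">
                  <span>
                    <xsl:apply-templates
                      select="current-group()[not(self::span[@order=('start','end')])]"/>
                  </span>
                </xsl:when>
                <!-- Whatever follows the end milestone stays outside the span. -->
                <xsl:otherwise>
                  <xsl:apply-templates select="current-group()"/>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:for-each-group>
          </xsl:when>
          <!-- Nodes before the first start milestone are copied unchanged. -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because the content is passed through apply-templates rather than copy-of, any markup inside the "text" parts is preserved by the identity template and can be overridden later if needed.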

Michael Kay
Saxonica

On 14/10/2010 8:05 AM, Josef Schneeberger wrote:
Hi everybody,

I am new to this list and apologize if my question is an FAQ. I scanned
the archives but did not find a solution. The question arises in a TEI
project where we have to switch from a chapter hierarchy to a
page-oriented form. The XSLT is done in multiple steps (a Cocoon pipeline)
and I use Saxon 9.

Here is a simplified example of an infile:

<root>
  <p>text<span order="start"/>text<span order="end"/>  text</p>
  <p>text<span order="start"/>text<span order="end"/>  text
     text<span order="start"/>text<span order="end"/>  text</p>
  <p>text text text<span order="start"/>text<span order="end"/></p>
  <p><span order="start"/>text<span order="end"/>  text text text</p>
</root>

which should result in the following output:

<root>
  <p>text<span>text</span>  text</p>
  <p>text<span>text</span>  text
     text<span>text</span>  text</p>
  <p>text text text<span>text</span></p>
  <p><span>text</span>  text text text</p>
</root>

There may be an arbitrary number of <span order="start"/> (and
corresponding end milestone tags) in a p element. Furthermore, any
"text" node may again contain markup which should be preserved in the
output. I tried various approaches but I failed. Here is one of my
attempts using sibling recursion ...

<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
   <xsl:apply-templates/>
  </xsl:template>
        
  <xsl:template match="root">
   <root><xsl:apply-templates/></root>
  </xsl:template>
        
  <xsl:template match="p">
   <p>
    <xsl:apply-templates select="child::node()" mode="procp"/>
   </p>
  </xsl:template>
        
  <xsl:template match="span[@order='start']" mode="procp">
   <span>
    <xsl:apply-templates
      select="following-sibling::node()[1][not(self::span)]"
      mode="procp"/>
   </span>
   <xsl:apply-templates select="following-sibling::node()[1]"/>
  </xsl:template>
        
  <xsl:template match="node()" mode="procp">
   <xsl:copy-of select="."/>
    <xsl:apply-templates
       select="following-sibling::node()[1][not(self::span)]"
       mode="procp"/>
  </xsl:template>
</xsl:stylesheet>

Any help would be greatly appreciated. Josef



