Fwd: Re: [xsl] breaking up XML on page break element

2014-07-05 11:14:50
[RETRY 4 - apologies for the duplicates; I'm trying to configure my sending email address to the list; this was first sent to Geert on Friday]

Date: Fri, 04 Jul 2014 13:15:53 -0400
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
From: "G. Ken Holman" <gkholman(_at_)CraneSoftwrights(_dot_)com>
Subject: Re: [xsl] breaking up XML on page break element

At 2014-07-04 10:56 -0400, I wrote:
> At 2014-07-04 14:43 +0000, Geert Bormans geert(_at_)gbormans(_dot_)telenet(_dot_)be wrote:
>> Now I want to break the document per page, reconstructing the structure
>
> Use "<<" and ">>" to determine those elements that are before, after and in-between page breaks, then recreate the document structure up to those points.

My solution does not offer the advanced features that Gerrit suggested, nor is it as elegant a solution as he describes ... it is more brute force and probably slower in execution. It is simply a modification of the identity transform using ">>" and "<<", but I think it is giving the correct results.
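
As a smaller illustration of the two node-order operators on their own (this is only a sketch, not the stylesheet posted below; the variable names and the text output format are assumptions made for the example), the following XSLT 2.0 stylesheet lists, for each <pb/>, the non-blank text nodes that come after the previous <pb/> and before the current one. It does not report the text following the last <pb/>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//pb">
      <!--the page break that closes this page-->
      <xsl:variable name="before" select="."/>
      <!--the page break that opened this page (empty for the first page)-->
      <xsl:variable name="after" select="preceding::pb[1]"/>
      <xsl:text>page: </xsl:text>
      <!--"<<" is true when the left node precedes the right node in
          document order; ">>" is true when it follows it-->
      <xsl:value-of select="//text()[normalize-space()]
                              [. &lt;&lt; $before]
                              [empty($after) or . >> $after]"
                    separator=" | "/>
      <xsl:text>&#xa;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run against geert.xml, this should print one line per page break, which is a quick way to check which text ends up on which page before worrying about rebuilding the element structure.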

I hope it helps.

. . . . . . . . Ken

t:\ftemp>type geert.xml
<?xml version="1.0" encoding="UTF-8"?>
<book>
  <title>...</title>
  <section no="1">
    <para>aaa<pb/>bbb</para>
  </section>
  <section no="2">
    <para>2-1</para>
    <para>ccc<pb/>ddd<pb/>eee</para>
    <para>2-3</para>
    <subsection>
      <para>fff<pb/>ggg<span>hhh<pb/>jjj</span>kkk<pb/>mmm</para>
    </subsection>
    <pb/>
    <para>2-5</para>
  </section>
  <section>
    <para>No breaks</para>
  </section>
</book>

t:\ftemp>call xslt2 geert.xml geert.xsl geert.out.xml

t:\ftemp>type geert.out.xml
<?xml version="1.0" encoding="UTF-8"?><book>
  <title>...</title>
  <section no="1">
    <para>aaa</para></section>
<pb/>
<section no="1"><para>bbb</para>
  </section><section no="2">
    <para>2-1</para>
    <para>ccc</para></section>
<pb/>
<section no="2"><para>ddd</para></section>
<pb/>
<section no="2"><para>eee</para>
    <para>2-3</para>
    <subsection>
      <para>fff</para></subsection></section>
<pb/>
<section no="2"><subsection><para>ggg<span>hhh</span></para></subsection></section>
<pb/>
<section no="2"><subsection><para><span>jjj</span>kkk</para></subsection></section>
<pb/>
<section no="2"><subsection><para>mmm</para>
    </subsection>
    </section>
<pb/>
<section no="2">
    <para>2-5</para>
  </section><section>
    <para>No breaks</para>
  </section>
</book>
t:\ftemp>type geert.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xsd"
  version="2.0">

<xsl:template match="book">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <!--everything before the first section-->
    <xsl:copy-of select="node()[. &lt;&lt; current()/section[1] ]"/>
    <!--prepare to revisit all sections-->
    <xsl:variable name="sections" select="section"/>
    <xsl:choose>
     <!--simply copy the content if there are no page-breaks-->
     <xsl:when test="empty(.//pb)">
       <xsl:copy-of select="$sections"/>
     </xsl:when>
     <xsl:otherwise>
       <!--copy down from the apex repeatedly for each pb-->
       <xsl:for-each select=".//pb">
         <xsl:apply-templates select="$sections" mode="section">
          <xsl:with-param tunnel="yes" name="after" select="preceding::pb[1]"/>
          <xsl:with-param tunnel="yes" name="before" select="."/>
          <xsl:with-param tunnel="yes" name="ancestors"
                          select="ancestor::* | preceding::pb[1]/ancestor::*"/>
         </xsl:apply-templates>
         <xsl:text>&#xa;</xsl:text>
         <pb/>
         <xsl:text>&#xa;</xsl:text>
       </xsl:for-each>
       <!--copy everything after the last page break-->
       <xsl:apply-templates select="$sections" mode="section">
        <xsl:with-param tunnel="yes" name="after" select="(.//pb)[last()]"/>
        <xsl:with-param tunnel="yes" name="ancestors"
                        select="(.//pb)[last()]/ancestor::*"/>
       </xsl:apply-templates>
     </xsl:otherwise>
    </xsl:choose>
    <!--everything after the last section-->
    <xsl:copy-of select="node()[. >> current()/section[last()] ]"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="@*|node()" mode="#default section">
  <xsl:param tunnel="yes" name="after"/>
  <xsl:param tunnel="yes" name="before"/>
  <xsl:param tunnel="yes" name="ancestors" as="element()*"/>
  <xsl:choose>
    <xsl:when test="( . intersect $ancestors ) or
                    ( not($after) and not($before) ) or
                    ( not($after) and . &lt;&lt; $before ) or
                    ( . >> $after and not($before)  ) or
                    ( . >> $after and . &lt;&lt; $before )">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:when>
    <xsl:otherwise>
      <xsl:apply-templates/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>
t:\ftemp>rem Done!
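
The five-way test in the last template can probably be written more compactly. The sketch below is a hypothetical drop-in replacement for that template inside geert.xsl (it is not part of the posted stylesheet and has not been run against the sample). It relies on the fact that the four $after/$before combinations factor into "no lower bound, or after it" and "no upper bound, or before it".

```xml
<xsl:template match="@*|node()" mode="#default section">
  <xsl:param tunnel="yes" name="after"/>
  <xsl:param tunnel="yes" name="before"/>
  <xsl:param tunnel="yes" name="ancestors" as="element()*"/>
  <xsl:choose>
    <!--copy the node when it is an ancestor of either bounding page
        break, or when it lies inside the (possibly open-ended) range-->
    <xsl:when test="( . intersect $ancestors ) or
                    ( ( not($after)  or . >> $after ) and
                      ( not($before) or . &lt;&lt; $before ) )">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:when>
    <xsl:otherwise>
      <!--outside the range: skip the node but keep descending-->
      <xsl:apply-templates/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
```

Expanding the conjunction gives back exactly the four original alternatives, so the result should be the same; the shorter form just makes the "between two page breaks" intent easier to see.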