xsl-list

[xsl] Split into numbered files: without side-effect? (XSLT 2)

2007-09-26 08:43:48
Hi all,

Until now I have coped quite well without any side effects in XSLT,
but for the ridiculously simple task I am attempting, I just don't
see how to avoid "cheating" by shamefully falling back on
saxon:assign in XSLT 2.

I would like to split the contents of my input file into a series of
chunks, each going into a separate text file, consecutively numbered
in the order in which the chunks are created (and all of the text
nodes should go into the main result document as well). The trouble
comes from these two requirements:

a) The list of elements making up the chunks should be configurable:
   their paths from the root node are to be read from an XML file.

b) The chunks need not be rooted at the same level in the tree.

For instance, assume I have this input document:

<?xml version="1.0" encoding="iso-8859-1"?>
<root>
  <a>
    <content>text of a</content>
  </a>
  <b>
    <b1>
      <content>text of b1</content>
    </b1>
    <b2>
      <content>text of b2</content>
    </b2>
  </b>
  <c>
    <content>text of c</content>
  </c>
</root>

Another file, named chunk_starting_elements.xml, says for which
element instances to start a new chunk, by giving their local name
(this is just for simplicity's sake, in reality I am using their paths
from the root node):

<?xml version="1.0" encoding="iso-8859-1"?>
<chunks>
  <element>a</element>
  <element>b1</element>
  <element>b2</element>
  <element>c</element>
</chunks>

The desired result of the transformation is a series of 4 files named

chunk_01_a.txt
chunk_02_b1.txt
chunk_03_b2.txt
chunk_04_c.txt

each of them containing the text nodes from the corresponding element
as given in chunk_starting_elements.xml (the complete text also goes
into the main result document).

While the stylesheet below certainly does the job (provided it is
processed by Saxon), I wonder how to adapt it into pure,
side-effect-free XSLT 2. Any suggestions?

  Yves


<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:my="http://xmlns.srz.de/yforkl/xslt/functions"
  xmlns:saxon="http://saxon.sf.net/"
  extension-element-prefixes="saxon"
  exclude-result-prefixes="my xs saxon">

  <xsl:output method="text"/>

  <xsl:variable name="chunk-number" saxon:assignable="yes" select="0"/>

  <xsl:function name="my:chunk-started-by" as="node()*">
    <xsl:param name="context-element" as="node()"/>
    <xsl:sequence
      select="document('chunk_starting_elements.xml')
                /chunks/element[. = local-name($context-element)]"/>
  </xsl:function>

  <xsl:template match="*">

    <xsl:variable name="new-chunk-name"
      select="my:chunk-started-by(.)"/>

    <xsl:if test="$new-chunk-name">
      <xsl:result-document
        href="{concat('chunk_', format-number($chunk-number + 1, '00'),
                      '_', $new-chunk-name, '.txt')}">
        <saxon:assign name="chunk-number" select="$chunk-number + 1"/>
        <xsl:apply-templates/>
      </xsl:result-document>
    </xsl:if>

    <xsl:apply-templates/>

  </xsl:template>

</xsl:stylesheet>
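
One direction that might work, given here only as an untested sketch:
instead of keeping a counter, derive the chunk number from the
element's position in the input, by counting the chunk-starting
elements among its ancestors, its preceding elements and itself. This
assumes that "order in which the chunks are created" is the same as
document order of the starting elements, which holds for the
stylesheet above. The global variable chunk-config and the [1] guard
on the chunk name are additions for illustration, not part of the
original.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:my="http://xmlns.srz.de/yforkl/xslt/functions"
  exclude-result-prefixes="my xs">

  <xsl:output method="text"/>

  <!-- Assumption: load the configuration once, relative to the stylesheet -->
  <xsl:variable name="chunk-config"
    select="document('chunk_starting_elements.xml')"/>

  <!-- Returns the configuration entry matching the element, if any -->
  <xsl:function name="my:chunk-started-by" as="element()*">
    <xsl:param name="context-element" as="element()"/>
    <xsl:sequence
      select="$chunk-config/chunks/element[. = local-name($context-element)]"/>
  </xsl:function>

  <xsl:template match="*">

    <xsl:variable name="new-chunk-name"
      select="my:chunk-started-by(.)"/>

    <xsl:if test="$new-chunk-name">
      <!-- Number = chunk starters at or before this element in document
           order; no assignable variable is needed -->
      <xsl:variable name="chunk-number" as="xs:integer"
        select="count((ancestor-or-self::* | preceding::*)
                        [exists(my:chunk-started-by(.))])"/>
      <xsl:result-document
        href="{concat('chunk_', format-number($chunk-number, '00'),
                      '_', $new-chunk-name[1], '.txt')}">
        <xsl:apply-templates/>
      </xsl:result-document>
    </xsl:if>

    <!-- Text also goes to the main result document, as before -->
    <xsl:apply-templates/>

  </xsl:template>

</xsl:stylesheet>
```

For the sample input this should number a, b1, b2 and c as 01 to 04.
The count is recomputed for every chunk start, so it may become slow
on large inputs; xsl:number with level="any" and a matching count
pattern might be an alternative worth trying.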
