xsl-list

[xsl] Best approach for writing an XML log whilst processing/writing other XML documents?

2010-08-13 08:29:33
Hi!

I'm facing a problem for which I have found no elegant solution so far.

In summary, I have to process a number of XML files, replace some data
in them and re-write these files into copies. Each XML file can
reference other XML files that also need processing in the same way.
There is no limit to the depth of the dependency links, but the
dependencies are guaranteed to form a tree, not a general graph (i.e.
there are no cycles that would lead to infinite processing).

Here is a simplified example:

== doc1.xml  (the starting point, input to the XSLT process) ==
<doc>
    <node replace="id1">text1</node>
    <node>another piece of text</node>
    <ext-doc href="child-doc2.xml"/>
    <node replace="id2">text2</node>
</doc>

== child-doc2.xml ==
<doc>
    <node replace="id3">text3</node>
    <ext-doc href="child-doc3.xml"/>
</doc>

== child-doc3.xml ==
<doc>
    <node replace="id4">text4</node>
</doc>

The aim is to replace the content of //node[exists(@replace)] with
some other value obtained from a lookup table. For the sake of
simplicity, the stylesheet below simply prefixes the content with
"MOD-", as the lookup itself is not the focus of my problem.

This is all quite simple, and I achieved it easily in XSLT 2.0 by
calling xsl:result-document recursively:

== publish.xsl ==
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    version="2.0">
        
    <xsl:output method="xml" indent="yes"/>

    <xsl:variable name="source" select="."/>

    <xsl:template match="/">
        <xsl:call-template name="copy-file-and-replace">
            <xsl:with-param name="output-name">target.xml</xsl:with-param>
            <xsl:with-param name="doc" select="$source"/>
        </xsl:call-template>
    </xsl:template>

    <xsl:template name="copy-file-and-replace">
        <xsl:param name="doc"/>
        <xsl:param name="output-name"/>

        <xsl:result-document method="xml" href="{$output-name}">
            <xsl:apply-templates select="$doc" mode="replace"/>
        </xsl:result-document>
    </xsl:template>

    <!-- MODE: replace -->

    <xsl:template mode="replace" match="*[@replace]">
        <xsl:copy>
            <xsl:copy-of select="@* except @replace"/>
            <xsl:text>MOD-</xsl:text>
            <xsl:value-of select="text()"/>
            <xsl:apply-templates mode="#current" select="*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template mode="replace" match="*|text()">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="#current"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template mode="replace" match="ext-doc">
        <xsl:copy-of select="."/>

        <!-- recurse over that file -->
        <xsl:call-template name="copy-file-and-replace">
            <xsl:with-param name="doc" select="doc(@href)"/>
            <xsl:with-param name="output-name"
                select="concat(generate-id(), @href)"/>
        </xsl:call-template>
    </xsl:template>

</xsl:stylesheet>

Now comes the problem. As I do this processing, I need to collect
some information that allows me to report on the process: the
dependency tree and the replacements that were made. I'd like this
output to be XML and, for the example above, to look something like

== report.xml ==
<report>
    <output file="target.xml">
        <replaced from="text1" to="MOD-text1"/>
        <output file="d1e12child-doc2.xml">
            <replaced from="text3" to="MOD-text3"/>
            <output file="d1e15child-doc3.xml">
                <replaced from="text4" to="MOD-text4"/>
            </output>
        </output>
        <replaced from="text2" to="MOD-text2"/>
    </output>
</report>

Unfortunately, I cannot find a way to generate the two in parallel
(i.e. the copies of the original files and the report), since any new
nodes created in the mode="replace" templates would obviously go into
the copied files, not the report.
The only way I can think of is a 2-pass algorithm: first do all the
copying (mode=replace), then go through it all again to produce the
report (mode=report). I hope there is another way, particularly one
that avoids going through all the dependency files twice.
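
To make the 2-pass idea concrete, here is a minimal sketch of what
the second pass (mode=report) might look like. It is an illustration
only: the file name report.xsl and the named template "report-file"
are assumptions, and the "MOD-" prefix again stands in for the real
lookup. It is shown as a standalone stylesheet run against doc1.xml;
since generate-id() values are not guaranteed to be the same from one
transformation to the next, these templates would probably have to be
merged into publish.xsl and run in the same transformation for the
file names in the report to match the copies actually written.

== report.xsl (sketch of the second pass, names are assumptions) ==
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

    <xsl:output method="xml" indent="yes"/>

    <!-- Start the report at the input document -->
    <xsl:template match="/">
        <report>
            <xsl:call-template name="report-file">
                <xsl:with-param name="doc" select="."/>
                <xsl:with-param name="output-name" select="'target.xml'"/>
            </xsl:call-template>
        </report>
    </xsl:template>

    <!-- One <output> element per processed file -->
    <xsl:template name="report-file">
        <xsl:param name="doc"/>
        <xsl:param name="output-name"/>
        <output file="{$output-name}">
            <xsl:apply-templates select="$doc" mode="report"/>
        </output>
    </xsl:template>

    <!-- MODE: report -->

    <!-- Record each replacement (same "MOD-" stand-in as publish.xsl) -->
    <xsl:template mode="report" match="*[@replace]">
        <replaced from="{text()}" to="MOD-{text()}"/>
        <xsl:apply-templates mode="#current" select="*"/>
    </xsl:template>

    <!-- Recurse into the referenced file, nesting its <output> -->
    <xsl:template mode="report" match="ext-doc">
        <xsl:call-template name="report-file">
            <xsl:with-param name="doc" select="doc(@href)"/>
            <xsl:with-param name="output-name"
                select="concat(generate-id(), @href)"/>
        </xsl:call-template>
    </xsl:template>

    <!-- Text nodes contribute nothing to the report -->
    <xsl:template mode="report" match="text()"/>

</xsl:stylesheet>

The drawback this sketch makes visible is the one described above:
every dependency file is traversed a second time, and the replacement
logic is duplicated between the two modes.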

Could anyone give me a clue on this?

--
Fabre Lambeau
