At 2003-06-04 14:36 -0500, RPeterson(_at_)dstsystems(_dot_)com wrote:
> I have multiple xml files that I wish to merge. An example of the files is
> below. Note that a file may contain thousands of <detail> nodes.
Since you have chosen to use XSLT, all of these nodes will need to be held
in memory as part of the source node trees.
> I have been using a master xml file to try and merge but with no success in
> keeping the clients together.
This is a grouping issue, and grouping across separate source files is very
straightforward when using the variable-based grouping method. A working
result is below. It uses <xsl:output indent="yes"/> so that the result is
easy to inspect, but the indentation a processor adds is arbitrary, so I
advise removing the <xsl:output> instruction once you have confirmed that
this is the result you want.
You'll see that it isn't very long at all.
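
The heart of the variable-based method is the "first member of each group" test. As a minimal sketch only, assuming the same peterson.xml master file and file1.xml through file3.xml, the stylesheet below isolates that test and reports how many subclients each distinct client name has. The $group variable and the text report are illustrative additions of this sketch, not part of the merge itself.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output method="text"/>

<xsl:template match="masterfile">
  <!--every client from every file named in the master file-->
  <xsl:variable name="clients" select="document(doc/@filename)/client"/>
  <xsl:for-each select="$clients">
    <!--illustrative helper: all clients sharing this client's name-->
    <xsl:variable name="group" select="$clients[name=current()/name]"/>
    <!--act only when the current client is the first of its group-->
    <xsl:if test="generate-id(.)=generate-id($group[1])">
      <xsl:value-of select="name"/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="count($group/subclient)"/>
      <xsl:text> subclient(s)&#10;</xsl:text>
    </xsl:if>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

With the three sample files this should count two subclients for client A and one for client B. The relative order of nodes that come from different documents is up to the processor, so the order of the report lines may vary.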
> Also, due to the size of the files, is using xsl:copy and xsl:copy-of the
> best way to approach this?
"Best" in what sense? Any choice here will not impact on your memory
capacity issues regarding storing the source node trees. It is in the
architecture of XSLT (much like with DOM) that the entire source node tree
of any source file be loaded in memory before any node of that tree is
accessed, and there are no guidelines for garbage collection or freeing
memory.
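
To make the comparison concrete, here is a hedged sketch of the alternative to <xsl:copy-of>: the same grouping, but copying through <xsl:apply-templates> and an identity template built on <xsl:copy>. It assumes the same master file and sample files. It is not expected to change memory use, because the source trees are loaded either way; what it adds is a place to hang extra templates should the <detail> nodes ever need changing on the way through.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:template match="masterfile">
  <merged>
    <xsl:variable name="clients" select="document(doc/@filename)/client"/>
    <xsl:for-each select="$clients">
      <xsl:if test="generate-id(.)=
                    generate-id($clients[name=current()/name])">
        <client>
          <!--push the nodes through templates instead of xsl:copy-of-->
          <xsl:apply-templates select="name"/>
          <xsl:apply-templates
               select="$clients[name=current()/name]/subclient"/>
        </client>
      </xsl:if>
    </xsl:for-each>
  </merged>
</xsl:template>

<!--identity template: copies each node and recurses into its content-->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

If no node needs to change, <xsl:copy-of> says the same thing in fewer lines, which is why the working result uses it.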
I hope this helps.
................ Ken
T:\ftemp>type peterson.xml
<masterfile>
<doc filename="file1.xml"/>
<doc filename="file2.xml"/>
<doc filename="file3.xml"/>
</masterfile>
T:\ftemp>type file1.xml
<client>
<name>A</name>
<subclient>
<name>a</name>
<detail>....</detail>
<detail>....</detail>
</subclient>
</client>
T:\ftemp>type file2.xml
<client>
<name>A</name>
<subclient>
<name>b</name>
<detail>....</detail>
<detail>....</detail>
</subclient>
</client>
T:\ftemp>type file3.xml
<client>
<name>B</name>
<subclient>
<name>c</name>
<detail>....</detail>
<detail>....</detail>
</subclient>
</client>
T:\ftemp>type peterson.xsl
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

<xsl:output indent="yes"/>

<xsl:template match="masterfile">
  <merged>
    <!--every client from every file named in the master file-->
    <xsl:variable name="clients" select="document(doc/@filename)/client"/>
    <xsl:for-each select="$clients">
      <!--act only on the first client of each same-named group;
          generate-id() of a node-set uses its first node in document order-->
      <xsl:if test="generate-id(.)=
                    generate-id($clients[name=current()/name])">
        <client>
          <xsl:copy-of select="name"/>
          <!--gather the subclients of every client in the group-->
          <xsl:copy-of select="$clients[name=current()/name]/subclient"/>
        </client>
      </xsl:if>
    </xsl:for-each>
  </merged>
</xsl:template>

</xsl:stylesheet>
T:\ftemp>saxon -o peterson.out peterson.xml peterson.xsl
T:\ftemp>type peterson.out
<?xml version="1.0" encoding="utf-8"?>
<merged>
<client>
<name>A</name>
<subclient>
<name>a</name>
<detail>....</detail>
<detail>....</detail>
</subclient>
<subclient>
<name>b</name>
<detail>....</detail>
<detail>....</detail>
</subclient>
</client>
<client>
<name>B</name>
<subclient>
<name>c</name>
<detail>....</detail>
<detail>....</detail>
</subclient>
</client>
</merged>
--
G. Ken Holman mailto:gkholman(_at_)CraneSoftwrights(_dot_)com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/
Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (F:-0995)