xsl-list

RE: [xsl] Does <xsl:copy> use a lot of memory? Is there an alternative that is more efficient?

2012-09-03 08:57:39
Michael Kay wrote:

In Saxon, and I suspect in most processors, no memory is used for the 
result tree provided that the transformation is writing directly to a 
serializer.

So if I use xsl:copy and the results of the copy are immediately output, then 
there is little or no memory consumption. Yes?

However, in my situation I need to store the results of xsl:copy into a 
variable. Then I process the variable. That processing also uses xsl:copy. I 
put those results into another variable. And again and again.

So in my situation is xsl:copy consuming lots of memory? 

In other words, if I don't immediately output the results of xsl:copy, then 
memory consumption grows and grows. Yes?
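
For concreteness, here is a minimal sketch of the multi-pass pattern described above, assuming XSLT 2.0. The mode names pass1 and pass2 are hypothetical, and the identity rule stands in for the real per-pass logic. Each variable holds a complete temporary tree, so every pass keeps another in-memory copy of the document for as long as that variable is in scope.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

    <!-- Entry point: run two passes, holding each result in a variable. -->
    <xsl:template match="/">
        <!-- Pass 1 builds a temporary tree in memory. -->
        <xsl:variable name="pass1">
            <xsl:apply-templates select="node()" mode="pass1"/>
        </xsl:variable>
        <!-- Pass 2 reads the pass 1 tree and builds a second temporary tree. -->
        <xsl:variable name="pass2">
            <xsl:apply-templates select="$pass1/node()" mode="pass2"/>
        </xsl:variable>
        <!-- Only this final copy is written to the serializer. -->
        <xsl:copy-of select="$pass2"/>
    </xsl:template>

    <!-- Identity rule shared by both (hypothetical) modes; a real
         stylesheet would add its own rules to each mode. -->
    <xsl:template match="node() | @*" mode="pass1 pass2">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*" mode="#current"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Whether the pass 1 tree can be released before the end of the transformation depends on the processor, so this sketch only shows where the intermediate trees come from.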

/Roger

-----Original Message-----
From: Michael Kay [mailto:mike(_at_)saxonica(_dot_)com] 
Sent: Sunday, September 02, 2012 10:31 AM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] Does <xsl:copy> use a lot of memory? Is there an alternative 
that is more efficient?

Memory is used for the source document and for intermediate variables. 
In Saxon, and I suspect in most processors, no memory is used for the 
result tree provided that the transformation is writing directly to a 
serializer.

Intrinsically, all xsl:copy has to do is to send two events - 
startElement and endElement - to the serializer.
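
As an illustration of that point (a minimal sketch, not taken from the stylesheet in this thread), the first rule below makes a shallow copy: xsl:copy contributes only the element's start and end, and everything in between comes from the instructions inside it. The second rule shows xsl:copy-of for contrast; the element name Verbatim is a hypothetical example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

    <!-- Shallow copy: xsl:copy emits only the start and end of the
         current element; attributes and children are produced by the
         instructions inside it, one node at a time. -->
    <xsl:template match="*">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

    <!-- Deep copy, for contrast: xsl:copy-of emits the whole subtree
         of the (hypothetical) Verbatim element in a single instruction. -->
    <xsl:template match="Verbatim">
        <xsl:copy-of select="."/>
    </xsl:template>

</xsl:stylesheet>
```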

I would strongly suspect that the out-of-memory error occurs during 
building of the source tree, and will happen whatever transformation 
you run. For a 370 MB input document, you should probably allocate at 
least 2 GB of memory, preferably more.

Michael Kay
Saxonica

On 02/09/2012 13:47, Costello, Roger L. wrote:
Hi Folks,

Does <xsl:copy> use a lot of memory?

Is there an alternative that is more efficient?

Consider this problem. I have an XML document in which some elements have an 
id attribute and others have an idref attribute. If an element A references 
element B, then I want to embed B inside A.

Example: I want to convert this:

<Test>
     <A idref="b" />
     <B id="b" />
</Test>

to this:

<Test>
     <A>
         <B id="b" />
     </A>
     <B id="b" />
</Test>

Notice that A references B, and after processing B is nested inside A.

Here's a template that handles elements with a reference:

     <xsl:key name="ids" match="*[@id]" use="@id"/>

     <xsl:template match="*[@idref]">
         
         <xsl:variable name="refed-element" select="key('ids', @idref)"/>
         
         <xsl:copy>
             <xsl:copy-of select="@* except @idref" />
             <xsl:sequence select="$refed-element" />
         </xsl:copy>
         
     </xsl:template>

The complete program is below.

It works fine if:

(a) The XML document is small.
(b) I don't have to repeat this embedding process too many times.

However, such is not the case. I am dealing with an XML document that is 370 
MB in size and has tens of thousands of references. And I have to repeat the 
embedding process multiple times.

Saxon gives me an "out of memory error."

I suspect the cause is the <xsl:copy> instruction. I believe it 
is making new copies, thereby consuming lots of memory. True?

So, is there an alternative to <xsl:copy> that is more efficient?

Is there a way to express the above template rule that is more efficient?

/Roger
-----------------------------------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 exclude-result-prefixes="#all"
                 version="2.0">

     <xsl:output method="xml" />
     
     <xsl:key name="ids" match="*[@id]" use="@id"/>
     
     <xsl:template match="*[@idref]">
         
         <xsl:variable name="refed-element" select="key('ids', @idref)"/>
         
         <xsl:copy>
             <xsl:copy-of select="@* except @idref" />
             <xsl:sequence select="$refed-element" />
         </xsl:copy>
         
     </xsl:template>
     
     
     <xsl:template match="node()">
         
         <xsl:copy>
             <xsl:copy-of select="@*"/>
             <xsl:apply-templates />
         </xsl:copy>
         
     </xsl:template>

</xsl:stylesheet>

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--





