XSLT processors generally read the whole document into memory. Some products
may be able to avoid this under certain circumstances, for example see
http://www.saxonica.com/documentation/sourcedocs/serial.html for Saxon.
Running one transformation per row is certainly feasible in principle,
though there may be a significant start-up overhead - you'll only find
out by measurement.
Alternatively, why not retrieve the data from the database in
transformer-sized chunks?
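A minimal Java sketch of the chunked approach, using only the standard
JAXP/SAX interfaces. It assumes the rows arrive through an
Iterator<String[]>, which is a stand-in for the database access
framework and not its real API. The class name ChunkedTransform, the
chunkSize parameter and the CSV stylesheet are hypothetical. The
stylesheet is compiled once into a Templates object and every chunk gets
a fresh TransformerHandler, so the processor should only need to hold
one chunk of rows at a time:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

class ChunkedTransform {
    // Hypothetical stylesheet: writes one comma-separated line per <row>.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/dataset'>"
      + "<xsl:for-each select='row'>"
      + "<xsl:for-each select='value'>"
      + "<xsl:if test='position() &gt; 1'>,</xsl:if>"
      + "<xsl:value-of select='.'/>"
      + "</xsl:for-each>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms the rows in chunks of at most chunkSize rows each.
    static void transformInChunks(Iterator<String[]> rows, int chunkSize, Writer out)
            throws TransformerConfigurationException, SAXException, IOException {
        SAXTransformerFactory stf =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        // Compile the stylesheet once; Templates can be reused for every chunk.
        Templates templates = stf.newTemplates(new StreamSource(new StringReader(XSL)));
        AttributesImpl noAtts = new AttributesImpl();

        while (rows.hasNext()) {
            // One handler per chunk: each chunk is a complete <dataset> document.
            TransformerHandler th = stf.newTransformerHandler(templates);
            th.setResult(new StreamResult(out));
            th.startDocument();
            th.startElement("", "dataset", "dataset", noAtts);
            for (int n = 0; n < chunkSize && rows.hasNext(); n++) {
                th.startElement("", "row", "row", noAtts);
                for (String v : rows.next()) {
                    th.startElement("", "value", "value", noAtts);
                    th.characters(v.toCharArray(), 0, v.length());
                    th.endElement("", "value", "value");
                }
                th.endElement("", "row", "row");
            }
            th.endElement("", "dataset", "dataset");
            th.endDocument();
        }
        out.flush();
    }

    public static void main(String[] args) throws Exception {
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "dog", "an animal may be dangerous"},
            new String[] {"2", "cat", "an animal likes milk"});
        StringWriter out = new StringWriter();
        transformInChunks(rows.iterator(), 1, out);
        System.out.print(out);
    }
}
```

Because the stylesheet uses the text output method, the output of
successive chunks can simply be appended. With XML or HTML output the
enclosing head and tail would have to be written separately, and the
XML declaration suppressed for each chunk. A suitable chunk size is
something to find by measurement.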
Michael Kay
http://www.saxonica.com/
-----Original Message-----
From: Thomas Porschberg [mailto:thomas(_dot_)porschberg(_at_)osp-dd(_dot_)de]
Sent: 19 April 2006 13:36
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] memory usage of xslt processing
Hi,
I have the following task:
Create an arbitrary formatted file (XML/HTML/CSV whatever)
based on a Select from a database.
As a constraint the amount of data fetched from the database
can not be stored in memory as a whole.
Another constraint is that I can not use XML-functionality in
the database, I have to implement the functionality on top of
our database access framework. This database access framework
fetches record for record one after another.
And I have to use Java and Xalan.
My idea was to decorate every fetched row from the database
with simple generic XML and fire this to Xalan.
Here is an example:
If my result set from the database looks like:
ID  Name   Description
--  ----   -----------
1   "dog"  "an animal may be dangerous"
2   "cat"  "an animal likes milk"
I create the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<dataset>
  <row>
    <value>1</value>
    <value>dog</value>
    <value>an animal may be dangerous</value>
  </row>
  <row>
    <value>2</value>
    <value>cat</value>
    <value>an animal likes milk</value>
  </row>
</dataset>
I generate this XML as SAX events in a Java class,
StringArrayXMLReader, which implements the
org.xml.sax.XMLReader interface.
I have three methods:
// Open the document and the <dataset> root element.
public void init() throws SAXException {
    ch.startDocument();
    ch.startElement("", "dataset", "dataset", EMPTY_ATTR);
}

// Close the <dataset> root element and the document.
public void close() throws SAXException {
    ch.endElement("", "dataset", "dataset");
    ch.endDocument();
}

// Emit one <row> with a <value> child per array entry.
public void parse(String[] input) throws SAXException {
    ch.startElement("", "row", "row", EMPTY_ATTR);
    for (int i = 0; i < input.length; ++i) {
        ch.startElement("", "value", "value", EMPTY_ATTR);
        ch.characters(input[i].toCharArray(), 0, input[i].length());
        ch.endElement("", "value", "value");
    }
    ch.endElement("", "row", "row");
}
The parse method emits one <row>...</row> element for the
String array passed to it.
The StringArrayXMLReader is associated with a
TransformerHandler, which uses an XSL stylesheet to transform
the XML to the desired output.
When the fetch from the database starts, I call init() (and
thus startDocument()); after the fetch has finished, I call
close() (and thus endDocument()).
I observed that the XSLT processing starts when endDocument()
is called.
This is not acceptable for me because I fear the XSLT
processor reads all the rows into memory until endDocument()
is called, in which case I risk running into an OutOfMemoryError.
My second idea was to eliminate the init()/close() methods
and to treat each <row>...</row> section as a complete
input document for the processor. This has the disadvantage
that I have to create the head and tail of the document
manually (and in my example I get a NullPointerException when
the transformer is called twice).
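A rough Java sketch of this second idea, for illustration only: each
row becomes its own small input document, and the head and tail of the
output are written by hand. The class name RowAtATime, the <table>
wrapper and the stylesheet are hypothetical, and the Iterator<String[]>
stands in for the database access framework. It does not diagnose the
NullPointerException; it only relies on the JAXP rule that a single
Transformer may be used for several transformations one after another
(not concurrently):

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RowAtATime {
    // Hypothetical stylesheet: turns a single <row> document into one <tr>.
    // omit-xml-declaration keeps an XML declaration out of every fragment.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/row'>"
      + "<tr><xsl:for-each select='value'>"
      + "<td><xsl:value-of select='.'/></td>"
      + "</xsl:for-each></tr>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static void write(Iterator<String[]> rows, Writer out) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // The stylesheet is compiled once; the Transformer is reused sequentially.
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));

        out.write("<table>\n");                 // head, written manually
        while (rows.hasNext()) {
            // Build a tiny document holding just this one row.
            Document doc = db.newDocument();
            Element row = doc.createElement("row");
            doc.appendChild(row);
            for (String v : rows.next()) {
                Element value = doc.createElement("value");
                value.setTextContent(v);
                row.appendChild(value);
            }
            t.transform(new DOMSource(doc), new StreamResult(out));
            out.write("\n");
        }
        out.write("</table>\n");                // tail, written manually
        out.flush();
    }

    public static void main(String[] args) throws Exception {
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "dog", "an animal may be dangerous"},
            new String[] {"2", "cat", "an animal likes milk"});
        StringWriter out = new StringWriter();
        write(rows.iterator(), out);
        System.out.print(out);
    }
}
```

Only one row is held as a tree at any time, at the price of one
transformation per row, so the per-row overhead is the thing to measure.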
I have the following questions:
Is it possible to create the output without holding all the
data in memory?
The input XML for the XSLT processing
<dataset>
<row><value>...
<row><value>...
</dataset>
looks very simple, and the supplied XSL stylesheets will
not be complex, so my hope is to get it working.
I also think that the task in general - producing formatted
output from a potentially very large data pool - should be a common one.
Unfortunately I have not done much XSLT processing in the past,
so I lack the experience (only a bit of libxslt, which I fed a DOM tree).
If someone has some useful links I would be very glad to hear of them.
My test code is available at:
http://randspringer.de/sax_row.tar and
http://randspringer.de/sax.tar
If someone could have a look at it I would really appreciate it.
Thomas
--
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--