Hi,
I have the following task:
Create an arbitrarily formatted file (XML, HTML, CSV, whatever) based
on a SELECT from a database.
One constraint is that the data fetched from the database cannot be
held in memory as a whole.
Another constraint is that I cannot use the XML functionality of the
database; I have to implement this on top of our database access
framework. This framework fetches the records one at a time.
I also have to use Java and Xalan.
My idea was to wrap every row fetched from the database in simple
generic XML and fire it at Xalan.
Here is an example:
If my result set from the database looks like:
ID Name Description
-- ---- -----------
1 "dog" "an animal may be dangerous"
2 "cat" "an animal likes milk"
I create the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<dataset>
  <row>
    <value>1</value>
    <value>dog</value>
    <value>an animal may be dangerous</value>
  </row>
  <row>
    <value>2</value>
    <value>cat</value>
    <value>an animal likes milk</value>
  </row>
</dataset>
I generate this XML as SAX events in a Java class
(StringArrayXMLReader) that implements the org.xml.sax.XMLReader
interface.
The class has three methods:
// Opens the document and the <dataset> root element.
public void init() throws SAXException {
    ch.startDocument();
    ch.startElement("", "dataset", "dataset", EMPTY_ATTR);
}

// Closes the <dataset> root element and the document.
public void close() throws SAXException {
    ch.endElement("", "dataset", "dataset");
    ch.endDocument();
}

// Emits one <row> with one <value> child per array entry.
public void parse(String[] input) throws SAXException {
    ch.startElement("", "row", "row", EMPTY_ATTR);
    for (int i = 0; i < input.length; ++i) {
        ch.startElement("", "value", "value", EMPTY_ATTR);
        ch.characters(input[i].toCharArray(), 0, input[i].length());
        ch.endElement("", "value", "value");
    }
    ch.endElement("", "row", "row");
}
The parse method creates the <row>...</row> entry for the String array
passed to it.
The StringArrayXMLReader is associated with a TransformerHandler, which
uses an XSL stylesheet to transform the XML into the desired output.
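A minimal, self-contained sketch of this wiring, using only the standard
JAXP API. The class name RowStreamSketch, the method name transform and
the in-memory String[][] input are hypothetical stand-ins for the real
reader and the database fetch; the events are fired straight at the
TransformerHandler instead of going through an XMLReader.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical sketch: one <dataset> document, one <row> per record.
public class RowStreamSketch {
    private static final AttributesImpl EMPTY_ATTR = new AttributesImpl();

    public static String transform(String xslt, String[][] rows) {
        try {
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            // The handler receives SAX events and applies the stylesheet.
            TransformerHandler ch = factory.newTransformerHandler(
                new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            ch.setResult(new StreamResult(out));

            // Corresponds to init().
            ch.startDocument();
            ch.startElement("", "dataset", "dataset", EMPTY_ATTR);
            // Corresponds to one parse(String[]) call per fetched record.
            for (String[] row : rows) {
                ch.startElement("", "row", "row", EMPTY_ATTR);
                for (String value : row) {
                    ch.startElement("", "value", "value", EMPTY_ATTR);
                    ch.characters(value.toCharArray(), 0, value.length());
                    ch.endElement("", "value", "value");
                }
                ch.endElement("", "row", "row");
            }
            // Corresponds to close().
            ch.endElement("", "dataset", "dataset");
            ch.endDocument();
            return out.toString();
        } catch (TransformerException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the real setup the StringWriter would be replaced by a file or stream
result, and the loop by the record-by-record fetch.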
When the fetch from the database starts, I call init() (and thus
startDocument()), and after the fetch has finished, I call close()
(and thus endDocument()).
I observed that the XSLT processing only starts when endDocument() is
called.
This is not acceptable for me, because I fear the XSLT processor keeps
all the rows in memory until endDocument() is called, in which case I
risk running into an OutOfMemoryError.
My second idea was to eliminate the init()/close() methods and to
treat each <row>...</row> section as a complete input document for the
processor. This has the disadvantage that I have to create the head and
tail of the output manually (and in my example I get a
NullPointerException when the transformer is called a second time).
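A sketch of how this row-per-document variant could look, under the
assumption (not verified against my test code) that the
NullPointerException comes from reusing one TransformerHandler for a
second document. Here the stylesheet is compiled once into a Templates
object and a fresh handler is created for every row. The names
PerRowSketch, transform, head and tail are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical sketch: every <row> is transformed as its own document.
public class PerRowSketch {
    private static final AttributesImpl EMPTY_ATTR = new AttributesImpl();

    public static String transform(String rowXslt, String head,
                                   String tail, String[][] rows) {
        try {
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            // Compile the stylesheet once; Templates can be reused.
            Templates templates = factory.newTemplates(
                new StreamSource(new StringReader(rowXslt)));
            StringWriter out = new StringWriter();
            out.write(head); // head of the output, written manually
            for (String[] row : rows) {
                // A new handler per row, so only one row is held at a time.
                TransformerHandler ch = factory.newTransformerHandler(templates);
                ch.setResult(new StreamResult(out));
                ch.startDocument();
                ch.startElement("", "row", "row", EMPTY_ATTR);
                for (String value : row) {
                    ch.startElement("", "value", "value", EMPTY_ATTR);
                    ch.characters(value.toCharArray(), 0, value.length());
                    ch.endElement("", "value", "value");
                }
                ch.endElement("", "row", "row");
                ch.endDocument();
            }
            out.write(tail); // tail of the output, written manually
            return out.toString();
        } catch (TransformerException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The stylesheet then only sees a single <row> at a time, so it cannot
refer to other rows, and for XML output it would have to suppress the
XML declaration (omit-xml-declaration="yes") for each fragment.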
I have the following questions:
Is it possible to create the output without having all of the data in
memory?
The input XML for the XSLT processing
<dataset>
<row><value>...
<row><value>...
</dataset>
looks very simple, and the supplied XSL stylesheets will not be
complex, so my hope is to get it working.
I also think that the task in general - producing formatted output from
a potentially very large data set - should be a common one.
Unfortunately I have not done much XSLT processing in the past, so I
lack experience (only a little libxslt, which I fed a DOM tree).
If someone has some useful links, I would be very glad to hear about
them. I have put my test code at:
http://randspringer.de/sax_row.tar and
http://randspringer.de/sax.tar
If someone could have a look at it, I would really appreciate it.
Thomas