xsl-list
Re: [xsl] "Heap" of trouble handling input file of 500 MByte

2011-02-21 17:21:05
On 21/02/2011 21:24, thehulk(_at_)comcast(_dot_)net wrote:
> Thanks for all these suggestions. I tried to use Saxon but ran into typical
> problems. I have not found an endorsed dir anywhere, and after looking at about
> a dozen webpages, I am ready to give up and ask: how to put it into the
> endorsed dir? Also very usable: how to make this one program use the Saxon
> classes?
> I did download files saxon8.jar and saxon9he.jar.


If your application is using the JAXP interfaces, and if you have access to the source code, then by far the simplest way to switch it to using Saxon is to change the line

TransformerFactory f = TransformerFactory.newInstance();

to say instead

TransformerFactory f = new net.sf.saxon.TransformerFactoryImpl();

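If you don't have access to the source code, then the simplest approach is to set the Java system property javax.xml.transform.TransformerFactory to the value "net.sf.saxon.TransformerFactoryImpl". Either way, the Saxon jar (for example saxon9he.jar) needs to be on the application's classpath.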
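
As a rough sketch of the system-property route, the property can also be set from code before the first call to TransformerFactory.newInstance(). The class name SaxonFactorySelector and the fallback to the JDK's default processor are my own assumptions for illustration, not part of JAXP or Saxon; the only real names are the property key and the Saxon factory class given above.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

// Hypothetical helper: asks JAXP for Saxon, falls back to the JDK default
// processor if the Saxon jar is not on the classpath.
public class SaxonFactorySelector {
    static final String FACTORY_KEY = "javax.xml.transform.TransformerFactory";
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    static TransformerFactory newFactory() {
        String previous = System.getProperty(FACTORY_KEY);
        System.setProperty(FACTORY_KEY, SAXON_FACTORY);
        try {
            // JAXP consults the system property before any other lookup.
            return TransformerFactory.newInstance();
        } catch (TransformerFactoryConfigurationError e) {
            // The Saxon class could not be loaded: restore the old setting
            // and let JAXP pick its default implementation.
            if (previous == null) {
                System.clearProperty(FACTORY_KEY);
            } else {
                System.setProperty(FACTORY_KEY, previous);
            }
            return TransformerFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        TransformerFactory f = newFactory();
        System.out.println("Using " + f.getClass().getName());
    }
}
```

When the source code cannot be touched at all, the same property can be passed on the command line with the JVM's -D option, along the lines of java -Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl -cp saxon9he.jar:app.jar MainClass, where app.jar and MainClass are placeholders for your own application (use ; instead of : as the classpath separator on Windows).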

However, if you are processing 500 MB input documents, you will still need considerably more than the 1 GB of heap space that you seem able to allocate. I would recommend a minimum of four times the input document size (about 2 GB in this case), and it can be more than that depending on the nature of the XML source.
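
To make the arithmetic concrete, here is a small sketch that compares the JVM's maximum heap with four times the size of an input file. The class HeapCheck and its method names are hypothetical, and the factor of 4 is only the rule-of-thumb minimum mentioned above, not a guarantee.

```java
import java.io.File;

// Hypothetical pre-flight check: is the heap at least 4x the input size?
public class HeapCheck {
    // Rule-of-thumb minimum from the text; real needs may be higher.
    static final long HEAP_MULTIPLIER = 4;

    static long recommendedHeapBytes(long inputBytes) {
        return inputBytes * HEAP_MULTIPLIER;
    }

    static boolean heapLooksSufficient(long inputBytes, long maxHeapBytes) {
        return maxHeapBytes >= recommendedHeapBytes(inputBytes);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java HeapCheck <input.xml>");
            return;
        }
        long inputBytes = new File(args[0]).length();
        // maxMemory() reports the limit set by -Xmx (or the JVM default).
        long maxHeap = Runtime.getRuntime().maxMemory();
        long mb = 1024L * 1024L;
        System.out.println("Input size:       " + inputBytes / mb + " MB");
        System.out.println("Recommended heap: " + recommendedHeapBytes(inputBytes) / mb + " MB");
        System.out.println("Available heap:   " + maxHeap / mb + " MB");
        if (!heapLooksSufficient(inputBytes, maxHeap)) {
            System.out.println("Heap is probably too small; try a larger -Xmx value.");
        }
    }
}
```

The heap limit itself is raised with the standard -Xmx option when starting the JVM, for example java -Xmx2g followed by the usual arguments.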

A surprisingly simple measure that is often overlooked is to add <xsl:strip-space elements="*"/> to the stylesheet. However, it doesn't make much difference on recent versions of Saxon, because Saxon will compress the whitespace anyway.
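
For illustration, the following sketch runs an identity stylesheet containing xsl:strip-space through whatever JAXP processor is configured, so the effect on whitespace-only text nodes can be seen on a tiny input. The class StripSpaceDemo, the stylesheet, and the sample document are made up for this example; only the xsl:strip-space declaration comes from the advice above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo: identity transform with whitespace stripping enabled.
public class StripSpaceDemo {
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      // Drop whitespace-only text nodes from the source tree.
      + "<xsl:strip-space elements=\"*\"/>"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      // Identity template: copy everything else unchanged.
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer t = factory.newTransformer(
                new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers do not need a throws clause in this sketch.
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String input = "<a>\n  <b>x</b>\n</a>";
        System.out.println(transform(input));
    }
}
```

The indentation between the elements of the sample input consists of whitespace-only text nodes, which is what the declaration removes; text such as the x inside b is left alone.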

Michael Kay
Saxonica
