You're welcome, Kevin. Please let us know what your findings are!
Cheers,
<prs/>
-----Original Message-----
From: Kevin Rodgers [mailto:kevin(_dot_)rodgers(_at_)ihs(_dot_)com]
Sent: Thursday, January 20, 2005 12:09 p.m.
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] optimization for very large, flat documents
Thanks to everyone who responded. For now I plan to follow Pieter's idea of
chunking the data into manageable pieces (16-64 MB). Then I'm going to look
into Michael's suggestions about STX (unfortunately, not yet a W3C
recommendation and thus not widely implemented) and XQuery.
For anyone interested in some numbers, I've split each of my 2 large files
(613 MB and 656 MB) into subfiles of 16 K independent entries (which vary in
size), yielding sets of 25 and 37 subfiles (of approx. 25 MB and 17 MB each,
respectively). I process them by running Saxon 8.2 from the command line
(with an -Xmx value of 8 times the file size) on a Sun UltraSPARC with 2 GB
of real memory. The 37 subfiles of about 17 MB each are processed with a
slightly simpler stylesheet and take about 1:15 (minutes:seconds) each; the
25 subfiles of about 25 MB each make 1 document() call per entry, a round
trip to a servlet on a different host, and take about 8 minutes each.
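
For anyone who wants to try the same chunking approach, here is a minimal
sketch in Python. It is not the splitter used for the numbers above. It
assumes a flat document in which every direct child of the root is one
independent entry, and it assumes no namespaces. The names split_flat_xml
and xmx_flag are made up for illustration; the factor of 8 is simply the
heap-to-file-size ratio mentioned above.

```python
import io
import math
import xml.etree.ElementTree as ET


def split_flat_xml(source, entries_per_chunk):
    """Yield small XML documents, each holding up to entries_per_chunk entries.

    Assumption: every direct child of the root is one independent entry,
    and the root element carries no namespace or attributes worth keeping.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)          # first event is the start of the root
    root_tag = root.tag
    depth = 0                        # nesting depth below the root
    batch = []
    for event, elem in context:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth == 0:               # a direct child of the root just closed
            elem.tail = None         # drop whitespace between entries
            batch.append(ET.tostring(elem, encoding="unicode"))
            root.clear()             # release entries already serialized
            if len(batch) == entries_per_chunk:
                yield "<%s>%s</%s>" % (root_tag, "".join(batch), root_tag)
                batch = []
    if batch:                        # final, possibly shorter, chunk
        yield "<%s>%s</%s>" % (root_tag, "".join(batch), root_tag)


def xmx_flag(file_size_bytes, factor=8):
    """Build a JVM heap flag sized at `factor` times the input file size."""
    megabytes = math.ceil(file_size_bytes * factor / (1024 * 1024))
    return "-Xmx%dm" % megabytes


# Tiny demonstration: 5 entries split into chunks of 2.
sample = (b"<data>"
          + b"".join(b"<entry><v>%d</v></entry>" % i for i in range(5))
          + b"</data>")
chunks = list(split_flat_xml(io.BytesIO(sample), 2))
heap = xmx_flag(25 * 1024 * 1024)    # heap flag for a 25 MB subfile
```

Because iterparse streams the input and the root is cleared after each
entry, memory use should stay close to the size of one chunk rather than
the whole file. Each yielded string can be written to its own subfile and
handed to the XSLT processor with the computed heap flag.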
My next step is to use Saxon's profiling features to find out where I can
improve my stylesheet's performance.
Thanks again to everyone on xsl-list for all your help!
--
Kevin Rodgers
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--