xsl-list

[xsl] Transform a million XML documents

2017-02-10 12:43:15
Hi Folks,

Eliot Kimber raised a neat question on the SAXON mailing list. 

Here is a summary of the ensuing discussion.

Scenario: There are a million XML documents that need to be transformed. Each 
file is in the 1-4KB range. The files are organized into directories about 4 or 
5 deep and some directories have 100s or 1000s of files.

Use XSLT to do the transformations. 

Specifically, use the XPath collection() function along with the Saxon 
extension function saxon:discard-document().

Transforming a million files is easily handled by Saxon-EE, which uses multiple 
threads for document parsing (likewise, xsl:result-document will use multiple 
threads for writing the results). A key thing to remember is to call 
saxon:discard-document() on each source document, so that it is removed from 
the document pool and can be garbage collected after processing.

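Here is the XSLT code (f:doTransform stands for your own stylesheet function; 
note that saxon:discard-document() is applied to the source document):

<xsl:for-each select="for $x in 
collection('file:///c:/path/to/xml?select=*.xml;recurse=yes;on-error=ignore') 
return f:doTransform(saxon:discard-document($x))">
  ...
</xsl:for-each>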
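
For anyone who wants to try this end to end, here is a minimal sketch of a 
complete stylesheet built around the same idea. It is an illustration under 
stated assumptions, not the code from the original discussion: the parameter 
names input-dir and output-dir, the output path, and the naming scheme for 
result files are all made up, and the identity transform is only a placeholder 
for the real templates. It assumes a Saxon release that supports XSLT 3.0.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Hypothetical parameters: adjust both to the local layout. -->
  <xsl:param name="input-dir" select="'file:///c:/path/to/xml'"/>
  <xsl:param name="output-dir" select="'file:///c:/path/to/out'"/>

  <!-- Placeholder transformation: copy everything unchanged.
       Replace with the real template rules. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template name="main">
    <!-- Each document is removed from the document pool as it is
         selected, so it can be garbage collected once processed. -->
    <xsl:for-each select="collection(concat($input-dir,
        '?select=*.xml;recurse=yes;on-error=ignore'))/saxon:discard-document(.)">
      <!-- position() is prefixed so that files with the same name in
           different directories do not overwrite each other. -->
      <xsl:result-document
          href="{$output-dir}/{position()}-{tokenize(base-uri(.), '/')[last()]}">
        <xsl:apply-templates select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Since there is no single source document, a stylesheet like this would be 
started from the named template, for example with the -it:main option of the 
Saxon command line. Results are written flat into one directory here to keep 
the sketch short; mirroring the input directory structure would need extra 
path handling.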