
Re: concatenating 1-n XML files

2002-11-12 09:02:30
1. Can I use the document() function with wildcards in the document name?
Something like: <xsl:for-each select="document('name*.xml')">?

Nope. document() expects a URI as its input, and URIs don't have the 
concept of multiple files or wildcards.

What you could do is develop a list of files first (perhaps using 
extension functions to scan the filesystem to fill in the wildcards, or 
perhaps computing that list before you start the stylesheet and passing it 
in) and then iterate through that list, using document() to retrieve each 
one.
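
A minimal sketch of the "compute the list first" route, in Java since the processor in question is Xalan-J. It assumes the JDK's bundled JAXP transformer. The class name ConcatByIndex, the <files>/<file> index format, and the <merged> wrapper element are all made up for illustration. The wildcard is expanded outside XSLT, the matches are written into a small index document, and the stylesheet walks that index and calls document() on each entry:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class ConcatByIndex {

    // The stylesheet's input is the generated index, not the data files.
    // Each <file> holds an absolute URI that document() can load.
    static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output omit-xml-declaration='yes'/>"
        + "<xsl:template match='/files'>"
        + "<merged>"
        + "<xsl:for-each select='file'>"
        + "<xsl:copy-of select='document(.)/*'/>"
        + "</xsl:for-each>"
        + "</merged>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Expand the wildcard in Java and describe the matches as XML.
    static String buildIndex(Path dir, String glob) {
        List<String> uris = new ArrayList<>();
        try (DirectoryStream<Path> matches = Files.newDirectoryStream(dir, glob)) {
            for (Path p : matches) {
                uris.add(p.toUri().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(uris); // directory listing order is unspecified
        StringBuilder sb = new StringBuilder("<files>");
        for (String uri : uris) {
            String escaped = uri.replace("&", "&amp;").replace("<", "&lt;");
            sb.append("<file>").append(escaped).append("</file>");
        }
        return sb.append("</files>").toString();
    }

    // Run the stylesheet over the index and return the merged document.
    static String concat(Path dir, String glob) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            StreamSource index = new StreamSource(
                new StringReader(buildIndex(dir, glob)), dir.toUri().toString());
            t.transform(index, new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical usage: java ConcatByIndex /some/dir "name*.xml"
        System.out.println(concat(Paths.get(args[0]), args[1]));
    }
}
```

The index could just as well be written to disk by a shell script or build step; the point is only that the stylesheet never sees a wildcard, just a list of plain URIs.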

2. Would a large number of files (thousands) have an adverse effect
on performance? Or, once a document is processed, is it retained
in memory or released? Is there an upper limit on the number of
elements that can be processed? I'll be using Xalan.

Implementation dependent. I believe Xalan currently caches documents in 
case you refer to them again, so this would burn memory; I can't vouch for 
other processors. There's been discussion of making the memory management 
more controllable and/or smarter, but I don't think those ideas have made 
it into the code yet. (I may be wrong; it's been a while since I've dug 
into that corner.)

If the documents are small enough, Xalan might be able to handle them all 
at once. Storage allocation is a bit weird in Xalan-J, but theoretically 
we could handle up to 2^16 documents at a time as long as no one of them 
is larger than 2^16 nodes; larger documents occupy multiple "DTM ID" slots 
and thus cut into the number of documents available. Realistically, you'll 
run out of memory long before you run into this barrier.
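
To make the arithmetic concrete, here is a back-of-envelope sketch in Java. It is only a reading of the figures above, not Xalan's actual allocator: it assumes 2^16 ID slots in total, 2^16 nodes per slot, and that a document takes as many whole slots as it needs. The class and method names are hypothetical.

```java
final class DtmSlotEstimate {
    static final long SLOT_COUNT = 1L << 16;      // assumed: 65,536 ID slots in total
    static final long NODES_PER_SLOT = 1L << 16;  // assumed: 65,536 nodes per slot

    // Slots one document would occupy: ceil(nodeCount / NODES_PER_SLOT),
    // and never less than one, however small the document is.
    static long slotsFor(long nodeCount) {
        return Math.max(1, (nodeCount + NODES_PER_SLOT - 1) / NODES_PER_SLOT);
    }

    // How many documents of one uniform size fit before the slots run out.
    static long maxDocuments(long nodesPerDocument) {
        return SLOT_COUNT / slotsFor(nodesPerDocument);
    }

    public static void main(String[] args) {
        System.out.println(maxDocuments(1_000));     // 65536 (one slot each)
        System.out.println(maxDocuments(65_537));    // 32768 (two slots each)
        System.out.println(maxDocuments(1_000_000)); // 4096  (sixteen slots each)
    }
}
```

Under these assumptions, thousands of small files fit comfortably within the ID space, which is why memory rather than the slot count is the practical limit.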

Other suggestions?

Process the documents individually to extract just the info you need, then 
do a merge on the resulting smaller documents?

If it's really thousands of documents, you might want to consider a 
hardcoded SAX solution. Or -- again -- a SAX preprocessing stage to select 
what you need and merge it into a single document, which would then be 
styled.
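
A rough sketch of what such a SAX preprocessing stage might look like in Java, using only JAXP classes from the JDK. The names SaxMerge and Selector and the <merged> wrapper element are invented for illustration, and it assumes the input documents use no namespaces and that only elements and text need to be kept. Each file is streamed through a filter that forwards just the subtrees rooted at one element name, and an identity TransformerHandler serializes the result as a single document:

```java
import java.io.File;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

final class SaxMerge {

    // Forwards only the subtrees rooted at elements named `wanted`.
    static final class Selector extends DefaultHandler {
        private final String wanted;
        private final TransformerHandler out;
        private int depth = 0; // > 0 while inside a selected subtree

        Selector(String wanted, TransformerHandler out) {
            this.wanted = wanted;
            this.out = out;
        }

        @Override
        public void startElement(String uri, String local, String qName,
                                 Attributes atts) throws SAXException {
            if (depth > 0 || qName.equals(wanted)) {
                depth++;
                out.startElement(uri, local, qName, atts);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName)
                throws SAXException {
            if (depth > 0) {
                out.endElement(uri, local, qName);
                depth--;
            }
        }

        @Override
        public void characters(char[] ch, int start, int len) throws SAXException {
            if (depth > 0) {
                out.characters(ch, start, len);
            }
        }
    }

    // Stream every file through the selector into one <merged> document.
    static String merge(List<File> files, String wanted) {
        try {
            SAXTransformerFactory tf =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler out = tf.newTransformerHandler(); // identity serializer
            out.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sink = new StringWriter();
            out.setResult(new StreamResult(sink));

            SAXParserFactory pf = SAXParserFactory.newInstance();
            pf.setNamespaceAware(true);

            out.startDocument();
            out.startElement("", "merged", "merged", new AttributesImpl());
            for (File f : files) {
                XMLReader reader = pf.newSAXParser().getXMLReader();
                reader.setContentHandler(new Selector(wanted, out));
                reader.parse(f.toURI().toString()); // one file in memory at a time
            }
            out.endElement("", "merged", "merged");
            out.endDocument();
            return sink.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical usage: java SaxMerge item a.xml b.xml ...
        List<File> files = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            files.add(new File(args[i]));
        }
        System.out.println(merge(files, args[0]));
    }
}
```

Since nothing is retained between files, memory use should track the size of the extracted output rather than the total input. The merged result can then be handed to the stylesheet as an ordinary single source document.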

______________________________________
Joe Kesselman  / IBM Research

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


