xsl-list

Re: Max size?

2003-01-09 07:05:18
Michael Kay wrote:
I can't speak with any authority about Xalan, but my understanding is
that it builds the tree concurrently with doing the transformation; by
the time you've finished, you will normally have the complete tree in
memory. This has benefits, but the memory you need still increases
linearly with document size.
I don't think so. Once incremental processing is enabled, and
provided the style sheet is suitable, you can process much
larger inputs without running out of memory. Well, it still runs
out of memory ultimately, but I don't think it's only
streaming the result; significant parts of the information
associated with the input seem to be discarded too.

The interesting challenge is to work out when you can discard parts of
the tree that won't be needed again. I think this could be done quite
easily for a small class of very simple stylesheets, but the general
problem is quite hard.

I think it should be possible to assert by static analysis
whether a certain template only accesses descendants of the
context node. If this can be asserted for all templates in
the style sheet, and if you can arrange the processing
within a template so that nodes are only accessed once
locally, you can discard nodes processed by directly called
templates from memory.
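
To illustrate the idea of discarding nodes once the code that
handles them has finished, here is a minimal sketch in Python
rather than XSLT. It is not how Xalan works internally; the
function name and the "record" element name are made up for the
example. It assumes the per-record work only looks at the
record's own descendants, which is exactly the condition
described above.

```python
import io
import xml.etree.ElementTree as ET

def summarize_records(source, record_tag="record"):
    """Yield (id, text length) for each record, then free its subtree.

    Hypothetical helper: it assumes the work done per record only
    reads the record element and its descendants.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != record_tag:
            continue
        result = (elem.get("id"), len("".join(elem.itertext())))
        # Nothing outside this subtree was consulted, so its children,
        # text and attributes can be dropped now. The emptied element
        # itself stays attached to its parent.
        elem.clear()
        yield result

doc = (b"<root><record id='a'>hello</record>"
       b"<record id='b'><x>12</x><y>345</y></record></root>")
for rec_id, size in summarize_records(io.BytesIO(doc)):
    print(rec_id, size)
```

If the per-record code needed a sibling or an ancestor, the
subtree could not be cleared this early, which is the hard part
of the general problem.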
Making such assertions shouldn't be that hard if the XPath
expressions within the templates use only nodes from the
descendant-or-self axis. It may be an indication that Xalan's
memory usage increases quite a bit for the same input if the
stylesheet uses a sibling axis somewhere, even if the same
result is produced.
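
A rough sketch of such a check, again in Python and purely
illustrative: it is a lexical scan, not a real XPath parser, and
all names in it are invented for the example. It looks only at
select and test attributes, ignores match patterns, attribute
value templates, variables, and functions such as key() or
document(), and it errs on the side of reporting a template as
non-local.

```python
import re
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

# Anything here can reach outside the subtree of the context node.
_NON_LOCAL = re.compile(
    r"\b(?:ancestor-or-self|ancestor|parent|following-sibling|following|"
    r"preceding-sibling|preceding)\s*::"  # explicit non-local axes
    r"|\.\."                              # abbreviated parent step
    r"|(?:^|[\s(\[,|=<>])/"               # path starting at the document root
)

def is_local(xpath):
    """True if the expression shows no sign of leaving the context subtree."""
    return _NON_LOCAL.search(xpath) is None

def non_local_templates(stylesheet_text):
    """Return the match (or name) of each template that fails the check."""
    root = ET.fromstring(stylesheet_text)
    offenders = []
    for template in root.iter(XSL + "template"):
        exprs = [el.get(attr)
                 for el in template.iter()
                 for attr in ("select", "test")
                 if el.get(attr) is not None]
        if not all(is_local(e) for e in exprs):
            offenders.append(template.get("match") or template.get("name"))
    return offenders
```

A stylesheet for which non_local_templates() returns an empty
list would be a candidate for the discard-as-you-go strategy; a
single following-sibling:: step is enough to rule it out.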
One of the more interesting questions: If you have a schema
for the input and can afford to barf in mid-processing if
the input doesn't validate, the structure information should
allow far better assertions from static analysis.

J.Pietschmann


XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


