xsl-list

Re: GByte Transforms

2004-06-02 15:24:50

Hi Jeff,

Comments below.

On Wednesday 02 June 2004 21:40, Jeff Kenton wrote:

Ignore the problem, leave to stylesheet writer testing.

Can't do this.  People do the strangest things in stylesheets, as any
reader of this list knows.  Your job is to take anything a customer might
throw your way, no matter how weird, and "do the right thing".

OK, I was really just trying that one on. I think performance would be too
unpredictable if we did that, which is why we could do with some thoughts on
this.


Extra smarts in the compiler to warn of the use of potentially non-linear
behaviour, e.g. recursive templates that are not tail recursive, or nested
loop/template constructions.

As above but aided by structural information for better targeting.

Sure, the more diagnostics for the user, the better.  But be prepared for
users ignoring them, and customers that set things up so that users never
see the warnings.

That is rather an obvious drawback of just warning about potential problems,
although I have always liked the high level of feedback Saxon gives about
stylesheet issues.
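
To make the "warn about nested loops" idea concrete, here is a minimal sketch in Python of what such a check might look like. It is only an illustration, not how Saxon or any other processor does it: the function name nested_loop_warnings and the SAMPLE stylesheet are made up for this example, and nesting depth is a crude heuristic (an inner loop over the children of the current node is still linear overall, while an inner "//customer" is not).

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

# Hypothetical stylesheet, used only to exercise the check below.
SAMPLE = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:for-each select="//order">
      <xsl:for-each select="//customer">
        <xsl:value-of select="name"/>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>"""


def nested_loop_warnings(stylesheet_text):
    """Return one warning per xsl:for-each nested inside another one."""
    root = ET.fromstring(stylesheet_text)
    warnings = []

    def walk(elem, depth):
        # depth = number of xsl:for-each elements enclosing elem
        if elem.tag == XSL + "for-each":
            if depth >= 1:
                warnings.append(
                    "nested for-each over '%s' (loop depth %d)"
                    % (elem.get("select"), depth + 1))
            depth += 1
        for child in elem:
            walk(child, depth)

    walk(root, 0)
    return warnings


for message in nested_loop_warnings(SAMPLE):
    print("warning:", message)
```

A real compiler would also have to follow xsl:apply-templates and xsl:call-template to catch loops hidden in template calls, which is where the structural information mentioned above would help with targeting.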

Subset XSLT to limit the scope for non-linear transforms.

It often comes down to that.  The other way to look at the problem of
"streaming" large input files is to analyze the stylesheet and try to
decide how much of the input you need to keep during processing. For some
operations, only the current node is necessary.  More often, keeping just
the path from the root to the current node will work (as another poster
suggested). Sometimes, you need the entire input tree, and you're not
really "streaming" anymore.  Consider it a continuum, rather than just a
binary "can I stream this stylesheet or not" question.
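
As an illustration of the middle of that continuum, the following Python sketch processes a document while keeping only the path from the root to the current node. It is an assumption-laden example rather than any XSLT processor's actual strategy: stream_paths is a made-up name, and it relies on the standard library's xml.etree.ElementTree.iterparse, discarding each element as soon as its end tag has been seen.

```python
import io
import xml.etree.ElementTree as ET


def stream_paths(source):
    """Yield (path, text) per element, retaining only the open ancestors."""
    stack = []  # currently open elements, root first
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem)
        else:
            path = "/" + "/".join(e.tag for e in stack)
            yield path, (elem.text or "").strip()
            stack.pop()
            elem.clear()                # drop the finished element's content
            if stack:
                stack[-1].remove(elem)  # detach it from its parent


doc = io.BytesIO(b"<a><b>x</b><c><d>y</d></c></a>")
for path, text in stream_paths(doc):
    print(path, repr(text))
```

Memory use here is bounded by the depth of the document plus the content of the one element being closed, not by the document's total size. A stylesheet that needs preceding siblings or arbitrary "//" lookups would not fit this model and would have to keep more of the tree.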

Nice point about it not being a binary choice, although I think streamability
is perhaps only a sub-plot here. I suspect trivially streamable transforms
may always have a linear performance characteristic by definition, but the
reverse does not hold, i.e. things that are streamable only via re-writing, or
not streamable at all, may also have linear performance. I will have a closer
look at that relationship to see if it holds anything we can use.

Thanks,
Kev