Is this advice true? If I need the same 'select' ten (or ten thousand)
times in a program,
isn't it just simpler to write that select and let the implementation
memoize it?
Yes, it would be simpler, but what if the application isn't simple?
And what if it is, but by binding 'select' results to global variables instead
of allowing the implementation to cache results when needed, you waste
memory? You cannot deallocate a variable in XSLT, can you? Thus whatever is
bound globally stays bound forever.
Its value will not change since the data is the same. Why should I manually
'cache' values if the data model allows it to be done consistently and
automatically?
I would agree if implementations cached it automatically; then I wouldn't have
to do it by hand. But I don't write processor code, and I have no idea how easy
or difficult such a thing would be. I do know, however, from other threads
started by you actually, that many possible performance optimizations are
not implemented by all processors. This is true, if I understand it correctly,
of tail recursion, which is optimized by Saxon and jd.xslt but not by other
processors.
1) For C, the result of % with negative arguments was implementation-defined
before C99 (which mandates truncation toward zero). Even if a particular
implementation gives the desirable result, one should not rely on this
experience if the program is intended for use with more than one compiler, or
more than one version of the same compiler.
2) For Scheme, proper handling of tail recursion is required: an interpreter
that does not turn tail recursion into a loop does not conform to the Report.
I am trying to find out what considerations one can rely upon when designing a
transformation. I've found out that:
+ only direct (self-) tail recursion is properly turned into a loop, in the
implementations that support the optimization at all; thus complex loops are
problematic, but simple ones should work well with good implementations;
+ access by position to a nodeset exhibits constant time in advanced
implementations (and linear in
others);
+ results of expressions with predicates not referencing current() are memoized
in many simple cases
(but I cannot say precisely what counts as a simple case).
Exploring the source code of an implementation of XSLT is a lot of fun. The
authors are great engineers.
Could I please know what features I _can_ rely upon? I cannot rely on:
- writing repeating computations with iterators, since there are no mutable
objects;
- writing repeating computations through recursion, since anything but the
simplest case is not optimized anywhere;
- just specifying a predicate in a non-recursive way, since while some
implementations
will optimize it, others will compute it each time they meet it;
- subsetting a nodeset, since while in some implementations it is fast, in
others it
doubles memory for each removed node and takes considerable time;
- accessing a nodeset by position, since there are implementations with access
time proportional
to the position() in a nodeset.
XSLT is a great language. Its restrictions permit implementations that can
execute fast both 'declarative' transformations (that is, with expressions
describing relations, not algorithms -- such expressions would take long to
compute in Java, but can be computed economically in XSLT) and 'functional'
ones (that is, with multiple calls of chained computations).
On the other hand, a basic, yet conforming, implementation (a number of which
are currently in wide use)
makes XSLT inconvenient without language extensions and fast processors.
What do I have -- at least with good processors? What can I use in my
stylesheets without fear that constant time will turn quadratic in the next
version of the same processor?
David Tolpin
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list