
Re: [xsl] Running the same transformation on many input files, optimisation possible?

2019-12-15 11:49:43
I would definitely use the `collection()` function, then would try to
process the documents in parallel using the `saxon:threads` extension
attributes with a value dependent on the number of cores on the machine.

https://www.saxonica.com/html/documentation/extensions/attributes/threads.html
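
A minimal sketch of that idea, assuming the input files sit in one directory and the per-document rules stay in the existing transform.xsl. The parameter names `input-dir` and `output-dir`, the `main` template name, and the thread count of 4 are hypothetical. `?select=*.xml` is Saxon's directory-collection syntax; `saxon:threads` is, as far as I know, honoured only by Saxon-EE releases much newer than 8.7.3, and processors that do not recognise it simply run the loop sequentially.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Assumption: the existing per-document rules are reused unchanged -->
  <xsl:import href="transform.xsl"/>

  <!-- Hypothetical parameters: directory URIs, each ending with a slash -->
  <xsl:param name="input-dir" select="'file:///path/to/documents/'"/>
  <xsl:param name="output-dir" select="'file:///path/to/output/'"/>

  <!-- Entry point: invoked as the initial named template, no source document needed -->
  <xsl:template name="main">
    <!-- One JVM, one compiled stylesheet, many inputs; the thread count is a guess -->
    <xsl:for-each select="collection(concat($input-dir, '?select=*.xml'))"
                  saxon:threads="4">
      <!-- Derive foo.new.xml from foo.xml, mirroring the shell script's naming -->
      <xsl:variable name="name"
                    select="tokenize(string(base-uri(.)), '/')[last()]"/>
      <xsl:variable name="target"
                    select="concat($output-dir, replace($name, '\.xml$', '.new.xml'))"/>
      <xsl:result-document href="{$target}">
        <xsl:apply-templates select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

One design point: the original script passes `doc` and `file` as global parameters, which cannot change between documents within a single run, so those values would have to be derived from each document's URI or passed down as template parameters instead.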


Trying to generalize this a little bit further: if we have N machines, we
could send N HTTP requests (why not using the document() function?), giving
each machine a non-overlapping pattern for the set of files it should
process.
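
As a rough sketch of the "non-overlapping pattern" part only, each machine could run the same stylesheet with a different file-name pattern passed in as a parameter. The parameter names `input-dir` and `pattern` and the example patterns are hypothetical, and how the N requests are actually dispatched is left open here, just as it is above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameters, set differently on each machine,
       e.g. pattern=a*.xml on one and pattern=b*.xml on another -->
  <xsl:param name="input-dir" select="'file:///path/to/documents/'"/>
  <xsl:param name="pattern" select="'*.xml'"/>

  <xsl:template name="main">
    <!-- Only the files matching this machine's pattern are loaded -->
    <xsl:for-each select="collection(concat($input-dir, '?select=', $pattern))">
      <xsl:message>Processing <xsl:value-of select="base-uri(.)"/></xsl:message>
      <!-- the per-document work would go here -->
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The patterns have to be chosen so that together they cover every file exactly once; nothing in the stylesheet checks that.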

Of course, besides using `collection()` extensively, I haven't ever tried
the other stuff I proposed above -- would be really interesting to try.

Cheers,
Dimitre

On Sun, Dec 15, 2019 at 1:02 AM Trevor Nicholls 
trevor(_at_)castingthevoid(_dot_)com <
xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Hi

An application I am working on contains a large number of source documents
which are all run through the same series of transformations. While
initially the build process didn't take long, the cost of repeatedly
initialising the XSL processor soon adds up, so I am looking at ways to
streamline it.

Our processor of choice is Saxon (currently we are using 8.7.3) so I can
shift this question to the Saxon list if there are extensions there that
are relevant.

So the question: given a script that essentially includes the following:



cd documents
for d in `cat dlist`; do
  cd $d
  for f in `cat flist`; do
    java -jar $SAXONDIR/saxon8.jar -o $f.new.xml $f.xml \
      $SCRIPTDIR/transform.xsl doc=$d file=$f
  done
done



is there a mechanism which would allow a single Java process to perform
the equivalent?

Thanks

T

XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/782854>



-- 
Cheers,
Dimitre Novatchev