RE: parsing large xml files using Saxon 6.5.2

2003-08-11 09:00:57
So what's the difference between the 18.1Mb run that ran "for hours",
and the 19.2Mb run that ran in 26 seconds? Somewhere there is a
significant difference that explains the problem, and you haven't given
us enough information to find it.

Running with the -T option can be useful. It will produce far more
information than you can analyse, and will slow down processing
considerably, but it should give you some indication as to whether the
processing is hung, looping, or just doing a lot of work.

The evidence of your measurements is that the stylesheet's performance
is essentially linear.
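
One way to check that reading is to divide each elapsed time by the file size, using the five little.xml timings quoted in the message below. This is only a rough sketch: the function name is made up for illustration, and it assumes the reported times are all comparable wall-clock figures.

```python
# (size in Mb, elapsed seconds) as reported for little.xml in the quoted message
runs = [(1.4, 1.2), (4.4, 3.3), (7.3, 6.0), (10.3, 9.8), (19.2, 26.1)]


def seconds_per_mb(measurements):
    """Return the cost per megabyte for each (size, seconds) pair."""
    return [round(seconds / size, 2) for size, seconds in measurements]


rates = seconds_per_mb(runs)
for (size, seconds), rate in zip(runs, rates):
    print(f"{size:5.1f} Mb  {seconds:5.1f} s  {rate:.2f} s/Mb")

# Ratio between the most and least expensive megabyte across all runs
spread = max(rates) / min(rates)
print(f"spread: {spread:.2f}x")
```

The cost per megabyte stays within a factor of two while the input grows roughly 14-fold. That is a mild upward drift rather than strict linearity, but nothing in it predicts a run of hours for an 18.1Mb file.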

I would advise, by the way, moving off Instant Saxon to full Saxon for
any serious work. The Microsoft Java VM is now a thing of the past, so
any benefits that Instant Saxon once offered have pretty well
disappeared.

Michael Kay

-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com 
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com] On Behalf Of 
marina
Sent: 11 August 2003 13:36
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] parsing large xml files using Saxon 6.5.2


Hi,

I am having problems parsing some XML files. I have a 1 GHz
processor and 256 MB of RAM.

The XSLT stylesheet "wordgroup.xsl" from Dimitri (thank you!) was
tested and worked perfectly on smaller test files. When I run it on
a larger file, "1cl.xml" (18.1Mb), it builds the tree for
strSplit-to-Words.xsl and then sits there for hours.

See output below.

--------------------------------------------------------------
Microsoft Windows 2000 [Version 5.00.2195]
(C) Copyright 1985-2000 Microsoft Corp.

h:\saxon\testbed>saxon -t -o output.txt 1cl.xml wordgroup.xsl
SAXON 6.5.2 from Michael Kay
Java version 1.1.4
Preparation time: 371 milliseconds
Processing file:/h:/saxon/testbed/1cl.xml
Building tree for file:/h:/saxon/testbed/1cl.xml using class com.icl.saxon.tinytree.TinyBuilder
Tree built in 7070 milliseconds
Building tree for file:/h:/saxon/testbed/strSplit-to-Words.xsl using class com.icl.saxon.tinytree.TinyBuilder
Tree built in 10 milliseconds

--------------------------------------------------------------


So I made another XML file, "little.xml", by pasting sections of
1cl.xml in different sizes to see where it was having problems
processing.

little.xml =  1.4Mb   time =  1.2 sec
little.xml =  4.4Mb   time =  3.3 sec
little.xml =  7.3Mb   time =  6 sec
little.xml = 10.3Mb   time =  9.8 sec
little.xml = 19.2Mb   time = 26.1 sec! (bigger than the file I want
to parse! see nice output below)


h:\saxon\testbed>saxon -t -o output.txt little.xml wordgroup.xsl
SAXON 6.5.2 from Michael Kay
Java version 1.1.4
Preparation time: 701 milliseconds
Processing file:/h:/saxon/testbed/little.xml
Building tree for file:/h:/saxon/testbed/little.xml using class com.icl.saxon.tinytree.TinyBuilder
Tree built in 7912 milliseconds
Building tree for file:/h:/saxon/testbed/strSplit-to-Words.xsl using class com.icl.saxon.tinytree.TinyBuilder
Tree built in 20 milliseconds
Execution time: 26178 milliseconds

Any ideas for me to try?

Thanks

Marina




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


