
Re: [xsl] Approach to transform 250GB xml data

2014-09-10 10:14:56
About:

others are impossible (e.g. sorting).

I beg to differ. At XML London 2014, I demonstrated a two-pass approach to a
streamable sort: slice the input first, then combine the slices with
xsl:merge. Indeed, it is impossible to do it in one pass with standard
features alone. But in terms of "possible" or "impossible", you can do it
with just XSLT if you are willing to use two passes (which I believe is not
much for such a complex task).
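
To make the slice-then-merge idea concrete, here is a minimal sketch in Python rather than XSLT. It is not the stylesheet from the talk. It assumes records are single-line strings without embedded newlines and that they sort by plain string comparison. The function names and the slice size are hypothetical. The second pass plays the role that xsl:merge plays in the XSLT version: it combines inputs that are already sorted while holding only one record per slice in memory.

```python
import heapq
import os
import tempfile
from itertools import islice


def write_sorted_slices(records, slice_size, directory):
    """Pass 1: cut the input stream into slices small enough to sort
    in memory, sort each slice, and write it to its own file."""
    paths = []
    iterator = iter(records)
    while True:
        chunk = sorted(islice(iterator, slice_size))
        if not chunk:
            break
        path = os.path.join(directory, f"slice-{len(paths)}.txt")
        with open(path, "w", encoding="utf-8") as out:
            out.writelines(record + "\n" for record in chunk)
        paths.append(path)
    return paths


def merge_slices(paths):
    """Pass 2: merge the pre-sorted slice files lazily."""
    files = [open(path, encoding="utf-8") for path in paths]
    try:
        # Strip the newline before comparing so the merge order matches
        # the order used when each slice was sorted.
        streams = [(line.rstrip("\n") for line in f) for f in files]
        yield from heapq.merge(*streams)
    finally:
        for f in files:
            f.close()


def two_pass_sort(records, slice_size):
    """Sort an arbitrarily long stream of records in two passes."""
    with tempfile.TemporaryDirectory() as directory:
        paths = write_sorted_slices(records, slice_size, directory)
        yield from merge_slices(paths)
```

Memory use is bounded by the slice size in the first pass and by the number of slices in the second, which is what makes the approach workable for inputs far larger than the heap.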

And if the number of required slices can be determined statically, the slices
can be selected (e.g., all lines starting with "a" or "b"), and they are
manageable in memory, you could conjure up a solution with xsl:fork and do it
in one pass. That assumes your processor supports xsl:fork the way it is
intended; the spec does not require a one-pass approach there, and I doubt it
is even possible in all streaming scenarios.
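
The one-pass variant can be sketched the same way, again in Python and again only as an illustration of the idea, not of how any XSLT processor implements xsl:fork. The assumptions are that the slices are selected by a record's first character, that the full set of first characters is known up front, and that all slices fit in memory together. The function name and the `bucket_keys` parameter are hypothetical.

```python
def one_pass_bucket_sort(records, bucket_keys):
    """Read the input once, routing each record to a statically
    declared slice, then emit the slices in key order."""
    # One in-memory list per declared slice, loosely analogous to one
    # prong per slice in an xsl:fork.
    buckets = {key: [] for key in bucket_keys}
    for record in records:  # the single pass over the input
        # Raises KeyError if a record belongs to an undeclared slice.
        buckets[record[:1]].append(record)
    for key in sorted(buckets):
        # Each slice is sorted on its own; because the slices partition
        # the input by first character, concatenating them in key order
        # gives a fully sorted result.
        yield from sorted(buckets[key])
```

For example, `list(one_pass_bucket_sort(["bob", "alice", "adam", "bea"], "ab"))` yields the four names in sorted order after reading the input only once.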

Cheers,

Abel Braaksma
Exselt XSLT 3.0 streaming processor
http://exselt.net

-----Original Message-----
From: Michael Kay mike(_at_)saxonica(_dot_)com [mailto:xsl-list-
service(_at_)lists(_dot_)mulberrytech(_dot_)com]
Sent: Wednesday, September 10, 2014 10:12 AM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] Approach to transform 250GB xml data

It is not practical to transform this using XSLT except by use of a streaming
XSLT processor such as Saxon-EE, and even then it depends on the detailed
nature of the transformation to be performed. Some transformations are
readily streamed (e.g. renaming all the elements), others are impossible (e.g.
sorting). Tell us more about what the transformation is doing.

Michael Kay
Saxonica
mike(_at_)saxonica(_dot_)com
+44 (0) 118 946 5893




On 10 Sep 2014, at 08:36, Vishnu vishnu(_at_)innodata(_dot_)com <xsl-list-
service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Hi,

I have approx. 250GB of XML data and I want to transform it using XSLT 2.0.
What would be the best approach to transform this database?

I tried it with Ant, but it gave me a Java heap space error message.

Please suggest.

Thanks!

Vishnu Singh
"This e-mail and any attachments transmitted with it are for the sole use of
the intended recipient(s) and may contain confidential , proprietary or
privileged information. If you are not the intended recipient, please contact
the sender by reply e-mail and destroy all copies of the original message. Any
unauthorized review, use, disclosure, dissemination, forwarding, printing or
copying of this e-mail or any action taken in reliance on this e-mail is
strictly prohibited and may be unlawful."

