The str:tokenize() function defined in EXSLT constructs a tree containing
one element for each token. Unless the implementation is clever enough to
construct a virtual or lazy tree, this is going to take a fair bit of
memory.
By contrast, the XPath 2.0 tokenize() function returns a sequence of
strings, and it's a reasonable bet that any decent implementation is going
to be pipelined, so that it reads off the tokens one at a time as they are
needed.
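As a minimal sketch of the XPath 2.0 approach (not tested against the poster's data), the template from the question might be rewritten as below. It assumes an XSLT 2.0 processor and that the tokens are separated only by whitespace, as in the sample input. The use of normalize-space() with a single-space separator is an illustrative choice, not something taken from the original stylesheet.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- XPath 2.0 tokenize() returns a sequence of strings rather than
       a tree of element nodes, so no str: extension namespace is needed. -->
  <xsl:template match="textdata">
    <data>
      <!-- Assumption: tokens are whitespace-separated. normalize-space()
           trims the ends and collapses runs of whitespace to one space,
           so a single space can serve as the separator pattern. -->
      <xsl:for-each select="tokenize(normalize-space(.), ' ')">
        <e>
          <xsl:value-of select="."/>
        </e>
      </xsl:for-each>
    </data>
  </xsl:template>

</xsl:stylesheet>
```

Whether the tokens are actually produced one at a time depends on the processor. In either case the full text of each textdata element, and the copy made by normalize-space(), still has to be held as a string.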
Michael Kay
http://www.saxonica.com/
-----Original Message-----
From: Richard Zhang [mailto:richard_zhang(_at_)anabus(_dot_)com]
Sent: 10 January 2006 14:30
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Memory problem when stokenize big data
Thanks for your reply to my prior question about breaking
down strings.
Now I am trying to use str:tokenize to break down a large amount of data.
The input data looks like this:
<textdata sep=" 

">
5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9
...
...
</textdata>
...
...
<textdata sep=" 

">
5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9
...
...
</textdata>
and my XSL template looks like this:
<xsl:template match="textdata">
  <data>
    <xsl:for-each select="str:tokenize(.,' 

')">
      <e>
        <xsl:value-of select="."/>
      </e>
    </xsl:for-each>
  </data>
</xsl:template>
The textdata can be very big. My question is: will str:tokenize have
problems handling big data? If so, how much data can it handle? I ran
the transformation in JBuilder and it reported a '10mb help left'
problem.
Thanks a lot.
Richard
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--