xsl-list

Re: [xsl] XSLT 3.0 streaming vs other big-data technologies

2018-06-13 02:30:08
There's nothing in the language spec that constrains where the data comes from.

In the Saxon (Java) implementation, it can come from any Java InputStream.
Constructing an InputStream that reads from multiple storage nodes or an HDFS 
file system is someone else's job, but I see no reason why it should be 
difficult.
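
As a rough sketch of that idea, using only JDK classes: several chunk streams are joined into one logical InputStream, which is then read by a pull parser so the document is never held in memory. The class name, the concatenate and countElements helpers, and the in-memory chunks are all hypothetical stand-ins; in a real deployment the chunks would come from whatever the storage layer provides (Hadoop's FileSystem.open, for instance, returns an InputStream subclass), and the joined stream would be wrapped in a javax.xml.transform.stream.StreamSource and handed to the XSLT processor rather than to StAX.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class ChunkedXmlInput {

    // Hypothetical helper: joins per-node chunks, in order, into one logical stream.
    static InputStream concatenate(List<InputStream> chunks) {
        return new SequenceInputStream(Collections.enumeration(chunks));
    }

    // Counts start tags with the given local name using a pull parser,
    // reading the input sequentially without building a tree.
    static long countElements(InputStream in, String localName) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        long count = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && reader.getLocalName().equals(localName)) {
                count++;
            }
        }
        reader.close();
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: three in-memory pieces stand in for blocks on three storage
        // nodes. They are split mid-tag on purpose; the parser never sees the seams.
        String[] parts = {
            "<orders><order id='1'/><ord", "er id='2'/><order id=", "'3'/></orders>"
        };
        List<InputStream> chunks = new ArrayList<>();
        for (String part : parts) {
            chunks.add(new ByteArrayInputStream(part.getBytes(StandardCharsets.UTF_8)));
        }
        System.out.println(countElements(concatenate(chunks), "order")); // prints 3
    }
}
```

The consumer only ever sees a single InputStream, so where the bytes physically live is invisible to it.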

The Saxon implementation does have some limits that mean the input stream can't 
be infinite: most obviously, the nodes are numbered using a 32-bit integer. 
That one is easily fixed, but it's hard to verify that there aren't others.
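
To make the 32-bit ceiling concrete, here is a small illustration of what happens to a Java int counter at its limit. It says nothing about how Saxon allocates node numbers internally; the class and the nextNodeNumber method are hypothetical and only demonstrate plain int arithmetic.

```java
public class NodeCounterLimit {

    // Hypothetical counter step: returns the next number as an int and
    // fails loudly on overflow instead of silently wrapping.
    static int nextNodeNumber(int current) {
        return Math.incrementExact(current); // throws ArithmeticException on overflow
    }

    public static void main(String[] args) {
        int last = Integer.MAX_VALUE;         // 2^31 - 1 = 2147483647

        System.out.println(last + 1);         // unchecked int arithmetic wraps: -2147483648
        System.out.println((long) last + 1);  // widening to long keeps counting: 2147483648

        try {
            nextNodeNumber(last);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

Widening a single counter to long is the easy part; the harder part is finding every other place where an int is quietly assumed.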

(More generally, I've been surprised that I've seen very little discussion 
about how Java and C# cope with the 32-bit limit, e.g. on array indexing. The 
Streams API seems part of the solution, but it certainly doesn't solve the 
whole problem.)

Michael Kay
Saxonica

On 13 Jun 2018, at 07:03, Mukul Gandhi 
gandhi(_dot_)mukul(_at_)gmail(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Hi all,
   Most of us probably know big-data technologies such as Hadoop and HDFS.
With HDFS, I think a file can span multiple storage nodes (which potentially
allows really big data as input to run-time processes).
Can XSLT 3.0 streaming also accept big data of this kind as input to an XSLT
transform (i.e. the input XML/text document spanning multiple storage nodes)?
Or, by design, can XSLT 3.0 only take a big XML input file that is stored
entirely on one storage node?
Can we also say that XSLT 3.0 can work over the HDFS file system, to allow
big data spanning multiple storage nodes?



-- 
Regards,
Mukul Gandhi
XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>