Re: [xsl] Does the count() function require access to the whole subtree?

2014-01-16 04:39:39
In explaining why count(//x) is streamable whereas data(//x) is not, David 
Carlisle wrote:

        count() just returns a single value and the 
        system could conceptually go through the 
        document once counting every time it sees x.

        data(//x) is the same as //x/data(.) and returns 
        the data of each element in the sequence. Now 
        the first element in the sequence is the outer x. 
        To work out its data the system has to process 
        the _full_ content of that element (which actually 
        is the whole document). Then the next element 
        of the sequence is the inner x. Oops, we passed 
        that already, so you need to back up to get that.
        This is where the concern about "overlapping trees"
        comes from.
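
The difference can be sketched with an event-based (SAX) parser in Python. This is only an illustration of the one-pass argument, not how an XSLT processor is implemented. The sample document DOC and the names CountX, DataX, count_x and data_x are assumptions made up for this sketch.

```python
import xml.sax

# Assumed sample: one <x> nested inside another <x>.
DOC = b"<Document><x><x>A</x>B</x></Document>"


class CountX(xml.sax.ContentHandler):
    """Counts <x> start tags; the only state kept is a single integer."""

    def __init__(self):
        super().__init__()
        self.total = 0

    def startElement(self, name, attrs):
        if name == "x":
            self.total += 1


class DataX(xml.sax.ContentHandler):
    """Collects the string value of every <x>, in document order."""

    def __init__(self):
        super().__init__()
        self.values = []        # one slot per <x>, filled at its end tag
        self.open_buffers = []  # (slot, text chunks) for each <x> still open
        self.max_open = 0       # most buffers that were alive at once

    def startElement(self, name, attrs):
        if name == "x":
            self.values.append(None)
            self.open_buffers.append((len(self.values) - 1, []))
            self.max_open = max(self.max_open, len(self.open_buffers))

    def characters(self, content):
        # Text belongs to every <x> that is currently open, so each
        # overlapping item needs its own copy.
        for _, chunks in self.open_buffers:
            chunks.append(content)

    def endElement(self, name):
        if name == "x":
            slot, chunks = self.open_buffers.pop()
            self.values[slot] = "".join(chunks)


def count_x(xml_bytes):
    handler = CountX()
    xml.sax.parseString(xml_bytes, handler)
    return handler.total


def data_x(xml_bytes):
    handler = DataX()
    xml.sax.parseString(xml_bytes, handler)
    return handler.values, handler.max_open
```

For DOC, count_x returns 2 while holding only a counter. data_x returns the values "AB" and "A", but the value of the outer x is complete only at its end tag, after the inner x has already gone by, so two buffers are alive at the same time. That buffering is the "backing up" described above.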

Thanks David, that is an outstanding explanation.

Question: So, what is the general principle at work here? 

I'll take a stab at answering that question: 

        Let's take as reference this XML:

        <Document>
            <x>
                <x>A</x>
                B
            </x>
        </Document>

        Consider an XPath expression that yields a 
        sequence of <x> elements. Now consider an
        operation on that sequence. Is the operation
        on the sequence streamable or not?

        For example, is the operation count() on the
        sequence generated by the XPath expression
        //x streamable or not? Is the operation data()
        on the sequence generated by //x streamable
        or not?

        We must consider two cases:

        Case 1:         At least one <x> in the sequence is
                nested inside another <x> in the sequence.
                (The items overlap.)

                If the operation can be performed just
                by noting each item as it is encountered,
                without reading its content (as count()
                does), then the operation on the sequence
                is streamable.

                If the operation requires reading the
                content of each item (as data() does),
                then the operation on the sequence is
                not streamable.

        Case 2:         No <x> in the sequence is nested inside
                another <x> in the sequence.
                (The items in the sequence are disjoint.)

                The operation on the sequence is streamable,
                regardless of whether the operation just
                inspects each item or goes inside each item.
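
The two cases can be written down as a small Python sketch. It encodes only the rule as proposed in this message and checks one particular document instance; it makes no claim about how an XSLT processor actually decides streamability. The names NestingCheck, items_overlap and classify, and the needs_content flag, are hypothetical.

```python
import xml.sax


class NestingCheck(xml.sax.ContentHandler):
    """Detects whether any element with the given name is nested in another."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.depth = 0       # how many elements with that name are open now
        self.nested = False

    def startElement(self, name, attrs):
        if name == self.name:
            self.depth += 1
            if self.depth > 1:
                self.nested = True

    def endElement(self, name):
        if name == self.name:
            self.depth -= 1


def items_overlap(xml_bytes, name="x"):
    """True for Case 1 (some item nested in another), False for Case 2."""
    handler = NestingCheck(name)
    xml.sax.parseString(xml_bytes, handler)
    return handler.nested


def classify(xml_bytes, needs_content, name="x"):
    """Apply the proposed rule.

    needs_content is True when the operation must read inside each item
    (as data() does) and False when it only notes each item (as count() does).
    """
    if items_overlap(xml_bytes, name) and needs_content:
        return "not streamable"
    return "streamable"
```

With the reference XML above, classify(doc, needs_content=True) gives "not streamable" and classify(doc, needs_content=False) gives "streamable". For a document whose x elements are siblings rather than nested, both calls give "streamable".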

Is that correct?

Is it complete? Are there any cases that it misses?

Can you express it more simply and more clearly?

/Roger

