Re: [xsl] What is Micro Pipelining: an attempt for a definition (was: Calculating cumulative values - Abel's solution)
2007-08-31 16:43:45
Hi Wendell,
Very valuable feedback. However, though I think I grasp your points,
after reading I still find a lot of "micro-pipelining" to be in a gray
area. Let me try to explain myself.
For me, micro-pipelining is defined as the process of pipelining a set
of data during a single pass through an XSLT processor. Normal pipelining
would be to feed the results of one pass into the next pass. I
understand that this doesn't tally with your definition below.
Another attempt at a definition (still my own, original, definition)
would be the following: when you apply templates (or perhaps for-each)
to a set of data that is not part of the input data (input data being
any external source, including document() / collection() etc.), you have
a micro pipeline. Shorter: when you have a variable that contains data
that is already processed and you process that variable again, it is a
micro pipeline.
Now I do understand that this definition is a bit too wide. But now
on to yours. You explain that for anything to be a micro pipeline it
must be applied (or expected to be applied) more than once. Put more
precisely: when you can extract the pipeline variable into a global
variable, it is not a micro pipeline anymore (is it then a macro pipeline?).
But things are still not so trivial. It is tempting to say that the
following is only executed once, because there's only one root in a
document, and as a result, it is easy to extract the variable to the
global level:
  <xsl:template match="/">
    <xsl:variable name="micro">
      <xsl:apply-templates select="root/some/data" />
    </xsl:variable>
    <xsl:apply-templates select="$micro/*" />
  </xsl:template>
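To make the "extract to the global level" point concrete, here is a minimal sketch of the same two-step processing with $micro hoisted out of the template. The modes phase1 and phase2 and the item, out and result elements are assumptions added purely for illustration; the snippet above leaves the matching templates out.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Global variable: evaluated once, against the source document -->
  <xsl:variable name="micro">
    <xsl:apply-templates select="/root/some/data" mode="phase1" />
  </xsl:variable>

  <!-- Second step: process the temporary tree held in $micro -->
  <xsl:template match="/">
    <result>
      <xsl:apply-templates select="$micro/*" mode="phase2" />
    </result>
  </xsl:template>

  <!-- Hypothetical first step: wrap each data element in an item -->
  <xsl:template match="data" mode="phase1">
    <item><xsl:value-of select="." /></item>
  </xsl:template>

  <!-- Hypothetical second step: upper-case each item -->
  <xsl:template match="item" mode="phase2">
    <out><xsl:value-of select="upper-case(.)" /></out>
  </xsl:template>

</xsl:stylesheet>
```

With a single source document this should give the same result as binding $micro inside the match="/" template, which is exactly why the local form is so easy to hoist.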
But often, simple examples, or inquiries on this list, are part of a
larger stylesheet or solution. Suppose I alter the above example to be
as follows:
  <xsl:param name="urls" select=" 'one.xml', 'two.xml', 'three.xml' " />

  <xsl:template name="main">
    <!-- the parameter is called $urls; each document is matched by "/" below -->
    <xsl:apply-templates select="document($urls)" />
  </xsl:template>

  <xsl:template match="/">
    <xsl:variable name="micro">
      <xsl:apply-templates select="root/some/data" />
    </xsl:variable>
    <xsl:apply-templates select="$micro/*" />
  </xsl:template>
Is it now still not a micro pipeline? It is applied several times (three
times) and it is no longer possible to make the variable global. Is it
really necessary to restrict the term micro pipeline to cases where it
is applied at a local level? Though I can follow your point that this is
closer to a "large pipeline" than to a "micro pipeline".
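For comparison, here is a sketch of what the narrower reading seems to have in mind: the variable is bound in a template that is expected to fire more than once within one document, so it cannot be hoisted to the global level. As before, the modes phase1 and phase2 and the group, item, out and result elements are assumptions for illustration only.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <result>
      <xsl:apply-templates select="root/some" />
    </result>
  </xsl:template>

  <!-- Fires once per "some" element: $micro is rebuilt for each branch -->
  <xsl:template match="some">
    <xsl:variable name="micro">
      <xsl:apply-templates select="data" mode="phase1" />
    </xsl:variable>
    <group>
      <xsl:apply-templates select="$micro/*" mode="phase2" />
    </group>
  </xsl:template>

  <!-- Hypothetical first step: normalise each data element into an item -->
  <xsl:template match="data" mode="phase1">
    <item><xsl:value-of select="normalize-space(.)" /></item>
  </xsl:template>

  <!-- Hypothetical second step: number the items within their own branch -->
  <xsl:template match="item" mode="phase2">
    <out position="{position()}"><xsl:value-of select="." /></out>
  </xsl:template>

</xsl:stylesheet>
```

Here the temporary tree only ever holds the data of one branch at a time, which is the property that separates it from the document-level examples above.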
Now let's try the opposite extreme. To define a term, you need to know
where its boundaries are. Suppose we have the following fictional
stylesheet; would the application of $micro be a micro pipeline?
  <xsl:variable name="micro">
    this must be tokenized
  </xsl:variable>

  <xsl:template name="main">
    <xsl:apply-templates select="my:tokenize($micro)" />
  </xsl:template>

  <xsl:template match="my:token">
    <xsl:copy-of select="." />
  </xsl:template>

  <xsl:function name="my:tokenize">
    <xsl:param name="tokens" />
    <xsl:variable name="preproc-tokens">
      <!-- normalize-space() avoids empty tokens from surrounding whitespace -->
      <xsl:for-each select="tokenize(normalize-space($tokens), ' ')">
        <my:token value="{.}" />
      </xsl:for-each>
    </xsl:variable>
    <xsl:apply-templates select="$preproc-tokens/*" mode="my:tokenize" />
  </xsl:function>

  <xsl:template match="my:token" mode="my:tokenize">
    <xsl:copy>
      <!-- strip everything that is not a letter from the token value -->
      <xsl:sequence select="replace(@value, '[^A-Za-z]', '')" />
    </xsl:copy>
  </xsl:template>
In this tokenize example we see two things. We see a global variable
holding data. This is processed using my:tokenize() and then the results
are re-applied. Even though we are talking about global data (holding a
document node with text), I would consider both phases a micro pipeline:
the apply-templates in the my:tokenize function starts a micro pipeline,
and the apply-templates in the main entry template starts a micro
pipeline. Or not?
I don't know the answers. I have seen the term micro pipeline come up
every now and then without a lot of explanation. I did a quick check on
the internet a couple of times, but a clear definition seems hard to
find. Even the XML Pipeline languages (still in Draft) at W3C do not
mention the distinction. Wikipedia has a small but effective line on the
subject though: "Some standards also categorize transformation as macro
(changes impacting an entire file) or micro (impacting only an element
or attribute)" (http://en.wikipedia.org/wiki/XML_pipeline).
But this simple-enough definition does not help XSLT: we have root
nodes, document nodes, files, non-xml data, result tree fragments,
sequences.... When is it micro and when is it macro? I do think your
definition comes quite close, but it has some rough edges. Would you
(and others) give it a try? (the definition, I mean)?
Cheers,
-- Abel Braaksma
Wendell Piez wrote:
Abel,
Thanks for the very nicely composed explanation for Simon. One of the
best things about this list is how explanations are provided that can
be followed by people who don't themselves have any need for a
particular solution, but who can still learn valuable lessons and
techniques from the sidelines. It's one of the best ways of learning.
I would, however, quarrel with one aspect of your description. You
have the code:
  <xsl:template match="/">
    <!-- mini pipeline: put points into a variable and process them -->
    <xsl:variable name="points">
      <xsl:apply-templates select="root/set[1]/point[1]" mode="aggregate">
        <xsl:with-param name="calc" tunnel="yes">
          <for x="1" y2="2" y3="2"/>
        </xsl:with-param>
      </xsl:apply-templates>
    </xsl:variable>
    <!-- apply set with pre-processed points -->
    <xsl:apply-templates select="root/set">
      <xsl:with-param name="points" select="$points" />
    </xsl:apply-templates>
  </xsl:template>
... which you refer to as using a "micro-pipeline".
As I explained earlier this week, I don't believe this is a
micro-pipeline, since it operates globally. The declaration of $points
could appear outside the template and it would perform the same way. A
micro-pipeline would be if you bound a variable in a template you
expected to fire more than once, and then processed the results of
that variable. Admittedly there may be a grey zone between a pipeline
operating at the document (global) level, and one that operates at a
more local level; but I believe this falls fairly clearly into the
first category.
(Then too, as you also explain, you don't really run a pipeline here
at all, but use the results of pre-processing as a lookup table.)
The reason I stress this is because I'm afraid that if we start
calling anything a micro-pipeline just because it involves some
matching and applying of templates whose results won't appear in the
output, then we'll have to invent a new word for what we actually
invented the term for.
The more general technique, I'd say, is called "pipelining", or -- if
the results are themselves not processed directly (this includes
generating lookup tables) "pre-processing". Another interesting thing
to reflect on is that pipelining and pre-processing can be achieved in
XSLT 1.0 by passing the results of one stylesheet into another as its
source. This really isn't practical with micro-pipelining, which
happens only within the scope of a single branch of the tree at a time.
I apologize if this comes across as rude. Another valuable thing we do
on this list is guard one another's terminology. ("Don't call them
tags", etc.) This keeps the language strong because we have
unambiguous terms to refer to things, methods and techniques, which
can then be discussed and learned about.
Cheers,
Wendell