Re: [xsl] Template matching preceding-sibling.

2007-11-20 20:51:43
Wendell,

In line with what you say -- experience would suggest that the best way of approaching communications on the list (or any email discussion list for that matter) is to regard each message as being much like an XSLT template. Each message has a context, yes, but what it can assume from that context is carefully defined, and does not include what called it -- the control flow, if you like -- which is to say, whatever earlier messages might or might not have provoked it.

It's as if "XSLT" as a problem domain were a huge shaggy document coming to us in a big "push", all our messages were templates trying to process bits of it, and the output is the same as the input only transformed from problem statements into solutions.

As you know, XSLT (the technology, not the analogy) is side-effect-free and "functional" as opposed to procedural. This means that nothing can be assumed about processing context except what is given (the context node, variable bindings in scope, etc.). This is why (stepping over to the analogy) echoing back selected bits of earlier messages is so useful, even when it is repetitive. It means that a new process (a.k.a. "reader" or "contributor") can operate on the problem without having to run the entire thing back from the beginning. Repeating relevant bits of earlier messages is a way of parameterizing the thread so that each message to the list can be properly and cleanly encapsulated.

This makes the entire operation much more robust, as well as better supporting an arbitrary number of processors (readers) each doing what it does best, in a kind of "just in time" architecture.

More practically, this suggests both why recapitulating and repeating relevant bits of information is good etiquette, and why recapitulating and repeating irrelevant bits is bad form. Similarly, cutting down examples to the minimum is really helpful since it optimizes the processing of each message. And posts which follow all the rules of good form have significantly faster throughput, there being less noise that has to be ignored (and ignoring noise takes work).

Likewise, saying "see the earlier message in the thread please" is an architectural violation, however politely requested, inasmuch as it asks to depend on a side-effect. Do you really want to exclude any reader who doesn't remember the earlier message or never read it, or doesn't have the time or patience to go back? That will be a significant number of us. Of course, we're human beings, so we can adapt. But one shouldn't expect the same performance out of the collective brain (the Aggregated Wetware Processor) when it has to scramble to adapt as when it is able to run at its best.

Now that's what I call a deep insight into list philosophy! :)

Having read and followed the basic list posting guidelines, I thought that I was doing it all right; now I see how wrong I was. I believe your witty analogy between the performance of the list and that of XSLT processors will keep me from repeating my mistakes in future postings here. Thank you very much!

Now, as to the question itself: I admit I've tried more than once to determine what you're asking for, and I'm still somewhat confused. I think the reason for this is that it's not clear whether the issue remaining (if there is one! :-) regards the XPath expression, or whether XPath processors are working properly (getting the same results for the same input), or other issues, for example regarding the fit between possible solutions and actual problems.

When I see

match="Rec[activity != preceding-sibling::Rec/activity
           or not(preceding-sibling::Rec)]"

I understand the pattern (my brain has an XPath processor in it :-) to match any 'Rec' element that either has an 'activity' child not equal to the 'activity' child of a preceding-sibling 'Rec', or has no preceding sibling 'Rec' to look at.

So given this input:

<Rec>
  <activity>swimming</activity>
</Rec>
<Rec>
  <activity>biking</activity>
</Rec>
<Rec>
  <activity>boating</activity>
</Rec>
<Rec>
  <activity>swimming</activity>
</Rec>

The first Rec matches the pattern, since it has no preceding sibling Rec (so it passes the second test).

Rec[4] also matches the pattern (by passing the first test), as it has preceding siblings whose 'activity' value is not equal to its own ('swimming' is not equal to 'biking' and not equal to 'boating'). Similarly, Rec[3] also matches the pattern, for the same reason ('boating' is not equal to 'biking' or 'swimming').

The only one that does not match is Rec[2] -- it fails the first test, not having a preceding sibling with a different value. And it fails the second, since it has a preceding sibling.

Like other readers of the thread, I'm somewhat mystified as to why one would want this behavior. One can easily envision a requirement to leave Rec[4] out, but not one to leave Rec[2] out but include Rec[4].

Moreover, although I think you've tried to reassure me :-), I'm not convinced whether you actually want this behavior, or actually want the more normal case (include Rec[2] and exclude Rec[4]), except you just haven't come across a test case which introduces the problem with it. Or perhaps, like many contributors, you're interested primarily in the academic question (which is fine).

Actually, you are right about the academic question. This XPath, along with the whole XSLT template and input, was not written by me; I was just testing it (mainly because I spotted this tricky != operator and didn't immediately understand how it works; other members' responses, and especially your detailed case-by-case analysis, revealed its peculiarity to me perfectly, thank you). And yes, as can be deduced from the original author's post, the desired behaviour is to include Rec[2] and exclude Rec[4]:

<xsl:template match="Rec[activity != preceding-sibling::Rec/activity
or not(preceding-sibling::Rec)]">
 Ello xslers.
</xsl:template>

<xml>
   <Rec>
     <activity>hi</activity>
   </Rec>
   <Rec>
      <activity>hi</activity>
   </Rec>
</xml>

----

Above should only print anything once,...

Now, the only thing I have to say about your example is: shouldn't Rec[2] match as well, since it HAS a preceding sibling with an activity value different from its own, i.e. Rec[1] (since swimming != biking)? The output of the template below applied to your input suggests I'm right:

TEMPLATE:
<?xml version="1.0"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:template match="Rec[activity != preceding-sibling::Rec/activity
           or not(preceding-sibling::Rec)]">
 <xsl:value-of select="activity"/>
</xsl:template>

</xsl:stylesheet>
------------

OUTPUT:
<?xml version="1.0" encoding="UTF-8"?>
swimming
biking
boating
swimming
-------------
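
The output above follows from the existential meaning of "!=" when one operand is a node-set: the comparison is true as soon as ANY preceding activity differs. A minimal sketch of that rule in Python, not an XPath engine: the <xml> wrapper element and the function name are assumptions made for illustration, and each Rec is assumed to have exactly one activity child.

```python
import xml.etree.ElementTree as ET

# The four-record sample from this thread, wrapped in a hypothetical
# <xml> root element so that it parses as a single document.
DOC = """<xml>
  <Rec><activity>swimming</activity></Rec>
  <Rec><activity>biking</activity></Rec>
  <Rec><activity>boating</activity></Rec>
  <Rec><activity>swimming</activity></Rec>
</xml>"""

def matches(recs, i):
    """Emulate: activity != preceding-sibling::Rec/activity
                or not(preceding-sibling::Rec)"""
    own = recs[i].findtext("activity")
    earlier = [r.findtext("activity") for r in recs[:i]]
    # "!=" against a node-set is true if at least one earlier value differs.
    return any(v != own for v in earlier) or not earlier

recs = ET.fromstring(DOC).findall("Rec")
matched = [r.findtext("activity") for i, r in enumerate(recs) if matches(recs, i)]
print(matched)  # ['swimming', 'biking', 'boating', 'swimming']
```

Under these assumptions every Rec passes, including Rec[2] (one differing predecessor is enough) and Rec[4] (its earlier 'swimming' does not cancel out 'biking' and 'boating').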



Back to you :)

Which goes back to the usefulness of clarifying the issue by producing small examples. We've snipped bits of XPath and discussed their differences in natural language. But we haven't actually looked (as far as I know :-) at a test case that would dramatize those differences and show why one would prefer one to the other.
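
One way to build such a test case is to run both readings over both inputs from this thread. The sketch below is a Python emulation, not an XPath processor; it assumes the "more normal case" would be written not(activity = preceding-sibling::Rec/activity), which is a common idiom but is not taken from the original post, and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def activities(xml_text):
    """Return the activity value of each Rec, in document order."""
    return [r.findtext("activity") for r in ET.fromstring(xml_text).findall("Rec")]

def original_pattern(values, i):
    # activity != preceding-sibling::Rec/activity or not(preceding-sibling::Rec)
    earlier = values[:i]
    return any(v != values[i] for v in earlier) or not earlier

def first_occurrence(values, i):
    # Assumed alternative: not(activity = preceding-sibling::Rec/activity)
    return not any(v == values[i] for v in values[:i])

def selected(pattern, values):
    """1-based positions of the Rec elements the pattern would match."""
    return [i + 1 for i in range(len(values)) if pattern(values, i)]

def wrap(names):
    # Hypothetical helper: build an <xml> document from a list of activity names.
    return "<xml>" + "".join(f"<Rec><activity>{n}</activity></Rec>" for n in names) + "</xml>"

two = activities(wrap(["hi", "hi"]))
four = activities(wrap(["swimming", "biking", "boating", "swimming"]))

print(selected(original_pattern, two), selected(first_occurrence, two))    # [1] [1]
print(selected(original_pattern, four), selected(first_occurrence, four))  # [1, 2, 3, 4] [1, 2, 3]
```

On the two-record input the two predicates cannot be told apart; only an input with at least one different value between repeats, such as the four-record sample, shows the difference.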

We haven't, however, found any cases where processors behave differently given the same input, which also seems to have been a concern.

Well, for me this case is simple - I got this input-output pair:

INPUT
<xml>
  <Rec>
    <activity>hi</activity>
  </Rec>
  <Rec>
     <activity>hi</activity>
  </Rec>
</xml>

OUTPUT
 Ello xslers.

     hi

TEMPLATE
<xsl:template match="Rec[activity != preceding-sibling::Rec/activity
or not(preceding-sibling::Rec)]">
 Ello xslers.
</xsl:template>

The original post's author, for the same input, got the following (he didn't provide the exact output, just a text description of it):
Above should only print anything once, but I'm getting it each time.

I hope I have now made the difference obvious: I got one "Ello xslers" string and the post's author got two of them, for the same input processed with one and the same template. That is exactly why I made the request to the list. Now that it's clear that the XPath and the whole template work fine and my processor is giving the correct result, I think this case can be closed. To me, the only open question is how the original post's author managed to get two strings :). As I said, I believe he just posted an untested code sample which he derived from his more general and complicated input and output, where the problem actually did exist.

Cheers,
Wendell

Thank you, Wendell, for following the thread, your time and efforts, I appreciate this very much!

Regards,
Ilya

PS: If I have once more failed to make my point clear, then I just don't know how to express it any other way :) I understand that this is down to my lack of English and explanatory ability, and by no means to any lack of comprehension on the members' part.



MY VERY FIRST POST - MIGHT BE USEFUL
Hi, list!
I'm asking for clarification of a topic that was on the list several days ago (Nov 8). Steve <subsume(_at_)gmail(_dot_)com> wrote:

I'm missing something fundamental, what is it?

<xsl:template match="Rec[activity != preceding-sibling::Rec/activity
or not(preceding-sibling::Rec)]">
 Ello xslers.
</xsl:template>

<xml>
   <Rec>
     <activity>hi</activity>
   </Rec>
   <Rec>
      <activity>hi</activity>
   </Rec>
</xml>

----

Above should only print anything once, but I'm getting it each time.
What am I not getting?

Charles Knell and Scott Trenda gave advice and made observations, thus confirming that the problem exists. But when I copied the input data and the template given by Steve and ran them, I obtained exactly the result he asked for in the first, topic-forming message, if I understood his request correctly. I used MSXML (not sure about the version, but I don't think it's important in this simple case) and Saxon 9; for the latter I also tried changing the stylesheet version to 2.0, and the result stayed the same. Data follows:

-----------
input.xml:
<?xml version="1.0"?>

<xml>
  <Rec>
    <activity>hi</activity>
  </Rec>
  <Rec>
     <activity>hi</activity>
  </Rec>
</xml>

-----------
input.xsl:
<?xml version="1.0"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match="Rec[activity != preceding-sibling::Rec/activity
or not(preceding-sibling::Rec)]">
 Ello xslers.
</xsl:template>

</xsl:stylesheet>

-----------
output.xml:
<?xml version="1.0" encoding="UTF-8"?>

 Ello xslers.

     hi

-----------

As can clearly be seen, the string "Ello xslers" appears only once and not twice as Steve wrote ("hi" comes from the default (built-in) template rules, as I understand it).
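
To illustrate where the "hi" comes from, here is a rough Python sketch of how this one-template stylesheet plays out against the built-in rules (elements without a matching template have their children processed, and text nodes are copied). It is an approximation for illustration only: function names are hypothetical, whitespace is left out of the input, and only the features this example needs are emulated.

```python
import xml.etree.ElementTree as ET

# Steve's input, with indentation removed to keep whitespace out of the picture.
DOC = "<xml><Rec><activity>hi</activity></Rec><Rec><activity>hi</activity></Rec></xml>"

def template_matches(rec, earlier_recs):
    # activity != preceding-sibling::Rec/activity or not(preceding-sibling::Rec)
    own = rec.findtext("activity")
    earlier = [r.findtext("activity") for r in earlier_recs]
    return any(v != own for v in earlier) or not earlier

def apply_templates(elem, earlier_recs, out):
    if elem.tag == "Rec" and template_matches(elem, earlier_recs):
        out.append("Ello xslers.")  # the user's template; it does not descend further
        return
    # Built-in rules: copy text nodes and process child elements in document order.
    if elem.text:
        out.append(elem.text)
    seen_recs = []
    for child in elem:
        apply_templates(child, list(seen_recs), out)
        if child.tag == "Rec":
            seen_recs.append(child)
        if child.tail:
            out.append(child.tail)

output = []
apply_templates(ET.fromstring(DOC), [], output)
print(output)  # ['Ello xslers.', 'hi']
```

The first Rec is consumed by the template; the second does not match, so the built-in rules walk down to its text node and emit "hi".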

Did I miss or misunderstand something very basic? Can someone please reveal the trick to me? (I'm especially interested in Charles' opinion, as he was involved most intensively in the discussion.)

Many thanks in advance,
Ilya

Cheers,
Ilya

