Re: [xsl] Ordered union of sequences
2010-04-08 17:26:27
On 08.04.2010 17:50, Michael Müller-Hillebrand wrote:
Am 08.04.2010 um 16:09 schrieb Michael Kay:
It seems to me that you first want to create (or imagine) a graph: in your
example there are arcs k->o, o->p, p->c, a->b, b->c, etc.
Then you want to look for cycles in this graph. If any cycles exist, there
is no solution to your problem.
Thanks Michael, I guess I have to finally understand the graph stuff. I will
turn to the example in the 4th edition p.251 ff. for a starter.
Am 08.04.2010 um 16:47 schrieb Imsieke, Gerrit, le-tex:
[... cool, ready-made example ...]
Does that make sense?
If I include<o/> at the other position, i.e.,
<seq><k/><f/><z/><o/></seq>,
I receive "Too many nested function calls. May be due to infinite recursion."
as expected.
Gerrit, I am at early stages to understand the algorithm. The function counts
the maximum number of preceding siblings on any available path and uses this as
a sort key. This is some sort of creating a graph, backtracing to the
beginning... which is what Michael suggested, isn't it.
Yes. Firstly, the items (all elements below seq) are grouped by element
name.
The groups, i.e., the distinct element names, are sorted by the result
of the my:sortkey() function, applied to each group's first item.
my:sortkey() is calculated as follows:
Consider the group of all c elements. There are two c elements in total,
the first one being in the first sequence, the second one in the last
sequence.
my:sortkey() is initialized with the first c element (and with the empty
sequence as a second argument, discussed below).
Then the variable preceding-siblings is calculated. Maybe this variable
name isn't chosen wisely, since the variable doesn't contain the first
c's preceding siblings. It rather contains the respective first
preceding siblings for each seq/c element. That is, p of the first
sequence and b of the last sequence. For p and for b, my:sortkey() is
calculated, the maximum value of these results is then increased by 1
(since c comes after p or b, its sort key should be higher than p's or
b's sort keys).
For the calculation of p's sort key, o's sort key has to be calculated,
because o is p's predecessor. For b, a's sort key has to be calculated,
which is 1 because the only a element doesn't have a predecessor.
o's sort key is an interesting case: if o is contained at the end of the
3rd sequence, the recursion will calculate z's sort key, then in turn
f's, then in turn k's (=1) and c's, then in turn b's and p's, then in
turn o's and a's (=1). That's where the circle comes into play. The
recursion follows Michael Kay's graph in reverse direction (the nodes in
this graph are not the XML snippet elements, but the unique element
names "o", "p", "b", ...).
There's a path in the graph that origins from "o" and returns to "o",
which amounts to saying that the graph is circular.
In order to detect circularity, the second argument to my:sortkey(),
$seen, contains all element names that have been seen during recursion.
When calling the function the first time for c, $seen is empty. When
calling it for p and for b, $seen is ("c"). When calling the function
for o from p, $seen is ("c", "p"). Then for z it's ("c", "p", "o"), then
for f it's ("c", "p", "o", "z"), then for c it's ("c", "p", "o", "z",
"f") which will terminate with a message that c's position in the
unified sequence cannot be determined.
The fact that actual execution of the template stopped with the message
that o's position isn't determinate is purely by chance. Swap the first
and the second sequence and the template will complain that d's position
isn't determinate. This is understandable: if there are circles in the
graph, then every node that is on a circle has an indeterminate position
on a fictional linear sequence. Which one will actually be reported
depends on document order.
So maybe the information that there's a circle that contains o or d
isn't so helpuful for your users. So I enhanced the reporting:
<xsl:function name="my:sortkey" as="xs:integer">
<xsl:param name="input" as="element(*)" />
<xsl:param name="seen" as="xs:string*" />
<xsl:if test="name($input) = $seen">
<xsl:message terminate="yes">Element
<xsl:value-of select="name($input)"/>
doesn't seem to occur at a determinate
position in a sequence. The circle is
<xsl:value-of select="string-join((name($input), $seen), '-')"/>
</xsl:message>
</xsl:if>
<xsl:variable
name="preceding-siblings"
select="$input/../../seq/*[
name() = name($input)
]/preceding-sibling::*[1]"
as="element(*)*" />
<xsl:sequence select="(
max(
for $ps in $preceding-siblings
return my:sortkey($ps, (name($input), $seen))
) + 1,
1
)[1]"/>
</xsl:function>
Explaining this approach took three times as long as writing the code,
and re-reading your message tells me that I didn't need to explain it
since you already grasped it. But it was fun.
Gerrit
Thanks a lot for the most valuable input!
I need a result and the inconsistency report, so the user can fix the input. I
will be looking into stopping the infinite recursion somehow.
- Michael
There is an arbitrary number of sequences, sometimes
containing items
with the same name:
(k, o, p, c, f)
(d, e, f, g)
(k, f, z, o)
(a, b, c, d)
I want to create a master sequence which contains every item once,
preserving the original order.
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
<Prev in Thread] |
Current Thread |
[Next in Thread>
|
- [xsl] Ordered union of sequences, Michael Müller-Hillebrand
- Re: [xsl] Ordered union of sequences, Dimitre Novatchev
- RE: [xsl] Ordered union of sequences, Michael Kay
- Re: [xsl] Ordered union of sequences, Vladimir Nesterovsky
- RE: [xsl] Ordered union of sequences, Michael Kay
- Re: [xsl] Ordered union of sequences, Imsieke, Gerrit, le-tex
- Re: [xsl] Ordered union of sequences, Michael Müller-Hillebrand
- Re: [xsl] Ordered union of sequences, Imsieke, Gerrit, le-tex
- Re: [xsl] Ordered union of sequences, Imsieke, Gerrit, le-tex
- Re: [xsl] Ordered union of sequences,
Imsieke, Gerrit, le-tex <=
- Re: [xsl] Ordered union of sequences, Michael Müller-Hillebrand
- Re: [xsl] Ordered union of sequences, Vladimir Nesterovsky
- RE: [xsl] Ordered union of sequences, Michael Kay
- Re: [xsl] Ordered union of sequences, Michael Müller-Hillebrand
Re: [xsl] Ordered union of sequences, Wolfgang Laun
|
|
|