Hi Joe,
This problem is a bit difficult, not because any one of the methods you
will use is inherently hard, but because it is complex and requires a
couple of XSLT 1.0 tricks. Disentangling it shows a way forward.
You actually have several problems here:
* Assigning each table or figure its correct number (based on order of
  citation, not order of appearance in the source)
* Citing the tables or figures where xrefs appear in line, each with its
  correct number
* Placing the tables and figures each after the paragraph where it's first
  cited, and not elsewhere
You will use keys for this, though not perhaps in exactly the way you're
imagining. Likewise, it'd be nice if we could construct an array, and in
XSLT 2.0 we could (or at least the functional equivalent thereof), but in
XSLT 1.0 we can't. So we have to fake a couple of things. As you'll see,
this faking may get us into a bit of trouble with performance.
The usual XSLT 1.0 approach when this happens is to split a problem into
two or more passes, which generally gives us opportunities to optimize for
efficiency.
To simplify this explanation I'm going to assume you have only tables.
Figures will work just the same.
Assigning each table or figure its number ... we could do this either by
counting the references (filtering out repeated references) or by
counting the tables, sorted by their first reference. While the latter
would be nice, the former is easier in XSLT 1.0. We do this first by giving
ourselves a means to filter out the repeats:
<xsl:key name="tablerefs-by-rid" match="xref[@ref-type='table']"
         use="@rid"/>
Given $rid, we can then get all the references to any table by calling
key('tablerefs-by-rid', $rid)
and the first one only by calling
key('tablerefs-by-rid', $rid)[1]
In addition, we can get all the first references by saying, e.g.
//xref[@ref-type='table'][count(.|key('tablerefs-by-rid', @rid)[1])=1]
This XPath traverses the entire document from the root, collecting all
xrefs that are the first reference to their table. If they aren't to a
table, the first predicate filters them out. If they are not a first
reference, the count of their union with the first reference will be 2 not
1, and the second predicate will filter them out.
This uses an XPath 1.0 idiom (the count() trick) to test node identity. The
generate-id() function is also sometimes used for this, so this would also
work:
//xref[generate-id()=generate-id(key('tablerefs-by-rid', @rid)[1])]
Notice here we don't need the first predicate (xrefs not to tables are
also thrown out by the predicate given), so you may prefer this.
It would be very convenient to have all these particular nodes collected
together so we don't have to collect them over and over (an expensive
traversal). So:
<xsl:variable name="first-table-refs"
  select="//xref[generate-id()=generate-id(key('tablerefs-by-rid',
          @rid)[1])]"/>
(This is awfully close to an array, isn't it?)
Consequently, we can also get the proper number for any first reference to
a table (an xref[@ref-type='table']) with the expression
count($first-table-refs
[count(.|current()/preceding::xref) = count(current()/preceding::xref)]) + 1
which looks, and is, awfully obnoxious and expensive (using the costly
preceding:: axis twice). A repeated reference has to borrow the number of
the first reference to its table, so this is better wrapped in a template,
which also lets us traverse preceding:: only once:
<xsl:template match="xref" mode="assign-table-number">
  <xsl:for-each select="key('tablerefs-by-rid', @rid)[1]">
    <!-- switching context to the first reference to this reference's
         table -->
    <xsl:variable name="preceding-refs" select="preceding::xref"/>
    <xsl:value-of select="count($first-table-refs
        [count(.|$preceding-refs) = count($preceding-refs)]) + 1"/>
    <!-- counting the first table references before this one, and
         adding 1 -->
  </xsl:for-each>
</xsl:template>
I wish this were easier, but in XSLT 1.0 it just isn't. In 2.0, it is (and
maybe Mike or Jeni or someone will show us how).
But it does solve problem 1, and you can see how any given
xref[@ref-type='table'] can call <xsl:apply-templates select="."
mode="assign-table-number"/> and get its number, thereby solving problem 2.
Problem 3 is a matter of selecting, after you create a paragraph, those
references in it that are first references to their targets (tables,
figures, what not), which again you can do (in the case of tables) using
this same idiom:
<xsl:template match="para">
  <p>
    <xsl:apply-templates/>
  </p>
  <xsl:apply-templates mode="get-target"
    select=".//xref[generate-id()=generate-id(key('tablerefs-by-rid',
            @rid)[1])]"/>
  <!-- do the same with any other keys for xrefs you have, e.g. to
       figures, perhaps unifying the select -->
</xsl:template>
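As for unifying the select: one way it might look, as an untested sketch,
is to index table and figure citations under a single key. The key name
"floatrefs-by-rid" is made up for this example, the "fig" ref-type is taken
from your sample citations, and I'm assuming rid values are unique across
tables and figures (T1, F1 and so on). Templates for the "get-target" mode
are assumed to be defined elsewhere in the stylesheet.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- hypothetical combined key: every table or figure citation,
       indexed by the id of the thing it cites -->
  <xsl:key name="floatrefs-by-rid"
      match="xref[@ref-type='table' or @ref-type='fig']"
      use="@rid"/>

  <xsl:template match="para">
    <p>
      <xsl:apply-templates/>
    </p>
    <!-- first citations of tables and figures alike, selected together
         and therefore processed together in document order -->
    <xsl:apply-templates mode="get-target"
        select=".//xref[generate-id()=
                generate-id(key('floatrefs-by-rid', @rid)[1])]"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the selected xrefs are processed in document order, tables and
figures first cited in the same paragraph should come out interleaved in
citation order rather than grouped by type. Numbering would still be done
per type, with a separate key for each.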
To actually get the target you're going to need another key:
<xsl:key name="target-by-rid" match="table|figure" use="@id"/>
and then
<xsl:template match="xref" mode="get-target">
  <xsl:apply-templates select="key('target-by-rid', @rid)"
                       mode="show"/>
</xsl:template>
which will apply templates to the table, figure or whatever. Note I've
also put this call in a special mode, "show", enabling you to say in the
default mode
<xsl:template match="table|figure"/>
so the tables only come out where you actually want them.
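Pulled together for the tables-only case, the pieces above might be
assembled roughly like this. It's a sketch, not a tested stylesheet. The
templates matching xref[@ref-type='table'] in the default mode and table in
the "show" mode are my own placeholders, and I'm assuming your source
tables can simply be copied through to the HTML.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- all citations of a table, indexed by the table's id -->
  <xsl:key name="tablerefs-by-rid" match="xref[@ref-type='table']"
      use="@rid"/>
  <!-- the tables and figures themselves, indexed by their own id -->
  <xsl:key name="target-by-rid" match="table|figure" use="@id"/>

  <!-- the first citation of each table, in document order -->
  <xsl:variable name="first-table-refs"
      select="//xref[generate-id()=
              generate-id(key('tablerefs-by-rid', @rid)[1])]"/>

  <!-- a paragraph, followed by every table first cited inside it -->
  <xsl:template match="para">
    <p>
      <xsl:apply-templates/>
    </p>
    <xsl:apply-templates mode="get-target"
        select=".//xref[generate-id()=
                generate-id(key('tablerefs-by-rid', @rid)[1])]"/>
  </xsl:template>

  <!-- placeholder: in-line citation rebuilt from the computed number -->
  <xsl:template match="xref[@ref-type='table']">
    <xsl:text>Table </xsl:text>
    <xsl:apply-templates select="." mode="assign-table-number"/>
  </xsl:template>

  <!-- number = first citations that precede the first citation of
       this table, plus 1 -->
  <xsl:template match="xref" mode="assign-table-number">
    <xsl:for-each select="key('tablerefs-by-rid', @rid)[1]">
      <xsl:variable name="preceding-refs" select="preceding::xref"/>
      <xsl:value-of select="count($first-table-refs
          [count(.|$preceding-refs) = count($preceding-refs)]) + 1"/>
    </xsl:for-each>
  </xsl:template>

  <!-- from a citation to the thing it cites -->
  <xsl:template match="xref" mode="get-target">
    <xsl:apply-templates select="key('target-by-rid', @rid)"
        mode="show"/>
  </xsl:template>

  <!-- placeholder rendering: a numbered label, then the table copied
       through as it stands in the source -->
  <xsl:template match="table" mode="show">
    <div class="table">
      <p>
        <xsl:text>Table </xsl:text>
        <xsl:apply-templates select="key('tablerefs-by-rid', @id)[1]"
            mode="assign-table-number"/>
      </p>
      <xsl:copy-of select="."/>
    </div>
  </xsl:template>

  <!-- in the default mode, tables and figures produce nothing -->
  <xsl:template match="table|figure"/>

</xsl:stylesheet>
```

Other xrefs (your "bibr" type) fall through to the built-in templates here
and just emit their text.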
Whew! Not bad for a bit of work, eh?
This should work fine for input at the scale of most human-readable
documents. For higher performance (that numbering is a beast), you'll want
to split out an analytic/sorting pass before processing, or get out the big
rotary saw (XSLT 2.0).
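For a taste of the saw, here is roughly how the numbering might look in
XSLT 2.0, where XPath 2.0 offers the node comparison operators "is" and
"<<" (precedes in document order). Again an untested sketch: it assumes an
XSLT 2.0 processor and reuses the key and variable names from above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="tablerefs-by-rid" match="xref[@ref-type='table']"
      use="@rid"/>

  <!-- "is" tests node identity directly, so no generate-id() or
       count() trick is needed -->
  <xsl:variable name="first-table-refs"
      select="//xref[. is key('tablerefs-by-rid', @rid)[1]]"/>

  <xsl:template match="xref" mode="assign-table-number">
    <!-- the first citation of the table this xref points to -->
    <xsl:variable name="first"
        select="key('tablerefs-by-rid', @rid)[1]"/>
    <!-- count the first citations that come before it, and add 1;
         the operator is escaped because it sits in an attribute -->
    <xsl:value-of
        select="count($first-table-refs[. &lt;&lt; $first]) + 1"/>
  </xsl:template>

</xsl:stylesheet>
```

The logic is the same as in the 1.0 template; only the test for "comes
before the first reference" gets simpler, with no preceding:: axis needed.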
Note: I just typed this up, and haven't tested, but I have used such code
and it works. Beware particularly of missing parentheses in my XPaths, etc.
Cheers,
Wendell
At 04:57 PM 10/13/2004, you wrote:
XSL list:
I'm developing a stylesheet that converts XML to HTML to display research
articles. The articles contain three citation types: bibliographical,
table call, and figure call. Upon encountering a table call or figure
call, I would like to display the table or figure referred to immediately
following the paragraph that contains the call. I want the tables and
figures to appear in the order they were referred to in the paragraph, and
I want each table or figure to appear only once in the output document.
Tables and figures are numbered in order of their reference, though at any
point you can refer to a table or figure that has been previously called.
Citations look like this:
<xref ref-type="bibr" rid="B1">1</xref>
<xref ref-type="table" rid="T1">Table 1</xref>
<xref ref-type="fig" rid="F1">Figure 1</xref>
Sample Input:
[A paragraph that includes a citation for Table 1.]
[A paragraph that includes citations for Table 2, Table 1, Figure 1, and
Table 3.]
Sample Output:
[A paragraph that includes a citation for Table 1.]
Table 1
[A paragraph that includes citations for Table 2, Table 1, Figure 1, and
Table 3.]
Table 2
Figure 1
Table 3
My initial thought is to create a set of keys:
Key: Last Table Processed
Key: Last Figure Processed
Key: Last Table Encountered
Key: Last Figure Encountered
Since the tables and figures are numbered in order, a comparison of the
two keys should be in order. This comparison should be made at the end of
processing a paragraph. However, I'm not quite sure how I'd make such a
comparison or even if I can use keys in that manner. I'm thinking I might
need to generate some sort of array to keep track of the multiple
citations encountered so that in the sample provided the output is (Table
2, Figure 1, Table 3) and not (Table 2, Table 3, Figure 1) or (Figure 1,
Table 2, Table 3). If I were to build an array, since at this point I
don't need to process <xref> citations of "bibr" type, those should be
ignored. Any suggestions would be greatly appreciated. Thanks.
- Joe
======================================================================
Wendell Piez
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================