xsl-list

Re: Displaying Code Dependent on First Encounter of Specific Reference

2004-10-14 10:02:24
Hi Joe,

This problem is a bit difficult, not because any of the methods you will use is inherently hard, but because it's complex and requires a couple of XSLT 1.0 tricks. Disentangling it shows a way forward.

You actually have several problems here:

* Assigning each table or figure its correct number (based on order of citation, not
  order of appearance in the source)
* Citing the tables or figures where xrefs appear in line, each with its correct number
* Placing the tables and figures each after the paragraph where it's first cited, and
  not elsewhere

You will use keys for this, though perhaps not in exactly the way you're imagining. Likewise, it'd be nice if we could construct an array, and in XSLT 2.0 we could (or at least the functional equivalent thereof), but in XSLT 1.0 we can't. So we have to fake a couple of things. As you'll see, this faking may get us into a bit of trouble with performance. The usual XSLT 1.0 approach when this happens is to split a problem into two or more passes, which generally gives us opportunities to optimize for efficiency.

To simplify this explanation I'm going to assume you have only tables. Figures will work just the same.

Assigning each table or figure its number: we could do this either by counting the references (filtering out repeated references) or by counting the tables, sorted by their first reference. While the latter would be nice, the former is easier in XSLT 1.0. We do this first by giving ourselves a means to filter out the repeats:

<xsl:key name="tablerefs-by-rid" match="xref[@ref-type='table']"
  use="@rid"/>

Given $rid, we can then get all the references to any table by calling

key('tablerefs-by-rid', $rid)

and the first one only by calling

key('tablerefs-by-rid', $rid)[1]

In addition, we can get all the first references by saying, e.g.

//xref[@ref-type='table'][count(.|key('tablerefs-by-rid', @rid)[1])=1]

This XPath traverses the entire document from the root, collecting all xrefs that are the first reference to their table. If they aren't to a table, the first predicate filters them out. If they are not a first reference, the count of their union with the first reference will be 2 not 1, and the second predicate will filter them out.

This uses an XPath 1.0 idiom (the count() trick) to test node identity. The generate-id() function is also sometimes used for this, so this would also work:

//xref[generate-id()=generate-id(key('tablerefs-by-rid', @rid)[1])]

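Notice that here we don't need the first predicate: for an xref that is not to a table the key returns nothing, generate-id() of an empty node set is the empty string, and so the predicate given throws it out as well. You may prefer this version for that reason.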
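
To see the filter in action, here is a minimal sketch that just lists the rid of each first table reference. The wrapper stylesheet and the plain-text output are assumptions of mine for illustration; only the key and the select expression come from above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- index every table xref by the id of the table it points to -->
  <xsl:key name="tablerefs-by-rid" match="xref[@ref-type='table']" use="@rid"/>

  <xsl:template match="/">
    <!-- keep only the xrefs that are the first node the key returns for their rid -->
    <xsl:for-each
        select="//xref[generate-id()=generate-id(key('tablerefs-by-rid', @rid)[1])]">
      <xsl:value-of select="@rid"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run against a document whose xrefs cite T1, T2, T1 again, F1 (as ref-type="fig") and T3, in that order, this should print T1, T2 and T3, one per line.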

It would be very convenient to have all these particular nodes collected together so we don't have to collect them over and over (an expensive traversal). So:

<xsl:variable name="first-table-refs"
select="//xref[generate-id()=generate-id(key('tablerefs-by-rid', @rid)[1])]"/>

(This is awfully close to an array, isn't it?)

Consequently, for any xref[@ref-type='table'] that is itself a first reference, we can also get the proper number with the expression

count($first-table-refs
  [count(.|current()/preceding::xref) = count(current()/preceding::xref)]) + 1

which looks, and is, awfully obnoxious and expensive (using the costly preceding:: axis twice). It is also only right when the current xref is a first reference; a repeat reference has to be numbered from the first reference to the same table. A template call handles that, and optimizes slightly:

<xsl:template match="xref" mode="assign-table-number">
  <xsl:for-each select="key('tablerefs-by-rid', @rid)[1]">
    <!-- switching context to the first reference to this reference's table -->
    <xsl:variable name="preceding-refs" select="preceding::xref"/>
    <xsl:value-of select="count($first-table-refs
      [count(.|$preceding-refs) = count($preceding-refs)]) + 1"/>
    <!-- counting the first table references before this one, and adding 1 -->
  </xsl:for-each>
</xsl:template>

I wish this were easier, but in XSLT 1.0 it just isn't. In 2.0, it is (and maybe Mike or Jeni or someone will show us how).

But it does solve problem 1, and you can see how any given xref[@ref-type='table'] can call <xsl:apply-templates select="." mode="assign-table-number"/> and get its number, thereby solving problem 2.
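
Put together as a minimal sketch, the numbering pieces might sit in a stylesheet like this. The default-mode template for xref and its "Table N" wording are my assumptions for illustration; the key, the variable and the numbering template are as given above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="tablerefs-by-rid" match="xref[@ref-type='table']" use="@rid"/>

  <!-- all xrefs that are the first reference to their table, collected once -->
  <xsl:variable name="first-table-refs"
    select="//xref[generate-id()=generate-id(key('tablerefs-by-rid', @rid)[1])]"/>

  <!-- assumption: the in-line citation text is regenerated from the computed number -->
  <xsl:template match="xref[@ref-type='table']">
    <xsl:text>Table </xsl:text>
    <xsl:apply-templates select="." mode="assign-table-number"/>
  </xsl:template>

  <xsl:template match="xref" mode="assign-table-number">
    <!-- number from the first reference to this table, not from the current one -->
    <xsl:for-each select="key('tablerefs-by-rid', @rid)[1]">
      <xsl:variable name="preceding-refs" select="preceding::xref"/>
      <!-- first references that occur earlier in the document, plus one -->
      <xsl:value-of select="count($first-table-refs
        [count(.|$preceding-refs) = count($preceding-refs)]) + 1"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With citations to T2, T1, T2 again and T3 in that order, the four xrefs should come out as Table 1, Table 2, Table 1 and Table 3, whatever text the source xrefs contained.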

Problem 3 is a matter of selecting, after you create a paragraph, those references in it that are first references to their targets (tables, figures, what not), which again you can do (in the case of tables) using this same idiom:

<xsl:template match="para">
  <p>
    <xsl:apply-templates/>
  </p>
  <xsl:apply-templates mode="get-target"
    select=".//xref[generate-id()=generate-id(key('tablerefs-by-rid', @rid)[1])]"/>
  <!-- do the same with any other keys for xrefs you have, e.g. to figures, perhaps
       unifying the select -->
</xsl:template>

To actually get the target you're going to need another key:

<xsl:key name="target-by-rid" match="table|figure" use="@id"/>

and then

<xsl:template match="xref" mode="get-target">
  <xsl:apply-templates select="key('target-by-rid', @rid)"
    mode="show"/>
</xsl:template>

which will apply templates to the table, figure or whatever the target is. Note that I've also put this call in a special mode, "show", enabling you to say in the default mode

<xsl:template match="table|figure"/>

so the tables only come out where you actually want them.
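
Assembled, the placement part might look like the following sketch. The mode="show" template here is only a placeholder of mine that copies the element through; you'd substitute your own rendering. It also assumes, as above, that the targets are elements named table or figure carrying an id attribute.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="tablerefs-by-rid" match="xref[@ref-type='table']" use="@rid"/>
  <xsl:key name="target-by-rid" match="table|figure" use="@id"/>

  <xsl:template match="para">
    <p>
      <xsl:apply-templates/>
    </p>
    <!-- after the paragraph, visit each table first cited inside it, in citation order -->
    <xsl:apply-templates mode="get-target"
      select=".//xref[generate-id()=generate-id(key('tablerefs-by-rid', @rid)[1])]"/>
  </xsl:template>

  <!-- jump from the citation to the table or figure it points at -->
  <xsl:template match="xref" mode="get-target">
    <xsl:apply-templates select="key('target-by-rid', @rid)" mode="show"/>
  </xsl:template>

  <!-- default mode: targets produce nothing where they sit in the source -->
  <xsl:template match="table|figure"/>

  <!-- show mode: placeholder rendering (assumption), copies the element as it is -->
  <xsl:template match="table|figure" mode="show">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Note that this only places tables cited from inside a para; an xref that lives elsewhere would need the same treatment in the template for its own container.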

Whew! Not bad for a bit of work, eh?

This should work fine for input at the scale of most human-readable documents. For higher performance (that numbering is a beast), you'll want to split out an analytic/sorting pass before processing, or get out the big rotary saw (XSLT 2.0).
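
As a rough idea of what the rotary saw buys you, here is an untested sketch of one way the numbering might go in XSLT 2.0. It is my own guess at an approach, not anything established in this thread; the variable name table-order and the "Table N" output are made up for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- rids of the cited tables, one per table, in order of first citation -->
  <xsl:variable name="table-order" as="xs:string*">
    <xsl:for-each-group select="//xref[@ref-type='table']" group-by="@rid">
      <xsl:sequence select="string(current-grouping-key())"/>
    </xsl:for-each-group>
  </xsl:variable>

  <xsl:template match="xref[@ref-type='table']">
    <xsl:text>Table </xsl:text>
    <!-- the position of this rid in the sequence is the table's number -->
    <xsl:value-of select="index-of($table-order, string(@rid))"/>
  </xsl:template>

</xsl:stylesheet>
```

The sketch uses xsl:for-each-group rather than distinct-values() because groups are processed in order of first appearance by default, whereas the order of the values distinct-values() returns is implementation-dependent. The sequence is built once, so the preceding:: axis isn't needed at all.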

Note: I just typed this up and haven't tested it, but I have used such code and it works. Beware particularly of missing parentheses in my XPaths, etc.

Cheers,
Wendell

At 04:57 PM 10/13/2004, you wrote:
XSL list:

I'm developing a stylesheet that converts XML to HTML to display research articles. The articles contain three citation types: bibliographical, table call, and figure call. Upon encountering a table call or figure call, I would like to display the table or figure referred to immediately following the paragraph that contains the call. I want the tables and figures to appear in the order they were referred to in the paragraph, and I want each table or figure to appear only once in the output document. Tables and figures are numbered in order of their reference, though at any point you can refer to a table or figure that has been previously called.

Citations look like this:
<xref ref-type="bibr" rid="B1">1</xref>
<xref ref-type="table" rid="T1">Table 1</xref>
<xref ref-type="fig" rid="F1">Figure 1</xref>

Sample Input:

[A paragraph that includes a citation for Table 1.]
[A paragraph that includes citations for Table 2, Table 1, Figure 1, and Table 3.]

Sample Output:

[A paragraph that includes a citation for Table 1.]

Table 1

[A paragraph that includes citations for Table 2, Table 1, Figure 1, and Table 3.]

Table 2
Figure 1
Table 3

My initial thought is to create a set of keys:
Key: Last Table Processed
Key: Last Figure Processed
Key: Last Table Encountered
Key: Last Figure Encountered

Since the tables and figures are numbered in order, a comparison of the two keys should be in order. This comparison should be made at the end of processing a paragraph. However, I'm not quite sure how I'd make such a comparison or even if I can use keys in that manner. I'm thinking I might need to generate some sort of array to keep track of the multiple citations encountered so that in the sample provided the output is (Table 2, Figure 1, Table 3) and not (Table 2, Table 3, Figure 1) or (Figure 1, Table 2, Table 3). If I were to build an array, since at this point I don't need to process <xref> citations of "bibr" type, those should be ignored. Any suggestions would be greatly appreciated. Thanks.


- Joe



======================================================================
Wendell Piez                            
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================


