
RE: [xsl] Better Way to Group Siblings By Start/End Markers?

2008-06-23 16:04:36
Another possibility is to use xsl:for-each-group with group-starting-with.
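
A minimal sketch of how the group-starting-with approach might look, assuming fields are not nested and every 'begin' marker has a matching 'end' among the sibling w:r elements. The output element names (para, field) are placeholders for illustration only, not INCX constructs. A new group starts at each 'begin' run and at each run that immediately follows an 'end' run, so a group is a field exactly when its first member is a 'begin' marker.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <xsl:template match="w:p">
    <!-- 'para' and 'field' are hypothetical output names -->
    <para>
      <!-- Start a group at a field-begin run, or at the run right after a field-end run -->
      <xsl:for-each-group select="w:r"
          group-starting-with="
            w:r[w:fldChar/@w:fldCharType = 'begin']
            | w:r[preceding-sibling::w:r[1]/w:fldChar/@w:fldCharType = 'end']">
        <xsl:choose>
          <!-- The group is a field if it opens with a 'begin' marker -->
          <xsl:when test="current-group()[1]/w:fldChar/@w:fldCharType = 'begin'">
            <field>
              <xsl:apply-templates select="current-group()"/>
            </field>
          </xsl:when>
          <!-- Otherwise it is a stretch of ordinary runs -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </para>
  </xsl:template>

</xsl:stylesheet>
```

Nested fields would need extra handling, since an inner 'begin' marker would start a new group inside the outer field.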

I seem to remember that when I last did this, however, it turned out to be
easier using sibling recursion - that is, have each w:r element
apply-templates to its immediately following sibling.
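
A rough sketch of what sibling recursion could look like here, again assuming non-nested fields. The mode names (walk, in-field) and output element names (para, field) are made up for illustration. Each run processes itself and then hands control to its immediately following sibling. A 'begin' run opens the field wrapper and walks the field's runs in a separate mode until the 'end' run stops the recursion.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <!-- Start the walk at the first run of the paragraph -->
  <xsl:template match="w:p">
    <para>
      <xsl:apply-templates select="w:r[1]" mode="walk"/>
    </para>
  </xsl:template>

  <!-- Ordinary run: process it in the default mode, then step to the next sibling run -->
  <xsl:template match="w:r" mode="walk">
    <xsl:apply-templates select="."/>
    <xsl:apply-templates select="following-sibling::w:r[1]" mode="walk"/>
  </xsl:template>

  <!-- Field start: wrap the field's runs, then resume the walk after the end marker -->
  <xsl:template match="w:r[w:fldChar/@w:fldCharType = 'begin']" mode="walk">
    <field>
      <xsl:apply-templates select="following-sibling::w:r[1]" mode="in-field"/>
    </field>
    <xsl:apply-templates mode="walk"
        select="following-sibling::w:r[w:fldChar/@w:fldCharType = 'end'][1]
                /following-sibling::w:r[1]"/>
  </xsl:template>

  <!-- Run inside a field: process it, then continue to the next sibling run -->
  <xsl:template match="w:r" mode="in-field">
    <xsl:apply-templates select="."/>
    <xsl:apply-templates select="following-sibling::w:r[1]" mode="in-field"/>
  </xsl:template>

  <!-- Field end: produce nothing, which terminates the in-field recursion -->
  <xsl:template match="w:r[w:fldChar/@w:fldCharType = 'end']" mode="in-field"/>

</xsl:stylesheet>
```

The templates with predicates take precedence over the plain w:r templates through default priorities, so no explicit priority attributes are needed.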

Either way, processing Word XML using XSLT is not for the faint-hearted.

Michael Kay
http://www.saxonica.com/ 

-----Original Message-----
From: Eliot Kimber [mailto:ekimber(_at_)reallysi(_dot_)com] 
Sent: 23 June 2008 23:04
To: xsl-list
Subject: [xsl] Better Way to Group Siblings By Start/End Markers?

I am experimenting with using XSLT to convert Office Open XML 
into InCopy INCX (the CS3 Word import fails to capture some 
things I need captured from the Word data).

One challenge is handling Word fields, which need to be 
converted to any number of different, and 
differently-structured, INCX constructs (whose details are 
not important here).

A Word field is organized as a sequence of w:r elements 
within a larger sequence of w:r elements. A field start is 
indicated by a w:r with a field start indicator and the field 
end is indicated by another w:r with a field end indicator. 
The w:r elements between these two marker elements comprise 
the field data, which can be any number of things, including 
w:r elements that would easily occur outside the scope of the 
field (e.g., w:r containing literal document content).

Here is a typical sample:

      <w:r>
        <w:t xml:space="preserve">-  </w:t>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:fldChar
          w:fldCharType="begin"/>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:instrText>HYPERLINK "http://www.example.com/"</w:instrText>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:fldChar
          w:fldCharType="separate"/>
      </w:r>
      <w:r
        w:rsidRPr="00B233E5">
        <w:t>HTTP</w:t>
      </w:r>
      <w:r
        w:rsidR="00BA1D13">
        <w:fldChar
          w:fldCharType="end"/>
      </w:r>

I have this for-each-group that seems to group correctly, but 
I'm wondering if there's a simpler expression that does what I want:

<xsl:for-each-group select="w:r"
  group-adjacent="
    string(
      self::*[w:fldChar[@w:fldCharType = 'begin' or
                        @w:fldCharType = 'end']]
      or
      (self::*[preceding-sibling::*/w:fldChar[@w:fldCharType = 'begin']]
       and
       self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']]
       and
       count(
         (self::*[preceding-sibling::*/w:fldChar[@w:fldCharType = 'begin']])[1]
           /(*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])[1]
         |
         (self::*[following-sibling::*/w:fldChar[@w:fldCharType = 'end']])[1]
       ) = 1
      )
    )
  "


In prose (at least this is what I intend the above expression 
to mean): if w:r has a child w:fldChar where @w:fldCharType = 
'begin' or 'end', or w:r has both a preceding sibling w:r with 
a w:fldChar of type 'begin' and a following sibling w:r with 
a w:fldChar of type 'end' AND the nearest preceding sibling 
field start has the same nearest following sibling field end 
as the current node, then return the grouping key "true", else 
return the grouping key "false".

Whew.

I can't think of a simpler way to say this. Is there one?

I realize I could factor some of the complexity of the 
expression out into a function or two, which I will probably do.
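
A sketch of what such a function might look like. This is not a transcription of the expression above but a simplified test with similar intent: a run counts as part of a field if it is an 'end' marker, or if the nearest 'begin'/'end' marker at or before it is a 'begin'. The f: namespace URI, the function name f:in-field, and the output element names are all hypothetical. It assumes non-nested fields and, unlike the expression above, does not check that a following 'end' marker exists.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    xmlns:f="urn:example:field-functions"
    exclude-result-prefixes="xs w f">

  <!-- Hypothetical helper: true if the run belongs to a field, markers included -->
  <xsl:function name="f:in-field" as="xs:boolean">
    <xsl:param name="run" as="element(w:r)"/>
    <!-- Nearest begin/end marker at or before this run, in document order -->
    <xsl:variable name="marker" as="element(w:r)?"
        select="($run/(self::w:r | preceding-sibling::w:r)
                 [w:fldChar/@w:fldCharType = ('begin', 'end')])[last()]"/>
    <xsl:sequence select="$run/w:fldChar/@w:fldCharType = 'end'
                          or $marker/w:fldChar/@w:fldCharType = 'begin'"/>
  </xsl:function>

  <xsl:template match="w:p">
    <para>
      <!-- The boolean result serves directly as the grouping key -->
      <xsl:for-each-group select="w:r" group-adjacent="f:in-field(.)">
        <xsl:choose>
          <xsl:when test="current-grouping-key()">
            <field>
              <xsl:apply-templates select="current-group()"/>
            </field>
          </xsl:when>
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </para>
  </xsl:template>

</xsl:stylesheet>
```

As with any true/false key used with group-adjacent, two fields that sit directly next to each other end up in the same group, so that case would need a further split.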

Thanks,

Eliot

----
Eliot Kimber | Senior Solutions Architect | Really Strategies, Inc.
email:  ekimber(_at_)reallysi(_dot_)com 
<mailto:ekimber(_at_)reallysi(_dot_)com>
office: 610.631.6770 | cell: 512.554.9368 2570 Boulevard of 
the Generals | Suite 213 | Audubon, PA 19403 www.reallysi.com 
<http://www.reallysi.com>  | http://blog.reallysi.com 
<http://blog.reallysi.com> | www.rsuitecms.com 
<http://www.rsuitecms.com> 


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


