Another possibility is to use xsl:for-each-group with group-starting-with.
I seem to remember that when I last did this, however, it turned out to be
easier using sibling recursion - that is, have each w:r element
apply-templates to its immediately following sibling.
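A minimal sketch of what sibling recursion could look like here, assuming the w:r elements are children of a w:p and that fields are not nested. The mode names (walk, in-field) and the output elements (para, field) are made up for illustration and are not taken from the original stylesheet.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <!-- Start the walk at the first run; every run hands on to its next sibling. -->
  <xsl:template match="w:p">
    <para>
      <xsl:apply-templates select="w:r[1]" mode="walk"/>
    </para>
  </xsl:template>

  <!-- Ordinary run outside a field: emit it, then move to the next sibling. -->
  <xsl:template match="w:r" mode="walk">
    <xsl:copy-of select="."/>
    <xsl:apply-templates select="following-sibling::w:r[1]" mode="walk"/>
  </xsl:template>

  <!-- Field start: wrap the field content (the begin and end marker runs
       themselves are dropped), then resume after the nearest following end. -->
  <xsl:template match="w:r[w:fldChar/@w:fldCharType = 'begin']" mode="walk">
    <field>
      <xsl:apply-templates select="following-sibling::w:r[1]" mode="in-field"/>
    </field>
    <xsl:apply-templates
        select="following-sibling::w:r[w:fldChar/@w:fldCharType = 'end'][1]
                /following-sibling::w:r[1]"
        mode="walk"/>
  </xsl:template>

  <!-- Run inside a field: copy it and keep walking. -->
  <xsl:template match="w:r" mode="in-field">
    <xsl:copy-of select="."/>
    <xsl:apply-templates select="following-sibling::w:r[1]" mode="in-field"/>
  </xsl:template>

  <!-- Field end: emit nothing and stop the inner recursion. -->
  <xsl:template match="w:r[w:fldChar/@w:fldCharType = 'end']" mode="in-field"/>

</xsl:stylesheet>
```

The templates with a predicate win over the plain w:r templates because of their higher default priority. Nested fields would need a depth parameter passed along the recursion, which this sketch does not attempt.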
Either way, processing Word XML using XSLT is not for the faint-hearted.
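For comparison, a rough sketch of the xsl:for-each-group route with group-starting-with, again assuming non-nested fields and w:r children of a w:p. A new group is started at every field-begin run and at every run that directly follows a field-end run. The output element names (para, field, text-runs) are placeholders.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <xsl:template match="w:p">
    <para>
      <xsl:for-each-group select="w:r"
          group-starting-with="w:r[w:fldChar/@w:fldCharType = 'begin'
              or preceding-sibling::w:r[1]/w:fldChar/@w:fldCharType = 'end']">
        <xsl:choose>
          <!-- The context item is the first run of the current group. -->
          <xsl:when test="w:fldChar/@w:fldCharType = 'begin'">
            <field>
              <xsl:copy-of select="current-group()"/>
            </field>
          </xsl:when>
          <xsl:otherwise>
            <text-runs>
              <xsl:copy-of select="current-group()"/>
            </text-runs>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </para>
  </xsl:template>

</xsl:stylesheet>
```

Unlike a true/false group-adjacent key, this keeps two fields that follow each other directly in separate groups, since each begin run opens a group of its own.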
Michael Kay
http://www.saxonica.com/
-----Original Message-----
From: Eliot Kimber [mailto:ekimber(_at_)reallysi(_dot_)com]
Sent: 23 June 2008 23:04
To: xsl-list
Subject: [xsl] Better Way to Group Siblings By Start/End Markers?
I am experimenting with using XSLT to convert Office Open XML
into InCopy INCX (the CS3 Word import fails to capture some
things I need captured from the Word data).
One challenge is handling Word fields, which need to be
converted to any number of different, and
differently-structured, INCX constructs (whose details are
not important here).
A Word field is organized as a sequence of w:r elements
within a larger sequence of w:r elements. A field start is
indicated by a w:r with a field start indicator and the field
end is indicated by another w:r with a field end indicator.
The w:r elements between these two marker elements comprise
the field data, which can be any number of things, including
w:r elements that could just as easily occur outside the scope of the
field (e.g., w:r containing literal document content).
Here is a typical sample:
<w:r>
<w:t xml:space="preserve">- </w:t>
</w:r>
<w:r
w:rsidR="00BA1D13">
<w:fldChar
w:fldCharType="begin"/>
</w:r>
<w:r
w:rsidR="00BA1D13">
<w:instrText>HYPERLINK "http://www.example.com/"</w:instrText>
</w:r>
<w:r
w:rsidR="00BA1D13">
<w:fldChar
w:fldCharType="separate"/>
</w:r>
<w:r
w:rsidRPr="00B233E5">
<w:t>HTTP</w:t>
</w:r>
<w:r
w:rsidR="00BA1D13">
<w:fldChar
w:fldCharType="end"/>
</w:r>
I have this for-each-group that seems to group correctly, but
I'm wondering if there's a simpler expression that does what I want:
<xsl:for-each-group select="w:r"
group-adjacent="
string(self::*[w:fldChar[@w:fldCharType = 'begin' or
@w:fldCharType = 'end']] or
(self::*[preceding-sibling::*/w:fldChar[@w:fldCharType =
'begin']] and
self::*[following-sibling::*/w:fldChar[@w:fldCharType =
'end']] and
count((self::*[preceding-sibling::*/w:fldChar[@w:fldCharType
=
'begin']])[1]/(*[following-sibling::*/w:fldChar[@w:fldCharType
= 'end']])[1]
|
(self::*[following-sibling::*/w:fldChar[@w:fldCharType =
'end']])[1]) = 1
))
"
In prose (at least this is what I intend the above expression
to mean): if w:r has child w:fldChar where @w:fldCharType =
'begin' or 'end' or w:r has both a preceding sibling w:r with
a w:fldChar of type 'begin' and a following sibling w:r with
a w:fldChar of type 'end' AND the nearest preceding sibling
field start has the same nearest following sibling field end
as the current node, then return the grouping key "true"; otherwise
return the grouping key "false".
Whew.
I can't think of a simpler way to say this. Is there one?
I realize I could factor some of the complexity of the
expression out into a function or two, which I will probably do.
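One way the factoring into functions might look, as a sketch only. It assumes fields are well balanced within a paragraph and reads the intent as "a run is in a field if it is a begin marker, or if more begin markers than end markers precede it". The f: namespace URI, the function names f:is-marker and f:in-field, and the output element names are all hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    xmlns:f="urn:example:field-functions"
    exclude-result-prefixes="xs w f">

  <!-- Hypothetical helper: does this run carry a w:fldChar of the given type? -->
  <xsl:function name="f:is-marker" as="xs:boolean">
    <xsl:param name="run" as="element(w:r)"/>
    <xsl:param name="type" as="xs:string"/>
    <xsl:sequence select="exists($run/w:fldChar[@w:fldCharType = $type])"/>
  </xsl:function>

  <!-- Hypothetical helper: true for begin markers and for any run that has
       more begin markers than end markers among its preceding siblings
       (this covers the field content and the end marker itself). -->
  <xsl:function name="f:in-field" as="xs:boolean">
    <xsl:param name="run" as="element(w:r)"/>
    <xsl:variable name="before" select="$run/preceding-sibling::w:r"/>
    <xsl:sequence select="f:is-marker($run, 'begin')
        or count($before[f:is-marker(., 'begin')])
           gt count($before[f:is-marker(., 'end')])"/>
  </xsl:function>

  <xsl:template match="w:p">
    <para>
      <xsl:for-each-group select="w:r" group-adjacent="f:in-field(.)">
        <xsl:element name="{if (current-grouping-key()) then 'field' else 'text-runs'}">
          <xsl:copy-of select="current-group()"/>
        </xsl:element>
      </xsl:for-each-group>
    </para>
  </xsl:template>

</xsl:stylesheet>
```

As with the original expression, a true/false key merges two fields that follow each other with no run in between, and counting preceding siblings for every run is quadratic in the number of runs per paragraph.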
Thanks,
Eliot
----
Eliot Kimber | Senior Solutions Architect | Really Strategies, Inc.
email: ekimber(_at_)reallysi(_dot_)com
office: 610.631.6770 | cell: 512.554.9368
2570 Boulevard of the Generals | Suite 213 | Audubon, PA 19403
www.reallysi.com | http://blog.reallysi.com | www.rsuitecms.com
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--