I haven't looked through your code in detail, but it looks similar to a
problem I used as an exercise at the Oxford Summer School. There we had a set
of records with COBOL-like level numbers:
<A level="1"/>
<B level="2"/>
<C level="3"/>
<D level="2"/>
and the task was to create a hierarchically nested structure. (The actual
input was a GEDCOM file.)
The solution is a recursive grouping like this:
<xsl:template name="g">
  <xsl:param name="sequence" as="element()*"/>
  <xsl:param name="level" as="xs:integer"/>
  <xsl:for-each-group select="$sequence"
      group-starting-with="*[@level=$level]">
    <xsl:copy>
      <xsl:call-template name="g">
        <xsl:with-param name="sequence" select="current-group() except ."/>
        <xsl:with-param name="level" select="$level+1"/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:for-each-group>
</xsl:template>
Now it seems to me your problem is very similar, except you have no explicit
level number. But I think you could use a similar approach, where the same
template is used for each level of grouping and the only thing that changes
is the grouping key.
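As a rough, untested sketch of that idea (assuming XSLT 2.0 and heading
elements named a to e as in the sample input quoted below; the template name
"nest" and its "heads" parameter are invented here for illustration), the list
of heading names can play the role of the level number. Items that appear
before the first heading at a given level are passed straight down to the next
level, which is one way of coping with omitted divs:

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Entry point: the heading names, outermost first, are an assumption
       taken from the sample input. -->
  <xsl:template match="document">
    <document>
      <xsl:call-template name="nest">
        <xsl:with-param name="sequence" select="*"/>
        <xsl:with-param name="heads" select="('a', 'b', 'c', 'd', 'e')"/>
      </xsl:call-template>
    </document>
  </xsl:template>

  <!-- Hypothetical recursive template: one call per level of grouping. -->
  <xsl:template name="nest">
    <xsl:param name="sequence" as="element()*"/>
    <xsl:param name="heads" as="xs:string*"/>
    <xsl:choose>
      <!-- No heading names left: copy whatever remains unchanged. -->
      <xsl:when test="empty($heads)">
        <xsl:copy-of select="$sequence"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:for-each-group select="$sequence"
            group-starting-with="*[local-name() = $heads[1]]">
          <xsl:choose>
            <!-- The group starts with a heading for this level: wrap it. -->
            <xsl:when test="local-name() = $heads[1]">
              <xsl:element name="div-{$heads[1]}">
                <title>
                  <xsl:copy-of select="node()"/>
                </title>
                <xsl:call-template name="nest">
                  <xsl:with-param name="sequence"
                      select="current-group() except ."/>
                  <xsl:with-param name="heads" select="remove($heads, 1)"/>
                </xsl:call-template>
              </xsl:element>
            </xsl:when>
            <!-- Leading items with no heading at this level (the level was
                 omitted): try the next level down without adding a div. -->
            <xsl:otherwise>
              <xsl:call-template name="nest">
                <xsl:with-param name="sequence" select="current-group()"/>
                <xsl:with-param name="heads" select="remove($heads, 1)"/>
              </xsl:call-template>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Traced by hand against the quoted sample input, this seems to give the
div-a/div-b/... nesting of the required output, but it has not been run
through a processor.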
Michael Kay
http://www.saxonica.com/
-----Original Message-----
From: Jim_Albright(_at_)wycliffe(_dot_)org
[mailto:Jim_Albright(_at_)wycliffe(_dot_)org]
Sent: 28 September 2004 13:07
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: RE: [xsl] Re: up-converting
I have a solution for the up-converting problem that I had. It isn't as
elegant as I was hoping for. Maybe someone here can give me a few more
pointers.
Thanks again for including the for-each-group structure, as that makes the
solution much easier.
My general problem is conversion of a flat XML (WordML) document to one
with hierarchy.
After tossing out all of the formatting info, the first step is to map all
the paragraphs that indicate divs to their appropriate level. I use a, b, c,
d, ... for the new element names in order to make this more generic.
The a, b, c, ... elements hold the head or title for the div.
The divs may be nested: a, b, c, d.
Some divs may be omitted: a, b, d.
Divs may be followed by other divs or paragraphs. Paragraphs may contain
spans.
Next, use the for-each-group structure to put an aa element around the a
elements.
Next, use the for-each-group structure to put a bb element around the b
elements, and emit aaa instead of aa.
...
Each of these steps builds the required hierarchy one level at a time.
Since some divs may be omitted, I couldn't find a way to combine these
steps.
Next, the head/title is pulled out.
Finally, toss out any div with no head/title.
Sample input:
<?xml version="1.0" encoding="UTF-8"?>
<document>
<a>level aaaa head 1</a>
<b>level bbbb head 2</b>
<c>level ccccc head 3</c>
<dfg>cc 4 blah</dfg>
<e>level eeee head 5 </e>
<fhh>cc blah 6</fhh>
<c>level ccccc head 7</c>
<df>cc 8 blah<kkk>kkk within df within c</kkk>
</df>
<d>level dddd head 9</d>
<iuo>dd 10 blah</iuo>
<jtt>dd blah 11</jtt>
<c>level ccccc head 12</c>
<df>cc 13 blah</df>
<e>cc level eeeee head 14</e>
<fss>ee blah 15</fss>
<b>level bbbbb head 16</b>
<c>level ccccc head 17</c>
<df>cc 18 blah</df>
<e>cc level eeeee head 19</e>
<fhy>ee blah 20</fhy>
</document>
and the required output is
<?xml version="1.0" encoding="UTF-8"?>
<document>
<div-a>
<title>level aaaa head 1</title>
<div-b>
<title>level bbbb head 2</title>
<div-c>
<title>level ccccc head 3</title>
<dfg>cc 4 blah</dfg>
<div-e>
<title>level eeee head 5 </title>
<fhh>cc blah 6</fhh>
</div-e>
</div-c>
<div-c>
<title>level ccccc head 7</title>
<df>cc 8 blah<kkk>kkk within df within c</kkk>
</df>
<div-d>
<title>level dddd head 9</title>
<iuo>dd 10 blah</iuo>
<jtt>dd blah 11</jtt>
</div-d>
</div-c>
<div-c>
<title>level ccccc head 12</title>
<df>cc 13 blah</df>
<div-e>
<title>cc level eeeee head 14</title>
<fss>ee blah 15</fss>
</div-e>
</div-c>
</div-b>
<div-b>
<title>level bbbbb head 16</title>
<div-c>
<title>level ccccc head 17</title>
<df>cc 18 blah</df>
<div-e>
<title>cc level eeeee head 19</title>
<fhy>ee blah 20</fhy>
</div-e>
</div-c>
</div-b>
</div-a>
</document>
Next, use the for-each-group structure to put an aa element around the a
elements.
<?xml version="1.0" encoding="UTF-8"?>
<!-- xsl:for-each-group is an XSLT 2.0 instruction, hence version="2.0" -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="document">
    <document>
      <!-- Each group starts at an a element and is wrapped in aa -->
      <xsl:for-each-group select="*" group-starting-with="a">
        <aa>
          <xsl:copy-of select="current-group()"/>
        </aa>
      </xsl:for-each-group>
    </document>
  </xsl:template>
  <xsl:template match="@*|node()" name="copy-current-node">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
with output of
<?xml version="1.0" encoding="UTF-8"?>
<document>
<aa>
<a>level aaaa head 1</a>
<b>level bbbb head 2</b>
<c>level ccccc head 3</c>
<dfg>cc 4 blah</dfg>
<e>level eeee head 5 </e>
<fhh>cc blah 6</fhh>
<c>level ccccc head 7</c>
<df>cc 8 blah<kkk>kkk within df within c</kkk>
</df>
<d>level dddd head 9</d>
<iuo>dd 10 blah</iuo>
<jtt>dd blah 11</jtt>
<c>level ccccc head 12</c>
<df>cc 13 blah</df>
<e>cc level eeeee head 14</e>
<fss>ee blah 15</fss>
<b>level bbbbb head 16</b>
<c>level ccccc head 17</c>
<df>cc 18 blah</df>
<e>cc level eeeee head 19</e>
<fhy>ee blah 20</fhy>
</aa>
</document>
Next, use the for-each-group structure to put a bb element around the b
elements, and emit aaa instead of aa.
<?xml version="1.0" encoding="UTF-8"?>
<!-- xsl:for-each-group is an XSLT 2.0 instruction, hence version="2.0" -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="document">
    <document>
      <xsl:apply-templates/>
    </document>
  </xsl:template>
  <xsl:template match="aa">
    <aaa>
      <!-- Each group starts at a b element and is wrapped in bb -->
      <xsl:for-each-group select="*" group-starting-with="b">
        <bb>
          <xsl:copy-of select="current-group()"/>
        </bb>
      </xsl:for-each-group>
    </aaa>
  </xsl:template>
  <xsl:template match="@*|node()" name="copy-current-node">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
with output of
<?xml version="1.0" encoding="UTF-8"?>
<document>
<aaa>
<bb>
<a>level aaaa head 1</a>
</bb>
<bb>
<b>level bbbb head 2</b>
<c>level ccccc head 3</c>
<dfg>cc 4 blah</dfg>
<e>level eeee head 5 </e>
<fhh>cc blah 6</fhh>
<c>level ccccc head 7</c>
<df>cc 8 blah<kkk>kkk within df within c</kkk>
</df>
<d>level dddd head 9</d>
<iuo>dd 10 blah</iuo>
<jtt>dd blah 11</jtt>
<c>level ccccc head 12</c>
<df>cc 13 blah</df>
<e>cc level eeeee head 14</e>
<fss>ee blah 15</fss>
</bb>
<bb>
<b>level bbbbb head 16</b>
<c>level ccccc head 17</c>
<df>cc 18 blah</df>
<e>cc level eeeee head 19</e>
<fhy>ee blah 20</fhy>
</bb>
</aaa>
</document>
Continue adding the levels in the same way, here with cc around the c
elements and bbb instead of bb:
<?xml version="1.0" encoding="UTF-8"?>
<!-- xsl:for-each-group is an XSLT 2.0 instruction, hence version="2.0" -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="document">
    <document>
      <xsl:apply-templates/>
    </document>
  </xsl:template>
  <xsl:template match="aaa">
    <aaa>
      <xsl:apply-templates/>
    </aaa>
  </xsl:template>
  <xsl:template match="bb">
    <bbb>
      <!-- Each group starts at a c element and is wrapped in cc -->
      <xsl:for-each-group select="*" group-starting-with="c">
        <cc>
          <xsl:copy-of select="current-group()"/>
        </cc>
      </xsl:for-each-group>
    </bbb>
  </xsl:template>
  <xsl:template match="@*|node()" name="copy-current-node">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
with output of
<?xml version="1.0" encoding="UTF-8"?>
<document>
<aaa>
<bbb>
<cc>
<a>level aaaa head 1</a>
</cc>
</bbb>
<bbb>
<cc>
<b>level bbbb head 2</b>
</cc>
<cc>
<c>level ccccc head 3</c>
<dfg>cc 4 blah</dfg>
<e>level eeee head 5 </e>
<fhh>cc blah 6</fhh>
</cc>
<cc>
<c>level ccccc head 7</c>
<df>cc 8 blah<kkk>kkk within df within c</kkk>
</df>
<d>level dddd head 9</d>
<iuo>dd 10 blah</iuo>
<jtt>dd blah 11</jtt>
</cc>
<cc>
<c>level ccccc head 12</c>
<df>cc 13 blah</df>
<e>cc level eeeee head 14</e>
<fss>ee blah 15</fss>
</cc>
</bbb>
<bbb>
<cc>
<b>level bbbbb head 16</b>
</cc>
<cc>
<c>level ccccc head 17</c>
<df>cc 18 blah</df>
<e>cc level eeeee head 19</e>
<fhy>ee blah 20</fhy>
</cc>
</bbb>
</aaa>
</document>
....
Finally, at aaa, see if there is a descendant a. If so, that is the title for
this group; otherwise there is no title.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="document">
    <document>
      <xsl:apply-templates/>
    </document>
  </xsl:template>
  <xsl:template match="aaa">
    <div-a>
      <xsl:choose>
        <xsl:when test="descendant::a">
          <title>
            <xsl:apply-templates select="descendant::a"/>
          </title>
          <xsl:apply-templates select="child::*"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="child::*"/>
        </xsl:otherwise>
      </xsl:choose>
    </div-a>
  </xsl:template>
  <xsl:template match="bbb">
    <xsl:choose>
      <xsl:when test="descendant::b">
        <div-b>
          <title>
            <xsl:apply-templates select="descendant::b"/>
          </title>
          <xsl:apply-templates select="child::*"/>
        </div-b>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="child::*"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
  <xsl:template match="ccc">
    <xsl:choose>
      <xsl:when test="descendant::c">
        <div-c>
          <title>
            <xsl:apply-templates select="descendant::c"/>
          </title>
          <xsl:apply-templates select="descendant::*[preceding-sibling::c]"/>
          <xsl:apply-templates select="child::*"/>
        </div-c>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates
            select="child::*[not(c)]|descendant::*[preceding-sibling::c]"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
  <xsl:template match="ddd">
    <xsl:choose>
      <xsl:when test="descendant::d">
        <div-d>
          <title>
            <xsl:apply-templates select="descendant::d"/>
          </title>
          <xsl:apply-templates select="descendant::*[preceding-sibling::d]"/>
          <xsl:apply-templates select="child::*"/>
        </div-d>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates
            select="child::*|descendant::*[preceding-sibling::d]"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
  <xsl:template match="eee">
    <xsl:choose>
      <xsl:when test="descendant::e">
        <div-e>
          <title>
            <xsl:apply-templates select="descendant::e"/>
          </title>
          <xsl:apply-templates select="descendant::*[preceding-sibling::e]"/>
          <xsl:apply-templates select="child::*"/>
        </div-e>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates
            select="child::*|descendant::*[preceding-sibling::e]"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
  <xsl:template match="a|b|c|d|e|f|g|h|i">
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
We can then get rid of divs that have no title, thus solving the
missing div problem.
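One way to do that last clean-up, as a sketch only: an identity transform plus
a template that unwraps any div lacking a title child. It assumes the div
elements are all named div-something, as in the required output above, and
that the title is always a direct child of its div:

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A div-* element with no title child is dropped, but its content is
       kept and processed in place. The predicates give this template a
       higher default priority than the identity template. -->
  <xsl:template match="*[starts-with(local-name(), 'div-')][not(title)]">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```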
Jim Albright
704 843-0582
Wycliffe Bible Translators