
[xsl] grouping xhtml title with first sibling

2011-01-18 05:55:36
Hi All,

This is a grouping problem. I'm not sure I'm heading in the right direction, so any help would be welcome.

My input file is XHTML and contains title elements h1, h2, h3.
(Let's say they are all directly under <body>, or more generally that they are all siblings.)

I'd like to group each of them with its first following sibling element, with these restrictions:
- if the first following sibling is itself a title element (h1, h2 or h3), then continue and group with the next sibling as well;
- if that next sibling is a title of a higher level (h1 > h2 > h3), then stop grouping and start a new group from this sibling;
- if the next sibling has @class='foo', then don't perform the grouping (I'd actually like to be able to perform any XPath test here).

This is my unit-test sample:
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>unit-test</title>
</head>
<body>
<h1>title1</h1>
<p>para1</p>
<hr/>
<div><img src="img1.jpg" alt=""/></div>
<table><tr><td>table1</td></tr></table>
<h2>title2</h2>
<p>para2</p>
<h3>title3</h3>
<p>para3</p>
<p>para4</p>
<h2>title4</h2>
<h3>title5</h3>
<p>para5</p>
<table><tr><td>table2</td></tr></table>
<h2>title6</h2>
<h1>title7</h1>
<p>para6</p>
<p>para7</p>
<h2>title8</h2>
<p class="foo">para8</p>
<p>para9</p>
<h1>title9</h1>
<h2>title10</h2>
<h3>title11</h3>
<p>para10</p>
</body>
</html>

Desired output is:
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>unit-test</title>
</head>
<body>
<div class="group">
<h1>title1</h1>
<p>para1</p>
</div>
<hr/>
<div><img src="img1.jpg" alt=""/></div>
<table><tr><td>table1</td></tr></table>
<div class="group">
<h2>title2</h2>
<p>para2</p>
</div>
<div class="group">
<h3>title3</h3>
<p>para3</p>
</div>
<p>para4</p>
<div class="group">
<h2>title4</h2>
<h3>title5</h3>
<p>para5</p>
</div>
<table><tr><td>table2</td></tr></table>
<h2>title6</h2>
<div class="group">
<h1>title7</h1>
<p>para6</p>
</div>
<p>para7</p>
<h2>title8</h2>
<p class="foo">para8</p>
<p>para9</p>
<div class="group">
<h1>title9</h1>
<h2>title10</h2>
<h3>title11</h3>
<p>para10</p>
</div>
</body>
</html>

I'd actually like this to work with any of h1, h2, ..., h6.
For this purpose I gave my XSLT a parameter:
<xsl:param name="elements" select="'h1,h2,h3'"/>

That means I then need an eval() function (I'm using saxon:evaluate with Saxon 9 for this).
If I remember correctly, I tried group-adjacent but it didn't work, so I moved to another method.

This is my XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:saxon="http://saxon.sf.net/" xpath-default-namespace="http://www.w3.org/1999/xhtml" xmlns:h="http://www.w3.org/1999/xhtml">
<xsl:output method="xml" indent="yes"/>

<xsl:param name="debug" select="'no'"/>
<xsl:param name="verbose" select="'yes'"/>
<xsl:param name="elements" select="'h1,h2,h3'"/>

<xsl:variable name="direct-concerned">
<xsl:for-each select="tokenize($elements,',')">
<xsl:text>self::</xsl:text>
<xsl:value-of select="."/>
<xsl:if test="not(position()=last())">
<xsl:text> or </xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:variable>

<xsl:variable name="not-following-sibling">
<xsl:text>not(</xsl:text>
<xsl:for-each select="tokenize($elements,',')">
<xsl:text>name(following-sibling::*[1])='</xsl:text>
<xsl:value-of select="."/>
<xsl:text>'</xsl:text>
<xsl:if test="not(position()=last())">
<xsl:text> or </xsl:text>
</xsl:if>
</xsl:for-each>
<xsl:text>)</xsl:text>
</xsl:variable>

<xsl:variable name="concerned" select="concat( $direct-concerned, ' and ', $not-following-sibling )"/>

<xsl:variable name="uncopy">
<xsl:for-each select="tokenize($elements,',')">
<xsl:text>preceding-sibling::*[1][self::</xsl:text>
<xsl:value-of select="."/>
<xsl:text>]</xsl:text>
<xsl:if test="not(position()=last())">
<xsl:text> or </xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:variable>

<xsl:template match="/">
<xsl:if test="$verbose='yes'">
<xsl:message>direct-concerned test=<xsl:value-of select="$direct-concerned"/></xsl:message>
<xsl:message>concerned test=<xsl:value-of select="$concerned"/></xsl:message>
<xsl:message>uncopy test=<xsl:value-of select="$uncopy"/></xsl:message>
</xsl:if>
<xsl:if test="$debug!='yes'">
<xsl:apply-templates/>
</xsl:if>
</xsl:template>

<xsl:template match="* | node() | @*" >
<xsl:param name="copy" select="false()"/>
<xsl:choose>
<xsl:when test="$copy">
<xsl:copy>
<xsl:apply-templates select="* | node() | @*" />
</xsl:copy>
</xsl:when>
<xsl:when test="saxon:evaluate($uncopy)"/>
<xsl:when test="saxon:evaluate($direct-concerned)">
<xsl:element name="div" namespace="http://www.w3.org/1999/xhtml">
<xsl:attribute name="class">group</xsl:attribute>
<xsl:apply-templates select="self::* | following-sibling::*[1]">
<xsl:with-param name="copy" select="true()"/>
</xsl:apply-templates>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:copy>
<xsl:apply-templates select="* | node() | @*" />
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

</xsl:stylesheet>

I don't get a good result: some paragraphs disappear, and the grouping doesn't follow the rules above exactly. Before I continue debugging this, I'd like to know whether there is another way to look at the problem.
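
One other way to look at it, as an untested sketch rather than a definitive answer: instead of building XPath strings and evaluating them with saxon:evaluate, compare local-name() against the tokenized $elements parameter and let a few stylesheet functions walk the siblings. The f: prefix, its namespace URI and the function names f:level, f:last and f:target are made up for illustration. The @class='foo' test is hard-coded and stands in for any XPath test. The sketch assumes that a run of titles continues only while each title is strictly deeper than the previous one, which is my reading of the rules above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:x-example:functions"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs f">

  <xsl:output method="xml" indent="yes"/>

  <xsl:param name="elements" select="'h1,h2,h3'"/>
  <xsl:variable name="names" select="tokenize($elements, ',')"/>

  <!-- Level of an element: its position in $elements (1, 2, 3, ...),
       or 0 if it is not a title or if $e is empty.
       Names are compared by local-name(), so the namespace is ignored. -->
  <xsl:function name="f:level" as="xs:integer">
    <xsl:param name="e" as="element()?"/>
    <xsl:sequence select="(index-of($names, local-name($e)), 0)[1]"/>
  </xsl:function>

  <!-- Last title of the run of strictly deepening titles that passes through $h. -->
  <xsl:function name="f:last" as="element()">
    <xsl:param name="h" as="element()"/>
    <xsl:variable name="next" select="$h/following-sibling::*[1]"/>
    <xsl:sequence select="if (f:level($next) gt f:level($h))
                          then f:last($next) else $h"/>
  </xsl:function>

  <!-- Element to be grouped with the run containing title $h;
       empty if the run is followed by another title, by an excluded
       element (hard-coded here as @class = 'foo'), or by nothing. -->
  <xsl:function name="f:target" as="element()?">
    <xsl:param name="h" as="element()"/>
    <xsl:sequence select="f:last($h)/following-sibling::*[1]
                            [f:level(.) eq 0][not(@class = 'foo')]"/>
  </xsl:function>

  <!-- Identity template: copy everything that is not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A title whose run ends with a groupable element. -->
  <xsl:template match="*[f:level(.) gt 0][exists(f:target(.))]">
    <xsl:variable name="prev" select="preceding-sibling::*[1]"/>
    <xsl:variable name="target" select="f:target(.)"/>
    <!-- Only the first title of a run emits the wrapper; the deeper
         titles that follow it are copied from inside the wrapper. -->
    <xsl:if test="not(f:level($prev) gt 0 and f:level($prev) lt f:level(.))">
      <div class="group">
        <xsl:copy-of select="., following-sibling::*[. &lt;&lt; $target], $target"/>
      </div>
    </xsl:if>
  </xsl:template>

  <!-- A non-title, non-excluded element directly after a title has
       already been copied into that title's group: emit nothing. -->
  <xsl:template match="*[f:level(.) eq 0][not(@class = 'foo')]
                        [preceding-sibling::*[1][f:level(.) gt 0]]"/>

</xsl:stylesheet>
```

Traced by hand against the unit-test sample, this should give the groups listed in the desired output, but I have not run it. Whitespace text nodes between grouped elements stay outside the div, and passing 'h1,h2,h3,h4,h5,h6' as $elements should extend it to all six levels.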

Thanks in advance for your insight,

Matthieu.

PS: To tell you everything about this project: ideally I would like the list of title elements concerned to be configurable somewhere, for example in a node-set variable like this:
<xsl:variable>
<title match="h1[@class='bar']" level="1"/>
<title match="h1[span]" level="1"/>
<title match="h2" level="2"/>
<title match="h3[not(@class)]" level="3"/>
</xsl:variable>
But I know, it's a greedy request. Maybe one day.
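
A possible approximation of this without dynamic evaluation, offered only as a sketch: write each rule as a template in a dedicated mode, so that the match patterns themselves carry the tests and the template body returns the level. The mode name title-level, the f: prefix with its URI, and the function f:level are hypothetical names. The two h1 rules are merged into one pattern so that an h1 matching both does not trigger an ambiguous-match warning.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:x-example:functions"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- One template per title rule; the body is the level. -->
  <xsl:template match="h1[@class = 'bar' or span]" mode="title-level" as="xs:integer">
    <xsl:sequence select="1"/>
  </xsl:template>
  <xsl:template match="h2" mode="title-level" as="xs:integer">
    <xsl:sequence select="2"/>
  </xsl:template>
  <xsl:template match="h3[not(@class)]" mode="title-level" as="xs:integer">
    <xsl:sequence select="3"/>
  </xsl:template>

  <!-- Fallback (lowest default priority): any other element is not a title. -->
  <xsl:template match="*" mode="title-level" as="xs:integer">
    <xsl:sequence select="0"/>
  </xsl:template>

  <!-- Level of $e according to the rules above; 0 if $e is empty. -->
  <xsl:function name="f:level" as="xs:integer">
    <xsl:param name="e" as="element()?"/>
    <xsl:variable name="found" as="xs:integer*">
      <xsl:apply-templates select="$e" mode="title-level"/>
    </xsl:variable>
    <xsl:sequence select="($found, 0)[1]"/>
  </xsl:function>

  <!-- Demo only: print each child of body with its computed level. -->
  <xsl:template match="/">
    <xsl:for-each select="html/body/*">
      <xsl:value-of select="concat(name(), ' ', f:level(.), '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The rules then live in the stylesheet instead of in a variable, but they could be kept in a separate module and swapped with xsl:include or xsl:import, and a function like f:level could drive whatever grouping logic is used.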



--
Matthieu Ricaud
IGS-CP
Service Livre numérique



