xsl-list

Re: [xsl] Move elements to preceding parent

2009-06-15 09:00:16
Hi Martin,
Thank you for this. It looks very elegant.
Can you please explain the idea of the line:
 <xsl:template match="p[preceding-sibling::p[1][span[@class ne 'chapter']
and not(matches(span[@class ne 'chapter'][last()], '[.?&quot;!]$'))]]"/>

Does it remove a p whose preceding sibling p has no ending
character at the end of its last span?
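
For reference, here is the same empty template placed in a minimal stylesheet, with the match pattern broken out in comments. The comments reflect one reading of the pattern and are not taken from Martin's post; the identity template is included only so that the sketch stands on its own.

```xml
<xsl:stylesheet
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xpath-default-namespace="http://www.w3.org/1999/xhtml"
 version="2.0">

 <!-- Identity template: copy everything that no other template handles. -->
 <xsl:template match="@* | node()">
   <xsl:copy>
     <xsl:apply-templates select="@* | node()"/>
   </xsl:copy>
 </xsl:template>

 <!-- Reading of the pattern below, piece by piece:

      p[ preceding-sibling::p[1] [ ... ] ]
         a p whose immediately preceding sibling p satisfies [ ... ]

      span[@class ne 'chapter']
         that preceding p has at least one span whose class is not 'chapter'

      not(matches(span[@class ne 'chapter'][last()], '[.?&quot;!]$'))
         and the text of the last such span does not end in . ? " or !

      The template body is empty, so a p matched here produces no output. -->
 <xsl:template match="p[preceding-sibling::p[1][span[@class ne 'chapter']
   and not(matches(span[@class ne 'chapter'][last()], '[.?&quot;!]$'))]]"/>

</xsl:stylesheet>
```

On this reading, the empty template is the counterpart of the template that pulls following-sibling::p[1]/node() into the unfinished p: without it, the content of the next p would appear twice, once merged and once in its original place.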


I tried it with a more complete example like the following:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
   <meta http-equiv="Content-Type" content="application/xhtml+xml;
charset=utf-8"/>
   <title/>
   <link href="test.css" rel="stylesheet" type="text/css"/>
</head>
<body>
   <p dir="rtl">
      <span class="chapter">line1</span>
   </p>
   <p dir="rtl">&nbsp;&nbsp;<br />
   <span class="regular">line3.</span>
   <span class="italic">line4</span>
   <span class="regular">line5."</span>
   </p>
   <p dir="rtl">&nbsp;&nbsp;<br />
   <span class="regular">line6.</span>
   <br />
   <span class="regular">line7</span>
 </p>
 <p dir="rtl">&nbsp;&nbsp;<br />
   <span class="regular">line8.</span>
   <span class="regular">line9.</span>
 </p>
</body>
</html>


The output was:

<?xml version="1.0" encoding="UTF-8"?><html
xmlns="http://www.w3.org/1999/xhtml"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="en"
version="-//W3C//DTD XHTML 1.1//EN">
   <head profile="">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
      <title></title>
      <link href="test.css" rel="stylesheet" type="text/css"
xml:space="preserve" />
   </head>
   <body xml:space="preserve">
      <p dir="rtl" xml:space="preserve">
         <span class="chapter" xml:space="preserve">line1</span>

      </p>
      <p dir="rtl" xml:space="preserve">  <br xml:space="preserve" />
         <span class="regular" xml:space="preserve">line3.</span>
         <span class="italic" xml:space="preserve">line4</span>
         <span class="regular" xml:space="preserve">line5."</span>

      </p>
      <p dir="rtl" xml:space="preserve">  <br xml:space="preserve" />
         <span class="regular" xml:space="preserve">line6.</span>
         <br xml:space="preserve" />
         <span class="regular" xml:space="preserve">line7</span>
           <br xml:space="preserve" />
         <span class="regular" xml:space="preserve">line8.</span>
         <span class="regular" xml:space="preserve">line9.</span>

      </p>
   </body>
</html>


How can I remove the following?
1. The extra xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" and
version="-//W3C//DTD XHTML 1.1//EN" on the html element.
2. The extra profile="" on the head element.
3. The extra xml:space="preserve" on the p, span and br elements.
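
One possible direction, given here only as an untested sketch: it assumes the extra attributes and the xmlns:xsi declaration are defaults supplied by the XHTML 1.1 DTD when the input is parsed, which nothing in this thread confirms. Under that assumption, empty templates can suppress the attributes, and copy-namespaces="no" on xsl:copy can drop namespace nodes that an element does not use itself.

```xml
<xsl:stylesheet
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xpath-default-namespace="http://www.w3.org/1999/xhtml"
 version="2.0">

 <xsl:output method="xhtml"/>

 <!-- Copy each element without the namespace nodes it does not use itself;
      this is intended to drop the xmlns:xsi declaration. -->
 <xsl:template match="*">
   <xsl:copy copy-namespaces="no">
     <xsl:apply-templates select="@* | node()"/>
   </xsl:copy>
 </xsl:template>

 <!-- Copy attributes, text, comments and processing instructions as they are. -->
 <xsl:template match="@* | text() | comment() | processing-instruction()">
   <xsl:copy/>
 </xsl:template>

 <!-- Empty template: suppress the three unwanted attributes.
      Note that @xml:space is removed everywhere, including any occurrence
      written explicitly in the source document. -->
 <xsl:template match="html/@version | head/@profile | @xml:space"/>

</xsl:stylesheet>
```

This sketch is standalone and does not include the paragraph merging. To combine it with the stylesheet quoted below, the xsl:copy in the merging template would presumably need copy-namespaces="no" as well, since otherwise the copied p could carry the xmlns:xsi namespace node back into the output.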

Thanks, Viente

On Sun, Jun 14, 2009 at 6:50 PM, Martin 
Honnen<Martin(_dot_)Honnen(_at_)gmx(_dot_)de> wrote:
Israel Viente wrote:

My input is something like the following:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
  <p dir="rtl">
     <span class="chapter">line1</span>
  </p>
  <p dir="rtl">&nbsp;&nbsp;<br />
  <span class="regular">line3.</span>
  <span class="italic">line4</span>
  <span class="regular">line5."</span>
  </p>
  <p dir="rtl">&nbsp;&nbsp;<br />
  <span class="regular">line6.</span>
  <br />
  <span class="regular">line7</span>
 </p>
 <p dir="rtl">&nbsp;&nbsp;<br />
  <span class="regular">line8.</span>
  <span class="regular">line9.</span>
 </p>
</body>
</html>


The result output should be:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
  <p dir="rtl">
     <span class="chapter">line1</span>
  </p>
  <p dir="rtl">&nbsp;&nbsp;<br />
         <span class="regular">line3.</span>
         <span class="italic">line4</span>
         <span class="regular">line5."</span>
  </p>
  <p dir="rtl">&nbsp;&nbsp;<br />
         <span class="regular">line6.</span>
         <br />
         <span class="regular">line7</span>
         <span class="regular">line8.</span>
         <span class="regular">line9.</span>
  </p>
</body>
</html>

For every p that contains span elements whose class is not 'chapter',
check whether the text of the last such span ends with one of the
characters . ? " ! (a paragraph-ending character).
If it does, copy the p to the output as is.
Otherwise, move the span elements from the next p into the current one
and remove the next p completely.

Here is an attempt at solving that with XSLT 2.0:

<xsl:stylesheet
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xpath-default-namespace="http://www.w3.org/1999/xhtml"
 version="2.0">

 <xsl:output method="xhtml"/>

 <xsl:template match="@* | node()">
   <xsl:copy>
     <xsl:apply-templates select="@* | node()"/>
   </xsl:copy>
 </xsl:template>

 <xsl:template match="p[span[@class ne 'chapter'] and
not(matches(span[@class ne 'chapter'][last()], '[.?&quot;!]$'))]">
   <xsl:copy>
     <xsl:apply-templates select="@* | node() |
following-sibling::p[1]/node()"/>
   </xsl:copy>
 </xsl:template>

 <xsl:template match="p[preceding-sibling::p[1][span[@class ne 'chapter']
and not(matches(span[@class ne 'chapter'][last()], '[.?&quot;!]$'))]]"/>

</xsl:stylesheet>

For the posted input using Saxon 9 it produces the described output but I
have not tested with other inputs.

--

       Martin Honnen
       http://msmvps.com/blogs/martin_honnen/

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


