Re: [xsl] Problem retaining HTML tags during XSL transformation

2007-01-10 12:43:37
Abel Braaksma wrote:
Ambika(_dot_)Das(_at_)iflexsolutions(_dot_)com wrote:
Hi All,
I am facing some problem in retaining the HTML tags during XSL transformation.
Given below are the details.

May I summarize that as follows? The output is the input with &#10; replaced by <br /> elements and the whole content wrapped in double quotes (with some elements stripped, but not their content)?


Hi Ambika,

Since you appear to do text node matching, I thought to try something new (well, in XSLT 2.0 it is not so new, but I haven't seen it myself before in XSLT 1).

I call it "stringized template matching": you apply template rules to ordinary strings (the results of substring-before() etc.), which avoids complex recursive call-template calls. It is very easy to implement and it is applicable to almost all replace-string situations. There's one drawback: it only works with the extension function exslt:node-set(). However, only a few XSLT 1 processors have not implemented it. And there's a big advantage: the stylesheet becomes tremendously easier to read.
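If portability across processors matters, the node-set() call can be guarded with function-available(). The following is only a sketch under assumptions: the helper name "apply-to-string" and its "string" parameter are hypothetical, and the msxsl namespace is the one MSXML uses for its own node-set() function. Other processors may need further branches.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
   xmlns:exslt="http://exslt.org/common"
   xmlns:msxsl="urn:schemas-microsoft-com:xslt"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   exclude-result-prefixes="exslt msxsl">
   <xsl:output method="text"/>

   <!-- hypothetical helper: wraps a string in a result tree fragment and
        applies the text() template rules to it -->
   <xsl:template name="apply-to-string">
       <xsl:param name="string"/>
       <xsl:variable name="rtf">
           <xsl:value-of select="$string"/>
       </xsl:variable>
       <xsl:choose>
           <!-- EXSLT implementation, available in many XSLT 1 processors -->
           <xsl:when test="function-available('exslt:node-set')">
               <xsl:apply-templates select="exslt:node-set($rtf)/node()"/>
           </xsl:when>
           <!-- MSXML's own extension function -->
           <xsl:when test="function-available('msxsl:node-set')">
               <xsl:apply-templates select="msxsl:node-set($rtf)/node()"/>
           </xsl:when>
           <xsl:otherwise>
               <xsl:message terminate="yes">No node-set() extension function available.</xsl:message>
           </xsl:otherwise>
       </xsl:choose>
   </xsl:template>
</xsl:stylesheet>
```

A text() rule would then call this helper with, for example, select="substring-before(., '&#10;')" as the parameter value instead of calling exslt:node-set() directly. The price is one call-template per piece, so it only pays off when more than one processor has to be supported.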

I know, I know, it was not your problem: you already have your replace-string function in place (replacing newlines with <br>). But, if I am right about your question (confirm, please), you can solve it very simply with the following solution, which produces your requested output. I took the liberty of adding an extra rule that removes whitespace-only nodes. I am not sure you want that.

It is easy enough to add matches for replacing double quotes, too. If it gets more complex, I advise using modes for the more generic matches, to keep the stylesheet manageable.
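As an illustration of such a double-quote rule, here is a minimal standalone sketch. It assumes that the goal is CSV-style escaping, where each double quote is doubled; that is my assumption and was not part of the original question. It also assumes a processor that supports exslt:node-set().

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
   xmlns:exslt="http://exslt.org/common"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   exclude-result-prefixes="exslt">
   <xsl:output method="text"/>

   <!-- matches text nodes containing a double quote; the explicit
        priority lets it win over other text() rules with predicates -->
   <xsl:template match="text()[contains(., '&quot;')]" priority="1">
       <xsl:variable name="before">
           <xsl:value-of select="substring-before(., '&quot;')" />
       </xsl:variable>
       <xsl:variable name="after">
           <xsl:value-of select="substring-after(., '&quot;')" />
       </xsl:variable>
       <!-- the part before the first quote goes back through the text() rules -->
       <xsl:apply-templates select="exslt:node-set($before)/node()" />
       <!-- assumed CSV convention: one quote becomes two -->
       <xsl:text>""</xsl:text>
       <!-- the remainder may contain more quotes and matches this rule again -->
       <xsl:apply-templates select="exslt:node-set($after)/node()" />
   </xsl:template>

</xsl:stylesheet>
```

Run on its own against <a>say "hi"</a>, this should give say ""hi"", because text without quotes falls through to the built-in text rule. When the template is combined with rules that call normalize-space() on each piece, whitespace directly next to a quote gets trimmed, so check whether that is acceptable for your data.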

Happy coding!

Cheers,
-- Abel Braaksma
  http://www.nuntia.nl

On the following XML:
<elem id="11" date="10 Jan 2007" time="16:55">
   Non Title
   <title>Title Here</title>
   <text>
       <p>without newline</p>
       <PRE> (Start of pre tag)
           <p>
               Within P
           </p>
           After P tag
           (End of pre tag)
       </PRE>
   </text>
</elem>

It creates the following output (the second line is a single line):

Elem id, New details
11,"<br>Non Title<br><title>Title Here</title><p>without newline</p><PRE>(Start of pre tag)<br><p><br>Within P<br></p><br>After P tag<br>(End of pre tag)<br></PRE>"



Here's the XSLT that does it all:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
   xmlns:exslt="http://exslt.org/common"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" indent="yes"/>
   <xsl:template match="/">
       <xsl:text>Elem id, New details&#10;</xsl:text>
       <xsl:apply-templates />
   </xsl:template>
<xsl:template match="elem">
       <xsl:value-of select="@id"/>
       <xsl:text>,"</xsl:text>
       <xsl:apply-templates select="node()" />
       <xsl:text>"&#10;</xsl:text>
   </xsl:template>
<!-- general rule: keep these nodes -->
   <xsl:template match="*">
       <xsl:value-of select="concat('&lt;', name(), '&gt;')"/>
       <xsl:apply-templates select="node()" />
       <xsl:value-of select="concat('&lt;/', name(), '&gt;')"/>
   </xsl:template>
<!-- specific rule: 'throw away' these nodes, but keep content -->
   <xsl:template match="text">
       <xsl:apply-templates select="node()" />
   </xsl:template>
<!-- throws away whitespace only text nodes -->
   <xsl:template match="text()[normalize-space(.) = '']" />
<!-- matches text nodes with newline -->
   <xsl:template match="text()[contains(., '&#10;')][not(normalize-space(.) = '')]">
       <xsl:variable name="before">
           <xsl:value-of select="substring-before(., '&#10;')" />
       </xsl:variable>
       <xsl:variable name="after">
           <xsl:value-of select="substring-after(., '&#10;')" />
       </xsl:variable>
       <xsl:apply-templates select="exslt:node-set($before)/node()" />
       <xsl:text>&lt;br&gt;</xsl:text>
       <xsl:apply-templates select="exslt:node-set($after)/node()" />
   </xsl:template>
<!-- matches text nodes without newline -->
   <xsl:template match="text()[not(contains(., '&#10;'))]">
       <xsl:value-of select="normalize-space(.)"/>
   </xsl:template>

</xsl:stylesheet>
