On 29/01/2010 11:02, bw wrote:
Hello,
I have a big xml feed out of my content management system that
includes wysiwyg html tags inside CDATA tags.
I am looking for a way to remove the CDATA and only get the text.
CURRENT:
<add>
<doc>
<some_title>My title</some_title>
<content><![CDATA[
<p>The<strong>keyword</strong> is nice to have but is not needed to
include in a solr feed</p><p><table cellspacing="2" cellpadding="2"
border="1" width="100%"><tbody><tr><td>Étape 1 :</td></tr>
]]></content>
</doc>
<doc>
....
</doc>
</add>
WANTED:
<add>
<doc>
<some_title>My title</some_title>
<content>The keyword is nice to have but is not needed to
include in a solr feed</content>
</doc>
<doc>
....
</doc>
</add>
Cheers
XSLT has no access to the CDATA markers (or any other tags) in the input
file; they are all resolved by the XML parser before XSLT sees the input.
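So, as far as XSLT is concerned, your input is equivalent to
<content>
&lt;p>The&lt;strong>keyword&lt;/strong> is nice to have but is not needed to
include in a solr feed&lt;/p>&lt;p>&lt;table cellspacing="2" cellpadding="2"
border="1" width="100%">&lt;tbody>&lt;tr>&lt;td>Étape 1 :&lt;/td>&lt;/tr>
</content>
that is, a content element holding a single string of text, not child elements.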
The best way to get from such a string to an XML element tree is to
parse the string. Saxon and some other systems have extensions to do that:
<xsl:copy-of select="saxon:parse(content)"/>
for example.
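In a complete stylesheet that might look like the sketch below. It assumes
a Saxon release that provides saxon:parse in the http://saxon.sf.net/
namespace, and that the embedded markup is well-formed once it is wrapped
in a single root element. The wrapper element name is made up for
illustration. The sample above leaves <p> and <table> unclosed, so it would
need tidying before the parse succeeds.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Identity template: copy add, doc, some_title, ... unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="content">
    <!-- Hypothetical wrapper element: gives the fragment a single root. -->
    <xsl:variable name="parsed"
        select="saxon:parse(concat('&lt;wrapper&gt;', ., '&lt;/wrapper&gt;'))"/>
    <content>
      <!-- Keep only the text nodes, joined with spaces so that
           "The<strong>keyword</strong>" becomes "The keyword". -->
      <xsl:value-of
          select="normalize-space(string-join($parsed//text(), ' '))"/>
    </content>
  </xsl:template>

</xsl:stylesheet>
```

Using xsl:copy-of on $parsed/wrapper/node() instead of the xsl:value-of
would keep the parsed elements rather than only their text. In XSLT 3.0 the
standard functions parse-xml() and parse-xml-fragment() cover the same
ground without an extension.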
Otherwise, as a deprecated and non-portable alternative, you may be able
to get away with
<xsl:value-of disable-output-escaping="yes" select="content"/>
which doesn't create the element nodes, but just makes them appear in the
serialised result.
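As a rough sketch of that second approach (XSLT 1.0, assuming the processor
serialises the result itself and honours disable-output-escaping, which it
is allowed to ignore):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="content">
    <content>
      <!-- The string is written out unescaped, so its "<p>" etc. show up
           as markup in the output file only; no nodes are created. -->
      <xsl:value-of disable-output-escaping="yes" select="."/>
    </content>
  </xsl:template>

</xsl:stylesheet>
```

This keeps the HTML tags rather than stripping them, so getting down to the
plain text would still need a second pass over the serialised output, and
the result is only well-formed if the embedded markup is.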
david