xsl-list

Re: [xsl] remove tags + CDATA tag out of big xml file

2010-02-01 08:52:31
Hi Michael,

This is exactly why I want to remove it ;-). I was even thinking about
some fancy Perl script to remove it now.
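
For what it's worth, here is a rough sketch of what such a script could
look like, written in Python rather than Perl and using only the standard
library. It assumes the feed fits in memory, that the fields to clean are
named title and content as in the quoted sample, and that the CDATA
payload is ordinary HTML. The names strip_markup and clean_feed and the
<doc> wrapper element are made up for illustration and are not taken from
the real feed.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects character data and drops every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(fragment):
    """Return the text of an HTML fragment with the tags removed (hypothetical helper)."""
    parser = _TextOnly()
    parser.feed(fragment)
    parser.close()  # flushes any text still buffered by the parser
    # Collapse the whitespace left behind by removed tags and line breaks.
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


def clean_feed(xml_text, fields=("title", "content")):
    """Replace the markup-laden text of the named elements with plain text.

    The field names are an assumption based on the quoted sample.
    """
    # The XML parser already hands CDATA content over as ordinary text,
    # so the CDATA wrapper disappears on its own when we serialize again.
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag in fields and elem.text:
            elem.text = strip_markup(elem.text)
    return ET.tostring(root, encoding="unicode")


# Made-up sample in the shape of the feed; <doc> is a placeholder wrapper.
sample = """<doc><content><![CDATA[
<p>The <strong>keyword</strong> is nice to have</p>
]]></content></doc>"""

print(clean_feed(sample))
```

Note that text from adjacent elements is joined without a separator, so
"<p>a</p><p>b</p>" comes out as "ab". For a really big file something
streaming such as ElementTree's iterparse would probably be the better
starting point.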

On 29/01/2010, Michael Ludwig <milu71(_at_)gmx(_dot_)de> wrote:
bw wrote on 29.01.2010 at 12:02:10 (+0100):
Hello,

I have a big xml feed out of my content management system that
includes wysiwyg html tags inside CDATA tags.

I am looking for a way to remove the CDATA and only get the text.

         <content><![CDATA[
<p>The <strong>keyword</strong> is nice to have but is not needed to
include in a solr feed</p> ...

Looks like this feed is for Solr (an indexer), which won't do anything
useful with the markup anyway. Someone has defined <title> and <content>
as fields for the indexer but has forgotten to strip the markup from the
source. That source markup in CDATA has no purpose in a feed for Solr
and should not have been included in the first place.

--
Michael Ludwig

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




-- 
[Bb](astia{2}n)?\s?[Ww](ak{2}ie)?$

