What processor are you using? With Xalan, for the following XML:
<data>
<Field outputName="TEXT">
2010 &quot;We
respectfully Wish the health of the great leader
[yo'ndude] Comarade Big John Il
</Field>
</data>
By applying the following XSL:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" omit-xml-declaration="no"
indent="yes" cdata-section-elements="TEXT" />
<xsl:template match="Field">
<xsl:if test="contains ('TEXT', @OutputName)">
<xsl:element name="{@OutputName}">
<xsl:copy-of select="."/>
</xsl:element>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
I get this result:
<?xml version="1.0" encoding="UTF-8"?>
<Field outputName="TEXT">
2010 &quot;We
respectfully Wish the health of the great leader
[yo'ndude] Comarade Big John Il
</Field>
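Note that the attribute in the sample is spelled outputName, and attribute names are case-sensitive. When the attribute is not found, contains() is called with an empty second argument and returns true, so the test does not filter anything. For what it's worth, here is a minimal sketch of how the template could be written instead, assuming the attribute really is outputName as in the sample above. Using xsl:value-of in place of apply-templates is my assumption for getting the text out as a single text node; I have not verified that it changes how your processor splits the CDATA sections.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Text children of TEXT elements are serialized as CDATA sections -->
  <xsl:output method="xml" omit-xml-declaration="no"
              indent="yes" cdata-section-elements="TEXT"/>

  <!-- Match on the exact attribute name and value instead of contains() -->
  <xsl:template match="Field[@outputName = 'TEXT']">
    <xsl:element name="{@outputName}">
      <!-- value-of emits the string value of Field as one text node -->
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Also keep in mind that several adjacent CDATA sections are equivalent to a single one as far as an XML parser is concerned, so the split output is still well-formed and carries the same text.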
Regards,
--A
From: mylistaddress(_at_)canada(_dot_)com
Reply-To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: RE: [xsl] Multiple CDATA tags...again
Date: Mon, 09 May 2005 18:02:41 -0700 (PDT)
Hi,
Thanks for responding. I am pretty much ready to throw
myself off of a bridge...but I guess I can't complain
about learning on the job.
OK, here's the deal. I am sending XML requests via Java
1.4 to a library DB called STAR XML (made by Cuadra)
which sends back a very verbose XML response of a news
item. I have no control over the format of the output.
I was able to make sense out of it (thanks to your
responses) and transform it into a format more
acceptable to the Verity search indexing spider.
When the output from STAR XML is HTML, the < and > characters
are converted to &lt; and &gt; and so on. Oddly it
appears to also convert a quote as &quot; instead
of ". When I try to index the resulting XML
document without placing CDATA tags (not really a tag,
right?) around the content, the indexer fails.
The content also contains [ and ] and non-English text.
So, I added the cdata-section-elements declaration to
my xsl:output and this is when I encountered the
multiple CDATA tags. At first I suspected they appeared
wherever there is a line-break, but this does not
appear to be the case.
Here is a portion of the XML response from STAR XML:
<Field outputName="TEXT">
2010 &quot;We
respectfully Wish the health of the great leader
[yo'ndude] Comarade Big John Il
</Field>
Here is a portion of the XSL dealing with the TEXT
element:
<xsl:output method="xml" omit-xml-declaration="no"
indent="yes" cdata-section-elements="TEXT" />
<xsl:strip-space elements="*" />
...
<xsl:template match="Field">
<xsl:if test="contains ('TEXT', @OutputFieldName)">
<xsl:element name="{@OutputFieldName}">
<xsl:apply-templates/>
</xsl:element>
</xsl:if>
</xsl:template>
Resulting XML:
<TEXT>
<![CDATA[2010 "We
]]><![CDATA[ Respectfully Wish
Hea]]><![CDATA[lth of the great leader
]]><![CDATA[ [yo'ndude] Brother ]]><![CDATA[
Big John Il] ]]>
</TEXT>
As you can see, the CDATAs are appearing all over the
place. This is just a small clip. The actual doc has
dozens. Also notice how the quotes now appear as a plain "
(no more & before the quot;). Do I have to transform
them again? My literal [ and ] are intact.
I visited dpawson.co.uk and read up on the d-o-e
(disable-output-escaping) stuff, but am still stuck. Could
anyone recommend a book? The XSLT Cookbook? I borrowed the
O'Reilly XML Hacks (and noticed your name) but it is slim on XSL.
Thanks so much for any help.
JW