That's what I needed: parse-xml-fragment(). This seems to work:
<xsl:template match="text:p" exclude-result-prefixes="#all">
  <!-- <xsl:variable name="unparsed" select="concat('<p>',
         string-join(//text(), ''), '</p>')"/>
       <xsl:variable name="parsed" select="saxon:parse($unparsed)"/>
       <xsl:copy-of select="$parsed" exclude-result-prefixes="#all"/>-->
  <xsl:if test="string-length(.) gt 0">
    <tei:p>
      <!-- copy-of keeps the parsed elements; .//text() restricts the
           join to the text inside this paragraph -->
      <xsl:copy-of
        select="parse-xml-fragment(string-join(.//text(), ''))"/>
    </tei:p>
  </xsl:if>
</xsl:template>
for most cases. I do have some horrible edge-cases though:
<text:p>a start-tag, with delimiters &lt; and &gt; is intended</text:p>
I should be able to pre-process the input text, though: find the angle
brackets that are surrounded by spaces and temporarily swap them out for
something else.
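
A rough sketch of that pre-processing step, assuming XSLT 3.0. The function name local:escape-stray-lt and the local namespace URI are made up for illustration, and the text and tei namespace URIs are the standard ODF and TEI ones rather than anything taken from my actual stylesheet. Only "<" needs handling, because a bare ">" is already legal in XML character data; a "<" followed by whitespace (or by the end of the string) cannot start a tag, so it is re-escaped before parsing:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="http://example.com/local"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="#all">

  <!-- Hypothetical helper: turn any "<" that is followed by whitespace
       or by the end of the string back into the entity reference, so
       the parser reads it as a literal character and not as markup. -->
  <xsl:function name="local:escape-stray-lt" as="xs:string">
    <xsl:param name="s" as="xs:string"/>
    <xsl:sequence select="replace($s, '&lt;(\s|$)', '&amp;lt;$1')"/>
  </xsl:function>

  <!-- Skip paragraphs that are empty or whitespace-only. -->
  <xsl:template match="text:p[normalize-space()]">
    <tei:p>
      <xsl:copy-of
        select="parse-xml-fragment(local:escape-stray-lt(string(.)))"/>
    </tei:p>
  </xsl:template>

</xsl:stylesheet>
```

This only covers the "angle bracket next to a space" case. A stray ampersand, or a "<" immediately followed by a letter that was not meant as markup, would still make parse-xml-fragment() fail.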
Thanks,
Martin
On 14-03-28 11:35 AM, Martin Honnen wrote:
Martin Holmes wrote:
I'm trying to process an ODS spreadsheet which has <text:p> nodes which
contain embedded mixed-content markup in escaped form:
<text:p>indicates the amount by which this zone has been rotated
clockwise, with respect to the normal orientation of the parent
&lt;gi&gt;surface&lt;/gi&gt; element as implied by the dimensions given
in the &lt;gi&gt;msDesc&lt;/gi&gt; element or by the coordinates of the
&lt;gi&gt;surface&lt;/gi&gt; itself. The orientation is expressed in arc
degrees.</text:p>
I need to turn this back into parsed XML for insertion into XML
documents. I'm using Saxon 9.4 with XSLT 2 (and I can use 3 if
necessary).
I tried
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="http://example.com"
    xmlns:tei="http://example.com/tei"
    version="3.0">
  <xsl:template match="text:p">
    <tei:p>
      <xsl:copy-of select="parse-xml-fragment(.)"/>
    </tei:p>
  </xsl:template>
</xsl:stylesheet>
with Saxon 9.5 PE and got
<?xml version="1.0" encoding="UTF-8"?><tei:p
xmlns:text="http://example.com" xmlns:tei="http://example.com/tei">indicates
the amount by which this zone has been rotated clockwise, with respect
to the normal orientation of the parent <gi>surface</gi>
element as implied by the dimensions given in the
<gi>msDesc</gi> element or by the coordinates of the <gi>surface</gi>
itself. The orientation is expressed in arc degrees.</tei:p>
That has XML elements rather than escaped markup, so it should do. You will
need to change the namespaces and maybe use exclude-result-prefixes.
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--