xsl-list

[xsl] Turning escaped mixed content back to XML

2014-03-28 13:13:02
Hi there,

I'm trying to process an ODS spreadsheet whose <text:p> elements contain embedded mixed-content markup in escaped form:

<text:p>indicates the amount by which this zone has been rotated clockwise, with respect to the normal orientation of the parent &lt;gi&gt;surface&lt;/gi&gt; element as implied by the dimensions given in the &lt;gi&gt;msDesc&lt;/gi&gt; element or by the coordinates of the &lt;gi&gt;surface&lt;/gi&gt; itself. The orientation is expressed in arc degrees.</text:p>

I need to turn this back into parsed XML for insertion into XML documents. I'm using Saxon 9.4 with XSLT 2.0 (and I can use 3.0 if necessary). I've tried a variety of approaches that run the content through saxon:serialize (with and without disable-output-escaping) and feed the result into saxon:parse, but the output always ends up escaped, just like the input. Does anyone have experience of doing this?

Here's the sort of thing I've tried:

[...]
<xsl:output name="outputSerializedTEI" method="xml" indent="no" encoding="UTF-8" exclude-result-prefixes="#all" omit-xml-declaration="yes" />

[...]

    <xsl:template match="text:p" exclude-result-prefixes="#all">
        <xsl:variable name="unparsed">
            <xsl:copy-of select="*|text()"/>
        </xsl:variable>
        <xsl:variable name="parsed" select="saxon:parse(saxon:serialize($unparsed, 'outputSerializedTEI'))"/>
        <tei:p>
            <xsl:copy-of select="$parsed"/>
        </tei:p>
    </xsl:template>
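One variation that may be worth trying, as an untested sketch rather than a confirmed fix: once the ODS file has been parsed, the string value of the text node should already contain literal < and > characters, so serializing it presumably just escapes them again before saxon:parse sees them. The sketch below skips the serialize step and hands the string value, wrapped in a dummy element, straight to saxon:parse. The wrapper element name, the two namespace URIs, and the choice to put the parsed elements in the TEI namespace are assumptions, not anything taken from the real stylesheet.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="#all">

    <xsl:template match="text:p">
        <!-- Assumption: string(.) holds real angle brackets, e.g. <gi>surface</gi>.
             The hypothetical "wrapper" element supplies the single root that
             saxon:parse needs, and its default namespace (assumed to be TEI)
             applies to the parsed child elements. -->
        <xsl:variable name="wrapped"
            select="concat('&lt;wrapper xmlns=&quot;http://www.tei-c.org/ns/1.0&quot;&gt;',
                           string(.),
                           '&lt;/wrapper&gt;')"/>

        <!-- saxon:parse takes a string of well-formed XML and returns a document node. -->
        <xsl:variable name="parsed" select="saxon:parse($wrapped)"/>

        <tei:p>
            <!-- Copy the wrapper's children, not the wrapper itself. -->
            <xsl:copy-of select="$parsed/*/node()"/>
        </tei:p>
    </xsl:template>

</xsl:stylesheet>
```

Two caveats with this sketch: string(.) flattens any real child elements of text:p (spans and the like), and a cell containing a stray & or < that is not markup would make the parse fail. With an XSLT 3.0 processor, the standard parse-xml-fragment() function might make the wrapper unnecessary, if the Saxon release in use supports it.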

Cheers,
Martin


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--