I spoke too soon. Passing this:
contains a single TEI-conformant document, comprising a TEI header and a
text, either in isolation or as part of a <gi>teiCorpus</gi>
element.
into parse-xml-fragment() gets this fatal error:
FODC0006: First argument to parse-xml-fragment() is not a well-formed
and namespace-well-formed XML fragment. XML parser reported: I/O error
reported by XML parser processing
file:/home/mholmes/Documents/tei/council/translation/new_translations_into_specs.xsl:
404 Not Found for: http://www.saxonica.com/parse-xml-fragment/actual.xml
This is with Saxon 9.5.1.3 PE.
I must be missing something here. The default namespace is tei, the
xpath-default-namespace is tei, and all the other namespaces have
defined prefixes (tei has tei: too).
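While the cause is still unclear, one defensive option is to trap the failure so a single bad cell does not abort the whole run. The following is only a sketch: it assumes XSLT 3.0 is enabled so that xsl:try and xsl:catch are available, it reuses the placeholder namespace URIs from earlier in this thread, and it does not explain or cure the 404 itself. It simply falls back to the unparsed text when FODC0006 is raised.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:err="http://www.w3.org/2005/xqt-errors"
    xmlns:text="http://example.com"
    xmlns:tei="http://example.com/tei"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:template match="text:p">
    <tei:p>
      <xsl:try>
        <!-- Normal path: parse the cell's string value into nodes -->
        <xsl:copy-of select="parse-xml-fragment(string(.))"/>
        <!-- Fallback: report the offending cell and emit it as plain text -->
        <xsl:catch errors="err:FODC0006">
          <xsl:message>Could not parse: <xsl:value-of select="."/></xsl:message>
          <xsl:value-of select="."/>
        </xsl:catch>
      </xsl:try>
    </tei:p>
  </xsl:template>

</xsl:stylesheet>
```

The xsl:message output at least gives a list of the cells that need attention.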
Cheers,
Martin
On 14-03-28 12:09 PM, Martin Holmes wrote:
That's what I needed: parse-xml-fragment(). This seems to work:
<xsl:template match="text:p" exclude-result-prefixes="#all">
  <!-- Earlier attempt using the saxon:parse() extension function:
  <xsl:variable name="unparsed" select="concat('<p>',
  string-join(//text(), ''), '</p>')"/>
  <xsl:variable name="parsed" select="saxon:parse($unparsed)"/>
  <xsl:copy-of select="$parsed" exclude-result-prefixes="#all"/>-->
  <!-- Skip empty cells -->
  <xsl:if test="string-length(.) gt 0">
    <tei:p>
      <!-- string(.) is the text of this text:p only; copy-of keeps the
           parsed elements, where value-of would flatten them to text -->
      <xsl:copy-of select="parse-xml-fragment(string(.))"/>
    </tei:p>
  </xsl:if>
</xsl:template>
for most cases. I do have some horrible edge-cases though:
<text:p>a start-tag, with delimiters &lt; and &gt; is intended</text:p>
I should be able to pre-process the input text for angle brackets in the
context of spaces and swap them out for something else temporarily though.
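A minimal sketch of that pre-processing step follows. It makes several assumptions: the helper name local:escape-bare-brackets and its urn:example:local namespace are made up for illustration, the text: and tei: URIs are the placeholders used elsewhere in this thread, and only angle brackets adjacent to whitespace are treated as literal. Rather than swapping them for a temporary token, it re-escapes them so the parser turns them back into plain characters. It would wrongly escape a tag written with a space before its closing bracket, and it does nothing about bare ampersands.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="urn:example:local"
    xmlns:text="http://example.com"
    xmlns:tei="http://example.com/tei"
    exclude-result-prefixes="#all"
    version="3.0">

  <!-- Hypothetical helper: re-escape "<" followed by whitespace
       and ">" preceded by whitespace -->
  <xsl:function name="local:escape-bare-brackets" as="xs:string">
    <xsl:param name="s" as="xs:string"/>
    <xsl:sequence select="replace(replace($s, '&lt;(\s)', '&amp;lt;$1'),
                                  '(\s)&gt;', '$1&amp;gt;')"/>
  </xsl:function>

  <xsl:template match="text:p">
    <tei:p>
      <!-- Escape the stray brackets first, then parse what is left -->
      <xsl:copy-of
        select="parse-xml-fragment(local:escape-bare-brackets(string(.)))"/>
    </tei:p>
  </xsl:template>

</xsl:stylesheet>
```

With the edge case above, the helper should turn "delimiters < and > is intended" into a string in which both brackets are escaped again, so the fragment parses as plain text.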
Thanks,
Martin
On 14-03-28 11:35 AM, Martin Honnen wrote:
Martin Holmes wrote:
I'm trying to process an ODS spreadsheet which has <text:p> nodes which
contain embedded mixed-content markup in escaped form:
<text:p>indicates the amount by which this zone has been rotated
clockwise, with respect to the normal orientation of the parent
&lt;gi&gt;surface&lt;/gi&gt; element as implied by the dimensions given
in the &lt;gi&gt;msDesc&lt;/gi&gt; element or by the coordinates of the
&lt;gi&gt;surface&lt;/gi&gt; itself. The orientation is expressed in arc
degrees.</text:p>
I need to turn this back into parsed XML for insertion into XML
documents. I'm using Saxon 9.4 with XSLT 2 (and I can use 3 if
necessary).
I tried
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="http://example.com"
    xmlns:tei="http://example.com/tei"
    version="3.0">
  <xsl:template match="text:p">
    <tei:p>
      <xsl:copy-of select="parse-xml-fragment(.)"/>
    </tei:p>
  </xsl:template>
</xsl:stylesheet>
with Saxon 9.5 PE and got
<?xml version="1.0" encoding="UTF-8"?><tei:p
xmlns:text="http://example.com"
xmlns:tei="http://example.com/tei">indicates the amount by which this
zone has been rotated clockwise, with respect to the normal orientation
of the parent <gi>surface</gi> element as implied by the dimensions
given in the <gi>msDesc</gi> element or by the coordinates of the
<gi>surface</gi> itself. The orientation is expressed in arc
degrees.</tei:p>
That has XML elements rather than escaped markup, so it should do. You
will need to change the namespaces and maybe use exclude-result-prefixes.
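As a rough sketch of what that could look like, assuming the input uses the standard ODF text namespace and the output should be in the TEI namespace: the parsed elements such as gi come back in no namespace, so a second pass (here a mode I have called to-tei, a made-up name) rebuilds them in the TEI namespace. Comments and processing instructions inside the fragment are dropped by the built-in templates in this version.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:template match="text:p">
    <p>
      <!-- Parse the escaped markup, then push the result through to-tei -->
      <xsl:apply-templates select="parse-xml-fragment(string(.))/node()"
                           mode="to-tei"/>
    </p>
  </xsl:template>

  <!-- Rebuild each parsed element in the TEI namespace, keeping attributes -->
  <xsl:template match="*" mode="to-tei">
    <xsl:element name="{local-name()}"
                 namespace="http://www.tei-c.org/ns/1.0">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="to-tei"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Putting exclude-result-prefixes="#all" on xsl:stylesheet keeps the unused text: declaration out of the result.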
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--