Hi Alex,
If I understand your question correctly, you want to transform an XML
document that contains a DOCTYPE declaration, but you don't want the
named entities to be replaced by the entity values? There are a few
things to consider:
1) XSLT does not preserve the DOCTYPE declaration, but you can work
around this by reading the source document with unparsed-text() and
adding the declaration to the result tree with disable-output-escaping,
or use a trickier (but more extensible) approach which involves
xsl:character-map. This at least gives you the DOCTYPE declaration back.
2) Your doctype declares entities for namespace names. These will be
automatically expanded by the XML parser in the correct locations.
If you want to shorten these declarations back to your named entity
references you'll be pushing it beyond the rules of what XSLT is
supposed to do: create well-formed XML with namespaces. It's usually a
bad idea and the trick to fix this is so cumbersome (maybe someone knows
a better way) that I strongly advise against it. The simplest approach I
can think of is to take the output XML and read it again, now with
unparsed-text(), and use a regular expression to replace the namespace
URIs with the entity references. Use text as the (second) output method.
This is a scenario that's not well supported by XSLT, simply because you
are trying to reverse something that has already been done by the XML
parser (before the document even gets to the XSLT processor).
3) From an XML point of view, there's no difference between your
document with and without the doctype and named entities, provided the
named entities are replaced with the entity values. If your document is
supposed to be machine-processed, then there really is no need to go to
great lengths trying to bypass the XML standards.
4) An alternative solution: again, use character-maps, but create the
XSLT based on your input (or if the doctype doesn't change, do it only
once). The character-maps will replace the characters in the output.
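
To make (4) a bit more concrete, here is a minimal sketch of the
character-map idea, not a tested solution for your data. Assumptions that
are mine: the private-use character U+E000 never occurs in your real
content, the map name "entity-refs" and the variable "wiki-uri" are
made-up names, and only the &wiki; entity on rdf:resource attributes is
handled. Namespace declarations are left alone.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <!-- Assumption: U+E000 is free to use as a placeholder. The serializer
       writes the replacement string as-is, without escaping the '&'. -->
  <xsl:character-map name="entity-refs">
    <xsl:output-character character="&#xE000;" string="&amp;wiki;"/>
  </xsl:character-map>

  <xsl:output method="xml" use-character-maps="entity-refs"/>

  <!-- The expanded value of the wiki entity, copied from the DOCTYPE -->
  <xsl:variable name="wiki-uri"
      select="'http://p13.itawiki.org/wiki/Special:URIResolver/'"/>

  <!-- Identity template: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Swap the expanded URI prefix for the placeholder character -->
  <xsl:template match="@rdf:resource[starts-with(., $wiki-uri)]">
    <xsl:attribute name="rdf:resource"
        select="concat('&#xE000;', substring-after(., $wiki-uri))"/>
  </xsl:template>

</xsl:stylesheet>
```

With this, an attribute value of
http://p13.itawiki.org/wiki/Special:URIResolver/BX should be serialized
as &wiki;BX. Two caveats: the output is only well-formed if the DOCTYPE
that declares the entity is restored as well, and if you add more
entities, test the longer prefixes first (the value of &property; starts
with the value of &wiki;).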
Solution (1) above can be found with an example from a couple of years
back on this list, but I can't remember when exactly. I'm not entirely
sure whether solution (4) works for namespace declarations, because
technically, they aren't attributes. But I think they ought to be
replaced as well.
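
Since I can't point you to the old post, here is a rough sketch of what
solution (1) could look like. It is not the example from the archive.
It assumes that the source document has a URI the processor can read
again (document-uri(/) returns the empty sequence otherwise), that the
file is UTF-8, that the first "]>" in the file is the end of the internal
subset, and that your processor does the serializing itself, because
disable-output-escaping is an optional serializer feature.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml"/>

  <xsl:template match="/">
    <!-- Read the source a second time, now as plain text -->
    <xsl:variable name="raw"
        select="unparsed-text(document-uri(/), 'UTF-8')"/>
    <!-- Cut out everything from '<!DOCTYPE' up to and including the
         first ']>' (assumed to close the internal subset) -->
    <xsl:variable name="doctype"
        select="concat('&lt;!DOCTYPE',
                       substring-before(
                         substring-after($raw, '&lt;!DOCTYPE'), ']>'),
                       ']>')"/>
    <!-- Write it back without escaping the markup characters -->
    <xsl:value-of select="$doctype" disable-output-escaping="yes"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Identity template: copy the rest of the document unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

This only brings the DOCTYPE back. The entity references in the body of
the document are still expanded.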
But, as others on this list may suggest as well: consider not doing this
at all. It's messy, a lot of work, and it doesn't improve your output
from an XML point of view.
Kind regards,
Abel Braaksma
On 6-3-2011 13:27, Alex Muir wrote:
Hi,
What do I need to add to an xslt 2.0 stylesheet that modifies an RDF
file which has a doctype declaration with entity references. I'm not
certain how to preserve the DOCTYPE here exactly as shown and also
preserve the entity references such as &wiki; within the document.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rdf:RDF[
<!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>
<!ENTITY owl 'http://www.w3.org/2002/07/owl#'>
<!ENTITY swivt 'http://semantic-mediawiki.org/swivt/1.0#'>
<!ENTITY wiki 'http://p13.itawiki.org/wiki/Special:URIResolver/'>
<!ENTITY property
'http://p13.itawiki.org/wiki/Special:URIResolver/Property-3A'>
<!ENTITY wikiurl 'http://localhost/wiki/'>
]>
<rdf:RDF
xmlns:rdf="&rdf;"
xmlns:rdfs="&rdfs;"
xmlns:owl ="&owl;"
xmlns:swivt="&swivt;"
xmlns:wiki="&wiki;"
xmlns:property="&property;">
Currently this
<property:Office rdf:resource="&wiki;BX"/>
is being replaced by this in my XSLT output:
<property:Office
rdf:resource="http://p13.itawiki.org/wiki/Special:URIResolver/BX"/>
I've been reading some old posts on this but I haven't been able to
key in on the right solution via google.
Regards
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
--~--