xsl-list

RE: Switching off character entity resolution in XSL

2004-02-03 02:37:53
Hi Alan,

> I realise there is some way of dealing with this with
> character substitutions before or after using something like
> sed, but this isn't really a great solution, particularly
> across platforms. Is there any way of manipulating the output
> using XSL, or alternatively switching off entity resolution
> in the parser?

I quite often have to send XML files to people who claim to be able to
handle XML, but who are actually processing it in an SGML environment that
falls over on any Unicode character outside the ASCII or Latin-1 ranges;
they tend to expect the old standard named entities for anything else (a
fairly reputable company has told me that my file was "not valid XML"
because it contained these characters). What has come to my rescue is the
xsl:character-map[1] functionality in XSLT 2.0 (currently implemented in
Saxon 7).  I have a simple stylesheet which runs over any XML and uses a
character map to replace each non-ASCII character with its ISO named
entity if one exists, or else with nothing.
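
As a rough illustration (a cut-down sketch rather than my actual stylesheet),
something along these lines should do it. The map name "iso-names" and the
handful of characters listed are just examples I've picked; a real map would
need one xsl:output-character per character you care about, and the syntax
follows the XSLT 2.0 spec at [1], so an early Saxon 7 release may differ in
detail.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No doctype-system or doctype-public attribute is given, so the
       serializer writes no DOCTYPE declaration. -->
  <xsl:output method="xml" encoding="US-ASCII"
      use-character-maps="iso-names"/>

  <!-- "iso-names" is an illustrative name. Each mapped character is
       replaced by the given string, which is written out unescaped. -->
  <xsl:character-map name="iso-names">
    <!-- BULLET becomes the literal text &bull; -->
    <xsl:output-character character="&#x2022;" string="&amp;bull;"/>
    <!-- EM DASH becomes the literal text &mdash; -->
    <xsl:output-character character="&#x2014;" string="&amp;mdash;"/>
    <!-- LATIN SMALL LETTER E WITH ACUTE becomes &eacute; -->
    <xsl:output-character character="&#xE9;" string="&amp;eacute;"/>
    <!-- ZERO WIDTH SPACE has no entity I want to send, so drop it -->
    <xsl:output-character character="&#x200B;" string=""/>
  </xsl:character-map>

  <!-- Identity transform: copy everything through unchanged and let
       the character map do its work at serialization time. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Bear in mind that the map is applied only when the result is serialized, so
it has no effect if the result tree is passed on to another process in
memory. Any non-ASCII character that is not listed in the map should come out
as a numeric character reference under the US-ASCII encoding, so to drop such
characters entirely each one needs its own entry with an empty string.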

This could leave a problem because, of course, I haven't declared the
entities in a DTD, and if I did I'd just declare them as resolving to the
relevant characters. I find, though, that simply omitting the DOCTYPE
declaration and leaving the file as "standalone" keeps them happy. Strictly
speaking the file is then no longer well-formed XML, let alone valid, since
it references undeclared entities, but I suspect they're just feeding it
into an SGML processor which is happy to handle the named entities. Having
entered the mark-up world late enough to have had only brief exposure to
SGML (at CCH UK, BTW), I'm glad to say I have only the faintest idea what
SDATA is, what it's supposed to do, and what all this business with
"[bull   ]" is.

This isn't going to actually "preserve" your entities as such, but will have
much the same effect.

Hope that helps,

Stuart

[1] See http://www.w3.org/TR/xslt20/#character-maps

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list