xsl-list

RE: Switching off character entity resolution in XSL

2004-02-03 03:25:00
I'm afraid the simple answer is the ugly one: just preprocess the entity
references with a text editor to read "$#$bull;" instead of "&bull;". No
point banging your head against the wall to find something more elegant;
it will only give you a headache.
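
A minimal sketch of that preprocessing step in Python, for anyone who
would rather script it than use a text editor. The "$#$" marker is the one
suggested above; the function names mask_entities and unmask_entities are
hypothetical, and the sketch assumes the marker does not otherwise occur
in the document and that the five predefined XML entities should be left
for the parser to handle.

```python
import re

# The five entities every XML parser predefines; these are left untouched.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Named references such as &bull; (numeric ones like &#8226; do not match,
# because "#" is not a valid first character of the name).
ENTITY_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9._-]*);")

# The masked form, e.g. $#$bull;
MASKED_REF = re.compile(r"\$#\$([A-Za-z_][A-Za-z0-9._-]*);")


def mask_entities(xml_text):
    """Rewrite &name; as $#$name; so the parser never sees the reference."""
    def repl(match):
        name = match.group(1)
        if name in PREDEFINED:
            return match.group(0)
        return "$#$" + name + ";"
    return ENTITY_REF.sub(repl, xml_text)


def unmask_entities(xml_text):
    """Restore $#$name; to &name; in the serialized transformation output."""
    return MASKED_REF.sub(r"&\1;", xml_text)
```

Run mask_entities on the source text before the transformation and
unmask_entities on the serialized result. Being a plain regex, it also
rewrites references inside comments, CDATA sections and any internal DTD
subset, so treat it as a starting point rather than a complete tool.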

Michael Kay


-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com 
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com] On Behalf Of 
AHynes(_at_)cch(_dot_)com(_dot_)au
Sent: 03 February 2004 03:11
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Switching off character entity resolution in XSL


Hello All,

Unlike what most people would use XSL for (i.e. conversion of 
XML to HTML or another output format), I have a requirement to 
transform from one XML structure to another (subsequent 
presentation rendering occurring way downstream). No big deal, 
I guess, but the annoying thing here is that by the time an 
XML parser has done its job as per the XML specification, 
all those pesky character entities have been resolved (as 
defined in the DTD for the source document) and the output 
contains square brackets.

Example:
Source document contains:     &bull;
After transformation:         [bull  ]
What I would like:            &bull;

(Of course, the entity declared in the DTD is this, i.e. 
<!ENTITY bull "[bull  ]">.)

I really don't want to go messing with the DTD either, and I 
really don't think a parser would like there being unparsed 
entities within an entity declaration in a DTD, i.e. <!ENTITY 
bull &bull;> is illegal.

I realise there is some way of dealing with this with 
character substitutions before or after using something like 
sed, but this isn't really a great solution, particularly 
across platforms. Is there any way of manipulating the output 
using XSL, or alternatively switching off entity resolution 
in the parser? I've played with custom entity resolvers with 
Java XML parsers (for resolving URLs, for example) but cannot 
see how this could be used for external character entities, 
and also realise there is some scope for writing a solution 
in something like JDOM - but what a pain! That defeats the 
whole purpose of XSL. I have gotten used to a pretty good 
compromise of using Saxon with the Xerces parser and the Norm 
Walsh entity resolver classes if that's of any help.

Either there's a simple solution to this, or it's something XML 
2.0 (or whatever is on the horizon) might address (which is 
no help for me really), or I'm on the wrong mailing list, or I 
should just go back to ("the good ol' days of" - yes, 
sarcasm) Omnimark, which was really good at "unparsing" 
entities. I'm sure others experience similar problems, so 
hopefully the first option is the right one (i.e. easy?).

Thanks very much,
Alan Hynes.






 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


