
Switching off character entity resolution in XSL

2004-02-02 20:11:29
Hello All,

Unlike what most people would use XSL for (i.e. conversion of XML to HTML
or another output format), I have a requirement to transform from one XML
structure to another (subsequent presentation rendering occurring way
downstream). No big deal I guess, but the annoying thing here is that by
the time an XML parser has done its job as per the XML specification, all
those pesky character entities have been resolved (as defined in the DTD
for the source document) and the output contains square brackets.

Example:
source document contains:     &bull;
After transformation:         [bull  ]    (of course, the entity declared
in the DTD is this, i.e. <!ENTITY bull "[bull  ]">)
What I would like:            &bull;
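
The behaviour in the example can be reproduced with a minimal Java sketch,
assuming only the JAXP classes bundled with the JDK. The class name
EntityExpansionDemo and the sample document are made up for illustration.
It runs an identity transformation (no stylesheet at all), so whatever
changes in the output is down to the parser rather than to any XSL:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class EntityExpansionDemo {

    // Hypothetical sample document; the entity declaration mirrors the
    // one quoted in the example above.
    static final String SAMPLE =
          "<!DOCTYPE doc [<!ENTITY bull \"[bull  ]\">]>"
        + "<doc>&bull;</doc>";

    // Runs the JAXP identity transformation over the given XML text and
    // returns the serialised result.
    static String identityTransform(String xml) {
        try {
            Transformer identity =
                TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            identity.transform(new StreamSource(new StringReader(xml)),
                               new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The parser expands &bull; before the transformer sees it, so the
        // result carries the replacement text instead of the reference.
        System.out.println(identityTransform(SAMPLE));
    }
}
```

The transformer only ever receives the replacement text, which is why
nothing in the stylesheet itself can see the original reference.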

I really don't want to go messing with the DTD either, and I really don't
think a parser would like an unresolved entity reference within an entity
declaration in a DTD, i.e. <!ENTITY bull &bull;> is illegal.

I realise this could be handled with character substitutions before or
after the transformation, using something like sed, but that isn't really
a great solution, particularly across platforms. Is there any way of
manipulating the output using XSL, or alternatively of switching off entity
resolution in the parser? I've played with custom entity resolvers with
Java XML parsers (e.g. for resolving URLs) but cannot see how this
could be used for external character entities. I also realise there is
some scope for writing a solution in something like JDOM - but what a pain!
That defeats the whole purpose of XSL. I have gotten used to a pretty good
compromise of using Saxon with the Xerces parser and the Norm Walsh entity
resolver classes, if that's of any help.
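
For reference, the before/after substitution idea can be sketched
portably in Java rather than sed. This is only a rough sketch under
stated assumptions: the class name EntityGuard and the [[ent:name]]
marker syntax are invented here, the marker must not otherwise occur in
the documents, and the five predefined XML entities and numeric
character references are deliberately left alone. It is plain text
substitution, so it would also rewrite references inside CDATA sections
and comments.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EntityGuard {

    // Named references such as &bull; (numeric ones like &#8226; do not match).
    private static final Pattern ENTITY_REF =
        Pattern.compile("&([A-Za-z_][A-Za-z0-9._-]*);");

    // Hypothetical marker syntax, e.g. [[ent:bull]]; an assumption of this sketch.
    private static final Pattern MARKER =
        Pattern.compile("\\[\\[ent:([A-Za-z_][A-Za-z0-9._-]*)\\]\\]");

    // The predefined XML entities must stay as real references.
    private static final Set<String> PREDEFINED =
        Set.of("amp", "lt", "gt", "quot", "apos");

    // Before parsing: hide named entity references behind plain-text markers.
    static String protect(String xml) {
        return ENTITY_REF.matcher(xml).replaceAll(ref ->
            Matcher.quoteReplacement(
                PREDEFINED.contains(ref.group(1))
                    ? ref.group()
                    : "[[ent:" + ref.group(1) + "]]"));
    }

    // After serialisation: turn the markers back into entity references.
    static String restore(String text) {
        return MARKER.matcher(text).replaceAll(marker ->
            Matcher.quoteReplacement("&" + marker.group(1) + ";"));
    }

    public static void main(String[] args) {
        String source = "<p>&bull; a &amp; b &#8226;</p>";
        String guarded = protect(source);
        System.out.println(guarded);
        System.out.println(restore(guarded));
    }
}
```

The idea would be to call protect() on the source text before handing it
to the parser and restore() on the serialised result, with the
transformation in between left untouched.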

Either there's a simple solution to this, or it's something XML 2.0 (or
whatever is on the horizon) might address (which is no help for me really),
or I'm on the wrong mailing list, or I should just revert to ("the good
ol' days of" - yes, sarcasm) Omnimark, which was really good at "unparsing"
entities. I'm sure others experience similar problems, so hopefully the
first option is the right one (i.e. easy?).

Thanks very much,
Alan Hynes.

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list