Gary,
At 09:56 AM 3/14/2006, you wrote:
On 14/03/06, andrew welch <andrew(_dot_)j(_dot_)welch(_at_)gmail(_dot_)com>
wrote:
> If you are outputting as XML where are the &nbsp;'s coming from?
>
> Are you writing them out manually in the stylesheet - as in &nbsp;?
Nope, but thanks for the clue. The *input* is double escaped (I must
stop trusting a browser when viewing the output). That is, I've
actually got &amp;nbsp; as the input, which explains why there is no
conversion going on. Later on in a different transform I do a
saxon:parse which then obviously can't find the reference to &nbsp;
and therefore throws an exception.
Oh, the input is *double* escaped? Fun.
Two available options:
1. Isolate a transformation step to write a file in which the
double-escaping is removed, in effect by resolving the "& amp;"
entity to "&" so the file presents an honest "& nbsp;" -- then parse
as normal. But this step in the pipeline has either to write a file
or pass the data through, say, a SAX filter -- it has to serialize
the data somehow, for reparsing: it can't work completely within
XSLT's world of trees (the logical view). As Mike just observed,
you're having to work with the lexical layer of the markup before it
represents what it's supposed to represent (what you, but not the
computer, know it "actually" represents through the double-escaped entities).
2. Use string processing. Since you're using XSLT 2.0 this is a
reasonable option. A regular expression could be used to match the
fake entities and turn them into something more useful. Probably this
process would have to write a file too, to be parsed again, unless
you used some kind of internal lookup table to take the place of the
set of entity declarations (which are only available to a parser).
I hesitate to say more, as XSLT 2.0 gives much better facilities for
handling such things than 1.0 did. (I could tell you about 1.0
tricks, but why?) But since I haven't tried them out myself, I can
only direct your attention to them.
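For what it's worth, the lookup-table idea in option 2 can be
sketched outside XSLT. What follows is only an illustration in
Python, not something taken from your pipeline: the table contents,
the name ENTITY_TABLE and the function resolve_fake_entities are all
made up for the example, and the table would need extending to cover
whatever entities your data really uses. It assumes the text has
already been parsed once, so that the double-escaped reference shows
up as the literal string "&nbsp;". The same shape (a regex plus a
table) should carry over to xsl:analyze-string in XSLT 2.0.

```python
import re

# Hypothetical lookup table standing in for the entity declarations
# that would normally be visible only to a parser. Extend as needed.
ENTITY_TABLE = {
    "nbsp": "\u00a0",    # no-break space
    "copy": "\u00a9",    # copyright sign
    "eacute": "\u00e9",  # e with acute accent
}

# After one parse, a double-escaped reference is plain text such as
# "&nbsp;", so it can be matched as an ordinary string.
FAKE_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def resolve_fake_entities(text, table=ENTITY_TABLE):
    """Replace entity-like strings in parsed text with real characters."""
    def lookup(match):
        # Names missing from the table are left untouched, not guessed.
        return table.get(match.group(1), match.group(0))
    return FAKE_ENTITY.sub(lookup, text)
```

Calling resolve_fake_entities("one&nbsp;two") gives back the string
with a real U+00A0 between the two words, while a name that is not
in the table passes through unchanged. Because the result holds
actual characters rather than entity references, it can be
serialized and reparsed without a DTD.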
Note that both these approaches assume that your files actually
parse. The error message you reported before suggests they don't.
But maybe you have the unescaping thing working and need to invoke
the entity declarations on the output to get it to parse properly --
that error message was upon parsing the *output*? (You're not the
only one confused now.)
Cheers,
Wendell
======================================================================
Wendell Piez
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================