Re: Can I suppress entity substitution in XSLT?
2003-07-28 04:33:30
I'm not at the office at the moment, so I don't have all the details in
front of me, but I've been using a Xalan extension that lets me specify
that a given character should be serialized as a named entity reference:
for example, character &#160; should be output as &nbsp;. Since most of
my work is in the OEB space, which has a specific set of named entities,
that can be very useful.
Hope that helps!
Chris Loschen
At 02:27 PM 7/26/2003 -0400, you wrote:
Taro,
To add to the picture Ken describes:
Since XSLT operates only on the XPath tree, which knows nothing of
entities, how characters are represented in an output file written by a
serializer of that tree is generally outside the scope of the XSLT
processor. This separation gives a clean interface between the
"tree transformation" XSLT performs, and the business of file-writing
(XML, HTML or anything else). The down side of this is that XSLT cannot,
directly, answer requirements like "keep my entity references (to
characters)". The up side is that we know where to turn if we *do* want to
represent characters in serialized output in an unorthodox way (not "keep"
them, but anyhow write them out where we can) -- namely, the serializer.
Not all XSLT processes terminate by serializing their output, and writing
the output tree to a file is a necessary and desirable operation only some
(maybe most) of the time. However, if you are doing this, you can do any
of three things:
1. Remove the problem from XSLT's scope altogether. Use a post-processing
routine, such as a Perl or Python script, or sed, to perform substitutions
of characters with named entities in the output files after the XSLT
processor writes them. Maintaining the well-formedness of your output is
your concern.
2. Hack the serializer. You could thus extend XSLT with, for example, a
custom output method. Your serializer would intercept characters you
wanted to represent with entities, and take care of the escaping. Back
when James Clark's XT was the de facto reference implementation for XSLT,
he demonstrated how to do something not unlike this.
3. Drive the serializer from within XSLT. Some consider this bad form,
since it crosses the line between XSLT processing and serialization, but
you can nonetheless do it with the optional disable-output-escaping
feature, if your processor supports it (it's available in Xalan).
I'd call this a legitimate use of d-o-e, assuming of course you understand
this kind of processing is limited in its application (a) to scenarios
where file-writing is part of the pipeline, and (b) in the engines it will
work on -- i.e., it's not as portable as XSLT generally. For this reason
and others, you may want to implement this in a separate stylesheet from
your main transform, and run it as the terminal process in a chain.
See http://www.biglist.com/lists/xsl-list/archives/200110/msg00115.html
for more hints.
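
Option 1 above (post-processing outside XSLT) might look roughly like the
Python sketch below. The function name to_named_entities is hypothetical,
and the entity names come from Python's standard html.entities table (the
HTML 4 set), which is an assumption here; an OEB or other vocabulary may
need its own table. The sketch leaves ASCII alone, so existing markup and
references such as &amp; pass through untouched, and it falls back to a
numeric reference when no name is known. It makes no attempt to skip
comments or CDATA sections, and the named references are only well-formed
if the output's DTD declares them.

```python
from html.entities import codepoint2name  # HTML 4 names: 160 -> "nbsp", etc.


def to_named_entities(serialized: str) -> str:
    """Rewrite non-ASCII characters in already-serialized markup.

    Hypothetical helper: ASCII (including markup such as < > &) is copied
    unchanged; other characters become a named reference when the table
    has one, and a decimal numeric reference otherwise.
    """
    pieces = []
    for ch in serialized:
        code_point = ord(ch)
        if code_point < 128:
            pieces.append(ch)
        elif code_point in codepoint2name:
            pieces.append(f"&{codepoint2name[code_point]};")
        else:
            pieces.append(f"&#{code_point};")
    return "".join(pieces)


if __name__ == "__main__":
    # Stand-in for text read back from the file the XSLT processor wrote.
    sample = "<p>caf\u00e9\u00a0au lait &amp; more</p>"
    print(to_named_entities(sample))
```

Run over the processor's output file as the last step of the pipeline,
after decoding the file with the encoding the serializer used.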
Cheers,
Wendell
At 07:40 PM 7/25/2003, you wrote:
At 2003-07-25 18:03 -0400, Taro Ikai wrote:
I sometimes want to keep the character entities as they are.
Is that outside of XSLT specification?
Keeping the characters represented by the entities is within the XSLT 1.0
specification.
Keeping the syntax used to represent the character entities in the file
used to create the XPath node tree is not within the XSLT 1.0 specification.
For all entities, the substitutions are made by the XML processor used by
the XSLT processor and the XSLT processor isn't told of any syntax that
might have been used for any of the markup ... all it sees is the
information as understood by the XML processor after accommodating the markup.
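
A small illustration of that point, sketched in Python with the standard
xml.etree.ElementTree parser standing in for the XML processor (an
assumption; the XSLT processor in question will use its own parser). Four
different spellings of the same no-break space all produce identical text
in the parsed tree, so nothing downstream can tell which syntax the source
file used.

```python
import xml.etree.ElementTree as ET

# Four ways of writing the same character (U+00A0) in the source document.
variants = {
    "decimal reference": "<p>a&#160;b</p>",
    "hex reference": "<p>a&#xA0;b</p>",
    "literal character": "<p>a\u00a0b</p>",
    "declared entity": '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>',
}


def parsed_text(source: str) -> str:
    """Return the text content of the root element after parsing."""
    return ET.fromstring(source).text


# The parser resolves every reference before anything else sees the tree.
texts = {label: parsed_text(source) for label, source in variants.items()}

if __name__ == "__main__":
    for label, text in texts.items():
        print(f"{label}: {text!r}")
```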
I hope this helps.
......................... Ken
___&&__&_&___&_&__&&&__&_&__&__&&____&&_&___&__&_&&_____&__&__&&_____&_&&_
"Thus I make my own use of the telegraph, without consulting
the directors, like the sparrows, which I perceive use it
extensively for a perch." -- Thoreau
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list