At 05:09 AM 2/3/2004, Richard wrote:
... My strategy is
to live with the fact that the parser has carried out all the entity
mappings, and to use a "mappings" document containing entries like this:
<char>
<name>Delta</name>
<value>Δ</value>
<unicode>0394</unicode>
<description>Delta Dec:916 </description>
<mapping>[capital Delta]</mapping>
<!--U0394 /Delta capital Delta, Greek -->
</char>
...
Yes, it is slow and clumsy, and yes, it does use the deprecated
disable-output-escaping, but it does work ...
In my view this is a perfectly reasonable approach, as long as one is clear
on the dependencies it introduces -- by using XSLT to drive the serializer,
one in effect requires that the result be written out to a file (using a
processor that implements d-o-e, of course), but since that's built into
the requirement to begin with, it's not a big deal. Accordingly, I don't
consider it an abuse of d-o-e -- just an application of XSLT+serializer as
string writer bound to XSLT's role as a transformer. (In fact when I've
implemented this solution to the entity-writing problem, I've deliberately
kept the d-o-e operations separate from transformation logic, pipelining
two different stylesheets. This way the entity-writing routine is portable.)
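
As an illustration only, here is a minimal sketch of that separate second stage, written in Python rather than as a d-o-e stylesheet, so it is an analogue of the approach and not the stylesheet anyone in this thread actually uses. It assumes the mappings document wraps its <char> entries in a hypothetical <mappings> element, that the <unicode> field holds a hex code point, and that the <name> field is the entity name to write; the "bull" entry is added for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mappings document modelled on the <char> entries quoted above.
# The <mappings> wrapper and the "bull" entry are assumptions for this sketch.
MAPPINGS_XML = """<mappings>
  <char>
    <name>Delta</name>
    <unicode>0394</unicode>
  </char>
  <char>
    <name>bull</name>
    <unicode>2022</unicode>
  </char>
</mappings>"""


def load_table(mappings_xml):
    """Build a {character: '&name;'} table from the <char> entries."""
    table = {}
    for char in ET.fromstring(mappings_xml).iter("char"):
        codepoint = int(char.findtext("unicode"), 16)  # hex code point
        table[chr(codepoint)] = "&" + char.findtext("name") + ";"
    return table


def write_entities(serialized, table):
    """Replace each mapped character in the serialized text with its
    entity reference; every other character passes through unchanged."""
    return "".join(table.get(ch, ch) for ch in serialized)


if __name__ == "__main__":
    table = load_table(MAPPINGS_XML)
    print(write_entities("capital \u0394, bullet \u2022", table))
```

Keeping this step apart from the transformation proper mirrors the two-stylesheet pipeline: the first stage never needs to know that entity references will be written back afterwards.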
Also see Zarella Rendon and Tony Coates on this issue:
http://www.xml.com/pub/a/2003/01/02/xmlchar.html
Also, Mike wrote:
I'm afraid the simple answer is the ugly one: just preprocess the entity
references with a text editor to read "$#$bull;" instead of "&bull;". No
point banging your head against the wall to find something more elegant,
it will only give you a headache.
This approach, wrapping your transformation in non-XSLT "entity
escaping/un-escaping" routines, may perform better (faster tools), and has
the virtue of architectural clarity. It does introduce other local
dependencies, of course, but for this kind of a problem that's not really
an issue, is it?
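
For what it's worth, the wrapping routines can be very small. The following is a rough sketch in Python, assuming the "$#$" marker from Mike's example never occurs in the real documents; the function names are made up for illustration, and the regular expression is a simplification that also rewrites references inside comments and CDATA sections.

```python
import re

# Marker taken from the "$#$bull;" example above; assumed not to occur
# anywhere in the real source documents.
MARK = "$#$"

# The five predefined XML entities are left alone: the parser and the
# serializer already round-trip them without any DTD.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Simplified pattern for a named entity reference such as &bull;
ENTITY_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9._-]*);")
ESCAPED_REF = re.compile(re.escape(MARK) + r"([A-Za-z_][A-Za-z0-9._-]*);")


def escape_entities(text):
    """Run before parsing: turn &name; into $#$name; so the parser sees
    ordinary text instead of an entity reference."""
    def repl(match):
        name = match.group(1)
        if name in PREDEFINED:
            return match.group(0)
        return MARK + name + ";"
    return ENTITY_REF.sub(repl, text)


def unescape_entities(text):
    """Run after serialization: turn $#$name; back into &name;."""
    return ESCAPED_REF.sub(lambda m: "&" + m.group(1) + ";", text)


if __name__ == "__main__":
    source = "<p>First &bull; second &amp; third</p>"
    escaped = escape_entities(source)
    # ... the XSLT transformation would run on `escaped` here ...
    print(unescape_entities(escaped))
```

The transformation itself stays untouched; only the input and output files pass through these two functions.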
Cheers,
Wendell
Richard Light
>Example:
>source document contains: &bull;
>After transformation: [bull ] (of course, the entity declared
>in the DTD is this, i.e. <!ENTITY bull "[bull ]">)
>What I would like: &bull;
>
>I really don't want to go messing with the DTD either, and I really don't
>think a parser would like there being unparsed entities within an entity
>declaration in a DTD i.e. <!ENTITY bull &bull;> is illegal.
>
>I realise there is some way of dealing with this with character
>substitutions before or after using something like sed, but this isn't
>really a great solution, particularly across platforms. Is there any way of
>manipulating the output using XSL, or alternatively switching off entity
>resolution in the parser? I've played with custom entity resolvers with
>Java XML parsers (i.e. resolving URLs for example) but cannot see how this
>could be used for external character entities, and also realise there is
>some scope for writing a solution in something like JDOM - but what a pain!
>That defeats the whole purpose of XSL. I have gotten used to a pretty good
>compromise of using Saxon with the Xerces parser and the Norm Walsh entity
>resolver classes if that's of any help.
>
>Either there's a simple solution to this, it's something XML 2.0 (or
>whatever is on the horizon) might address (which is no help for me really),
>I'm on the wrong mailing list or I should just resort back to ("the good
>ol' days of" - yes, sarcasm) Omnimark which was really good at "unparsing"
>entities. I'm sure others experience similar problems so hopefully the
>first option is the right one (i.e. easy ?).
>
>Thanks very much,
>Alan Hynes.
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
>
--
Richard Light
SGML/XML and Museum Information Consultancy
richard(_at_)light(_dot_)demon(_dot_)co(_dot_)uk
======================================================================
Wendell Piez
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================