RE: Fixing entities

2003-04-01 03:16:45
Ah, nice solution!
Yep that might do the job
thanks

-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com] On Behalf Of 
Stuart Brown
Sent: 01 April 2003 10:59
To: 'xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com'
Subject: RE: [xsl] Fixing entities


Hi Philip

> I'm trying to tidy up some xml that contains html entities
> such as named &fract12; and numbered &#189;

The problem here is that your XML is parsed before it reaches the XSLT
processor, and all general entities are expanded during parsing. The XSLT
processor therefore sees only the replacement character and knows nothing
of the original entity.

But you could hack this. If your UTF mapping file is itself in XML[1], why
not run an XSLT transformation on THAT, and output (as a text file) a DTD
subset of entity definitions like so[2]:

<!-- File: redefines.ent -->
<!ENTITY fract12 "<character code='fract12'/>">
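
As a rough sketch of that generation step, assume a hypothetical mapping
file of the form <entities><entity name="fract12"/></entities>. The
element and attribute names are invented for illustration, so adjust the
XPath to whatever your real mapping file uses. Something like this should
write out one declaration per entry:

```xml
<?xml version="1.0"?>
<!-- Sketch only. Assumes a hypothetical mapping file shaped like
     <entities><entity name="fract12"/>...</entities> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Text output: the escaped markup below is written out literally -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- One ENTITY declaration per named entry in the mapping file -->
    <xsl:for-each select="//entity[@name]">
      <xsl:text>&lt;!ENTITY </xsl:text>
      <xsl:value-of select="@name"/>
      <xsl:text> "&lt;character code='</xsl:text>
      <xsl:value-of select="@name"/>
      <xsl:text>'/&gt;"&gt;&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Save the result as redefines.ent. Because the output method is text, the
&lt; and &gt; in the stylesheet come out as plain angle brackets.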

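You then only need to reference that file from the internal DTD subset of
the document's DOCTYPE declaration. The internal subset is read before the
external DTD, and the first declaration of an entity is the one that binds,
so these definitions take precedence over the original ones, e.g.: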

<!-- File: myfile.xml -->
<!DOCTYPE myfile SYSTEM "original.dtd" [
<!ENTITY % redefines SYSTEM "redefines.ent">
%redefines;
]>
<myfile>
 <p>Blah blah &fract12;</p>
</myfile>

Run an identity transform on the XML, and the output will be as desired.
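
For reference, the identity transform is the usual copy-everything
stylesheet. A minimal XSLT 1.0 sketch:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy every attribute and node unchanged, recursing into children -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

This assumes the parser actually reads redefines.ent; a non-validating
parser is not required to fetch external entities, so check your parser's
settings. If it does, the paragraph in myfile.xml should come out with a
character element in place of the entity reference, something along the
lines of <p>Blah blah <character code="fract12"/></p>.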

Hope this helps,

Stuart

[1] Or, if not, you could use Sebastian Rahtz's very useful unicode.xml doc at
http://www.oasis-open.org/cover/unicodeRahtz19981008.xml (more up-to-date
versions may be floating around).

[2] This won't work with numeric character references (e.g. &#189;), as
they are not entities and so cannot be redefined.

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

