Are you sure you are viewing your result document in an editor that
supports Unicode? I had the same problem, and I thought xsltproc
was broken. But xalan may be outputting the entities as true Unicode
characters encoded as UTF-8. If your editor is set for Latin-1 encoding,
it will read each byte of a multi-byte UTF-8 character as a separate
character and produce the strange results you posted below.
I fixed the problem when I upgraded my editor to support Unicode. Once
I set the encoding to utf-8, the strange results went away.
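To illustrate the effect, here is a minimal Python sketch (not part of xalan or xsltproc) that encodes the two characters as UTF-8 and then decodes the bytes the way a Latin-1 editor would. The helper name view_as is made up for this example.

```python
def view_as(data: bytes, editor_encoding: str) -> str:
    """Decode bytes the way an editor set to editor_encoding would (hypothetical helper)."""
    return data.decode(editor_encoding)


# U+201C (left double quotation mark), a space, and U+00FC (u with diaeresis).
text = "\u201c \u00fc"

# What a processor writes when the output encoding is UTF-8.
utf8_bytes = text.encode("utf-8")

# A Latin-1 editor maps every single byte to its own character.
misread = view_as(utf8_bytes, "latin-1")

# A UTF-8 editor reassembles the multi-byte sequences.
correct = view_as(utf8_bytes, "utf-8")

print(len(text), len(utf8_bytes), len(misread))
print(repr(misread))
print(correct)
```

The three original characters become six bytes, and the Latin-1 view shows six characters, so the file itself is intact and only the viewer's interpretation is wrong.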
Paul
On Tue, May 20, 2003 at 06:00:02PM +0900,
Edward(_dot_)Middleton(_at_)nikonoa(_dot_)net wrote:
I've got XML documents, marked up to a DTD, and calling character entity
sets. When I run through the XSLT processor (xalan) to output another XML
file I find the entities have been converted to something different, and
fairly inconsistently.
What I would like to achieve is having &ldquo; &uuml; in my input XML, and
these entities still being untouched in my output. Can anyone advise how I
achieve this please?
What I'm getting are (“, ü), or (ââ?¬Å? and Ã?¼), or (&amp;ldquo;
and &amp;uuml;), depending on character encoding settings and entity sets
used. Am I missing something?
&ldquo; and &uuml; are not predefined character entities.
http://www.w3.org/TR/REC-xml#sec-predefined-ent
They appear as literal text strings
'&' 'l' 'd' 'q' 'u' 'o' ';'
and so when serialized to XML the '&' character is replaced by '&amp;', giving
&amp;ldquo;
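The same escaping can be reproduced outside XSLT. The following is only a sketch using Python's standard xml.etree.ElementTree, not xalan itself, and the element name p is arbitrary: a text node holding the seven literal characters has its ampersand escaped on serialization.

```python
import xml.etree.ElementTree as ET

# An arbitrary element whose text is the literal characters & l d q u o ;
# (plain text, not an entity reference known to the parser).
elem = ET.Element("p")
elem.text = "&ldquo;"

# Serializing to XML must escape the bare ampersand.
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)

# Parsing the result back yields the same seven literal characters.
round_trip = ET.fromstring(serialized).text
print(round_trip)
```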
If you are making an HTML document and want these character entities, you
should use the html output method and specify the character encoding:
<xsl:output method="html" version="1.0" encoding="ISO-8859-1"/>
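As a rough analogy for why the output encoding matters, this Python sketch (an illustration only, not how any particular XSLT processor is implemented) encodes the two characters with character-reference fallback. A character the target encoding cannot represent comes out as a numeric character reference, while a representable one is written as a raw byte.

```python
# U+201C (left double quotation mark), a space, and U+00FC (u with diaeresis).
text = "\u201c \u00fc"

# ISO-8859-1 can represent U+00FC directly but not U+201C.
latin1_output = text.encode("iso-8859-1", errors="xmlcharrefreplace")

# ASCII can represent neither, so both become character references.
ascii_output = text.encode("ascii", errors="xmlcharrefreplace")

print(latin1_output)
print(ascii_output)
```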
Edward Middleton
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
--
************************
*Paul Tremblay *
*phthenry(_at_)earthlink(_dot_)net*
************************