
Re: Maintaining character entities

2003-05-20 14:26:21
Are you sure you are viewing your result document in an editor that
supports Unicode? I had the same problem, and I thought xsltproc
was broken. But xalan may be outputting the entities as true Unicode
characters. If your editor is set for Latin-1 encoding, it will read
each byte of a multi-byte UTF-8 character as a separate character and
produce the strange results you posted below.

I fixed the problem when I upgraded my editor to support Unicode. Once
I set the encoding to UTF-8, the strange results went away.
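
The mismatch described above can be reproduced in a few lines. This is a minimal Python sketch, not anything taken from xalan or xsltproc. The sample string is an assumption chosen to match the two characters discussed in this thread. It encodes the text as UTF-8 and then decodes the same bytes as Latin-1, roughly the way an editor with the wrong encoding setting would.

```python
# Sample text (an assumption for illustration): left double quotation
# mark U+201C, a space, and u-with-diaeresis U+00FC.
text = "\u201c \u00fc"

# The bytes a processor writes when its output encoding is UTF-8.
utf8_bytes = text.encode("utf-8")

# An editor set to Latin-1 turns every single byte into one character.
misread = utf8_bytes.decode("latin-1")

# An editor set to UTF-8 recovers the original characters.
correct = utf8_bytes.decode("utf-8")

print(len(text), len(utf8_bytes), len(misread))
print(repr(misread))
print(correct == text)
```

In UTF-8 the character ü is the two bytes C3 BC, which Latin-1 displays as the two characters Ã and ¼. That is the same pattern as the garbled output quoted below.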

Paul

On Tue, May 20, 2003 at 06:00:02PM +0900, 
Edward(_dot_)Middleton(_at_)nikonoa(_dot_)net wrote:


I've got XML documents, marked up to a DTD, and calling character entity
sets. When I run through the XSLT processor (xalan) to output another XML
file I find the entities have been converted to something different, and
fairly inconsistently. 

What I would like to achieve is having &ldquo; and &uuml; in my input XML, and
these entities still being untouched in my output. Can anyone advise how I
achieve this please?

What I'm getting are (&amp;ldquo;, &amp;uuml;), or (ââ?¬Å? and Ã?¼), or
(“ and ü), depending on character encoding settings and entity sets used. Am I
missing something?


&ldquo; and &uuml; are not predefined character entities:
http://www.w3.org/TR/REC-xml#sec-predefined-ent
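
The difference between predefined and declared entities can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` rather than xalan, and the tiny documents are made up for illustration. It suggests that a predefined entity parses without a DTD, an undeclared one is rejected, and a declared one has already been expanded to its character by the time the application sees the text.

```python
import xml.etree.ElementTree as ET

# A predefined entity (&amp;) needs no declaration.
predefined = ET.fromstring("<p>&amp;</p>").text

# &ldquo; is not predefined, so without a declaration the parser rejects it.
try:
    ET.fromstring("<p>&ldquo;</p>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# Declared in an internal DTD subset (hypothetical minimal declaration),
# the reference is replaced by the character U+201C during parsing.
declared_doc = '<!DOCTYPE p [<!ENTITY ldquo "&#8220;">]><p>&ldquo;</p>'
expanded = ET.fromstring(declared_doc).text

print(repr(predefined), undeclared_ok, repr(expanded))
```

Since the reference is resolved during parsing, code that works on the parsed tree receives the character itself and not the name `ldquo`.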

These entity references appear as literal text strings:

'&' 'l' 'd' 'q' 'u' 'o' ';'

and so when serialized to XML the '&' character is replaced by '&amp;', giving

&amp;ldquo;
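
The escaping step can be seen in isolation with a small Python sketch. It uses the standard library serializer, not xalan, and the element name `p` is arbitrary. The text node holds the seven literal characters, so the serializer has to escape the ampersand.

```python
import xml.etree.ElementTree as ET

el = ET.Element("p")
# Seven literal characters of text, not an entity reference.
el.text = "&ldquo;"

# Serializing escapes the ampersand so the output stays well-formed XML.
serialized = ET.tostring(el, encoding="unicode")
print(serialized)  # <p>&amp;ldquo;</p>
```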

If you are making an HTML document and want these character entities in the
output, you should specify the correct character entity and use:

<xsl:output method="html" version="4.0" encoding="ISO-8859-1"/>
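
How the output encoding changes what ends up in the file can be sketched in Python. This uses the XML serializer in `xml.etree.ElementTree`, not Xalan's HTML output method, so it is only an analogy: an XSLT processor using `method="html"` may write named entities where this serializer writes numeric references. The sample text is an assumption.

```python
import xml.etree.ElementTree as ET

el = ET.Element("p")
el.text = "\u201c \u00fc"  # left double quote, space, u-with-diaeresis

# ASCII can hold neither character: both become numeric references.
as_ascii = ET.tostring(el, encoding="us-ascii")

# ISO-8859-1 can hold u-with-diaeresis as one byte, but not the quote.
as_latin1 = ET.tostring(el, encoding="iso-8859-1")

# UTF-8 can hold both, as multi-byte sequences.
as_utf8 = ET.tostring(el, encoding="utf-8")

for label, data in (("ascii", as_ascii), ("latin1", as_latin1), ("utf8", as_utf8)):
    print(label, data)
```

Characters that the chosen encoding cannot represent are written as character references, and the rest are written as raw bytes, which then need a viewer set to the same encoding.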



Edward Middleton


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

-- 
************************
*Paul Tremblay         *
*phthenry(_at_)earthlink(_dot_)net*
************************
