xsl-list

Re: Recognized Unicode characters?

2005-05-09 06:48:46
Hi,

Maybe the default font of your browser doesn't support the character you are trying to display. I cannot reproduce the problem. With output method HTML, the XSL processor (Xalan) writes character 8212 as &#8212; when the output encoding is us-ascii, and as a UTF-8 byte sequence when it is utf-8. I see either garbage (when the output is utf-8 and there is no meta tag specifying the encoding) or exactly the character you are looking for. I saw the square box in none of the cases I tested...
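
One way to test the case without a meta tag is a minimal stylesheet like the sketch below. It assumes an XSLT 1.0 processor; the template and the page content are hypothetical and not taken from the original stylesheet. Because the result contains a head element, the html output method should add a META element naming the encoding actually used, so the browser does not have to guess.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Explicit output encoding: the result is serialized as UTF-8 bytes -->
  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Hypothetical template: builds a tiny page around one em dash -->
  <xsl:template match="/">
    <html>
      <!-- With a head element present, the html output method should
           insert a META element naming the encoding actually used -->
      <head>
        <title>Em dash test</title>
      </head>
      <body>
        <p>before &#8212; after</p>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

If the character still shows as a box with the META element in place, the font becomes the more likely cause.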

Cheers,
Geert

Thanks for responding, but I think you guys lost me.
Here is the xslt header info I used:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

I set output to HTML because that is the output I am creating. (isn't this right?)

As for the encoding, I have to admit I am confused. I picked UTF-8 mostly because it is generally recommended in learning-XML books and websites (and because it is the default), but none that I have seen explain why in any detail, or why anyone might use something different. The special characters in my source XML file are all character references to the Unicode numbers (&#___;, etc.)

As I understand it, shouldn't the XSLT processor know from the "encoding" attribute that the references are to Unicode numbers, and read them correctly as those characters? I also understand that the processor has some flexibility in how it outputs the text, but that it will often output special characters as entity references (e.g., the "&" symbol as "&amp;").

So I am still confused as to why a Unicode reference to #8212 won't output correctly. The output displays a square box both in the browser (IE6) and in the HTML source itself (viewed in Windows Notepad).

> > > Shouldn't that be <xsl:output encoding="US-ASCII"... for safety?
> >
> > Neither is completely safe of course,
>

The spec only requires support for UTF-8 and UTF-16, anything else is
optional.

I personally use "iso-646" as the name of this encoding. The differences are
immaterial (different names for some of the characters, I believe) but I
prefer international standards as a matter of principle.
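
The US-ASCII suggestion quoted above can be sketched as follows. This is only an illustration: the template and the para element name are hypothetical, and since support for encodings other than UTF-8 and UTF-16 is optional, a given processor may not accept the name. The two encoding settings are independent: the one in the XML declaration describes how the stylesheet file itself is stored, while the one on xsl:output controls how the result is written. With a 7-bit output encoding the processor cannot write character 8212 directly, so it has to emit it as a character or entity reference, which stays readable whatever encoding the browser assumes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- 7-bit output: characters outside ASCII must be escaped in the result -->
  <xsl:output method="html" encoding="US-ASCII"/>

  <!-- Hypothetical template: copies the text of a <para> element -->
  <xsl:template match="para">
    <p><xsl:value-of select="."/></p>
  </xsl:template>
</xsl:stylesheet>
```

A source fragment such as <para>before &#8212; after</para> would be enough to try it.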

Michael Kay
http://www.saxonica.com/




--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




--
=====================================
NB: as of 22 April, the Daidalos office
is located at a new address:

Daidalos BV
Hoekeindsehof 1 - 4
2665 JZ Bleiswijk
tel: +31 (0)10 850 12 00
fax: +31 (0)10 850 11 99

The above address is also the postal address.
======================
Geert(_dot_)Josten(_at_)Daidalos(_dot_)nl
IT-consultant at Daidalos BV

http://www.daidalos.nl/

GPG: 1024D/12DEBB50
