RE: XSLT outputting to XHTML to display character entities?

2003-04-23 01:11:19
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com] On Behalf Of Lars Huttar
Sent: Wednesday, April 23, 2003 12:37 AM
To: XSL-List (E-mail)
Subject: [xsl] XSLT outputting to XHTML to display character entities?


Hi all,
Sorry if this is an FAQ, but I've looked and can't find info on it.

I have some non-7-bit characters in my source XML document, e.g. Ø
(O with slash).
My XSL stylesheet is processing the text and outputting it to an
HTML document.  I had wanted to output to XHTML, thinking that was
better than HTML.  For this reason I used
      <xsl:output method="xml" encoding="UTF-8" indent="yes"
      doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>
(I also tried encoding="ISO-8859-1".)

The result was that the non-7-bit characters were output in raw
form, not escaped, into the resulting (x)html.  This makes sense
for XML and is not unexpected.  However when I tried to view the
result in IE 6.0, the characters did not show up correctly.
(They appeared as a box, or as A~, depending on whether I set
encoding to UTF-8 or ISO-8859-1.)

When I changed method to "html", the special characters were escaped
to &Oslash; and were displayed correctly by the browser.

My question, then, is: is IE failing to display XHTML correctly?

Yes, in some cases IE fails to process XHTML correctly (basically, it
doesn't really support it). But no, this has nothing to do with your
problem.

Isn't it true that XHTML (being XML) is allowed to contain any
Unicode data (except & and < of course unless escaped)?

Yes.

And if so shouldn't a browser be able to display it correctly?

Yes, but it requires that the encoding is declared properly in *several*
places, such as

- the HTTP Content-Type header,
- the XML declaration,
- the HTML META tag (http-equiv="Content-Type").

IE will work fine if the encoding declaration is correct and present in all
three places.
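
As a rough sketch of the two declarations that a stylesheet can control, something
like the following should work. The template body and the title text are made up for
illustration; only the xsl:output attributes come from the original question. The
HTTP header (for example "Content-Type: text/html; charset=UTF-8") is set by the web
server, not by the stylesheet, so it has to be checked separately.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- With method="xml" the serializer writes an XML declaration
       that carries the encoding named here. -->
  <xsl:output method="xml" encoding="UTF-8" indent="yes"
      doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>

  <!-- Hypothetical template: wraps the text of the source document
       in a minimal XHTML page. -->
  <xsl:template match="/">
    <html>
      <head>
        <!-- Must name the same encoding as xsl:output above. -->
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
        <title>Encoding test</title>
      </head>
      <body>
        <p><xsl:value-of select="."/></p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

If any of the three declarations names a different encoding from the one the bytes
are actually written in, the browser may decode the page with the wrong one, which
would match the boxes and stray characters described above.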

I know the browser doesn't lack the font for this character
because it shows up right when represented as &#216; in the html.

For now, it's not a problem to me... I can just use method="html"
and generate html.  But what's the Right way to do this?

You will need to find out where the encoding information is lost, or whether
there are mismatches.

Is there a way to generate XHTML using XSL and have special characters in
the output serialized using &#...; character entities?

You can try to specify a plain-ASCII output encoding (this will force the
serializer to use numeric character references for all non-ASCII characters).
However, an XSLT processor is not required to support any output encoding
other than UTF-8 and UTF-16 (so your mileage may vary depending on the
processor).
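
A minimal sketch of that approach, assuming the processor accepts "US-ASCII" as an
encoding name (that is the assumption to verify; the template is again only
illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- US-ASCII cannot represent characters such as O with slash, so the
       serializer has to write them as numeric character references. -->
  <xsl:output method="xml" encoding="US-ASCII" indent="yes"
      doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>

  <!-- Hypothetical template: copies the source text into one paragraph. -->
  <xsl:template match="/">
    <html>
      <head><title>ASCII output test</title></head>
      <body>
        <p><xsl:value-of select="."/></p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

Whether the references come out in decimal or hexadecimal form is up to the
processor; both are equivalent to an XML parser and to the browser.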

Julian


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


