
Re: Newbie encoding query

2002-12-04 00:24:49
Satish, L. Gnanendra wrote:
The UserManual.XSL has a parameter which has to have a trademark
symbol (#153):

*Some* HTML user agents allow one to illegally use &#153; to refer to
codepoint 153 of the windows-1252 encoding, but this is wrong for two reasons:

1. The number in a numeric character reference in XML or HTML is, by
definition, a character's Unicode codepoint. Unicode code point 153
corresponds to a legacy control character: SINGLE GRAPHIC CHARACTER INTRODUCER
(SGCI), which is not what you want.

2. Although not enforced, HTML's SGML declaration disallows Unicode characters
in the range 127-159, in addition to those that are disallowed by XML. You
cannot have them in a conforming HTML document, not even by reference.
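
Both points can be checked with a short sketch. Python is only my choice of illustration language here, and `resolve_reference` is a hypothetical helper name. It resolves the two references with the standard library's XML parser and compares the result with byte 153 read as windows-1252.

```python
import unicodedata
import xml.etree.ElementTree as ET


def resolve_reference(number: int) -> str:
    """Return the character an XML parser produces for &#number;."""
    return ET.fromstring(f"<p>&#{number};</p>").text


wrong = resolve_reference(153)    # what &#153; really refers to
right = resolve_reference(8482)   # what &#8482; refers to

# A numeric character reference is a Unicode code point, not a windows-1252 byte.
print(f"&#153;  -> U+{ord(wrong):04X}, category {unicodedata.category(wrong)}")
print(f"&#8482; -> U+{ord(right):04X}, {unicodedata.name(right)}")

# Byte 153 (0x99) means the trademark sign only when read as windows-1252.
print(bytes([153]).decode("cp1252") == right)
```

The first line reports a control character (category Cc), the second reports TRADE MARK SIGN.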

&#8482; is the trademark symbol. You must use that in your XML and XSLT.
Do not use &#153;.

<xsl:output method="html" encoding="UTF-8"/>

My problem is that, when it is viewed in an IE6 browser, the parameter "GUI"
displays:
User Interface of PrismaÂ(tm), which is not what it should be. I want to eliminate
the "Â" char from the HTML view. How do I go about this?

You asked for UTF-8 output. If you represent the Unicode character #153 in 
UTF-8, you get 2 bytes: <C2 99>. If you then view this output in an 
environment that does not recognize UTF-8, those bytes will be displayed as
characters from some other encoding. In your case, they are being mistakenly 
assumed to be windows-1252 bytes.
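
A minimal sketch of that misreading, using Python purely for illustration: encode code point 153 as UTF-8, then decode the resulting bytes as windows-1252, the way a browser unaware of the real encoding would.

```python
# Code point 153 (U+0099), which is what a reference to #153 denotes.
char = chr(153)

# Serialized as UTF-8, as requested by encoding="UTF-8".
utf8_bytes = char.encode("utf-8")
print(" ".join(f"{b:02X}" for b in utf8_bytes))  # C2 99

# Misread one byte at a time as windows-1252: two characters appear.
misread = utf8_bytes.decode("cp1252")
print([f"U+{ord(c):04X}" for c in misread])  # ['U+00C2', 'U+2122']
```

U+00C2 is the stray "Â". The second byte happens to land on the trademark sign in windows-1252, which is why the output looks almost right.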

Really, you wanted Unicode character #8482, which in UTF-8 is 3 bytes:
<E2 84 A2>. That is going to look like an even uglier mess until you
correct the other problem: your web browser does not know that the
HTML is UTF-8 encoded.
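
The same check for code point 8482, again as a Python sketch and assuming the browser falls back to windows-1252: the three UTF-8 bytes come out as three unrelated characters, while decoding them as UTF-8 gives the trademark sign back.

```python
trademark = chr(8482)  # U+2122 TRADE MARK SIGN

utf8_bytes = trademark.encode("utf-8")
print(" ".join(f"{b:02X}" for b in utf8_bytes))  # E2 84 A2

# Misread as windows-1252: three characters instead of one.
misread = utf8_bytes.decode("cp1252")
print([f"U+{ord(c):04X}" for c in misread])  # ['U+00E2', 'U+201E', 'U+00A2']

# Read correctly as UTF-8: the original character comes back.
print(utf8_bytes.decode("utf-8") == trademark)  # True
```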

Your XSLT processor should have added <meta http-equiv="Content-Type"
content="text/html;charset=UTF-8"> to the <head> of your HTML output. This
meta tag will tell your browser that the document's bytes are UTF-8 encoded
characters.
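
If you want to confirm whether the meta tag actually made it into the output, something like the following would do. It is only a sketch in Python using the standard `html.parser` module. `CharsetFinder` and `declared_charset` are hypothetical names, and the two sample documents are made up for the demonstration.

```python
from html.parser import HTMLParser


class CharsetFinder(HTMLParser):
    """Record the charset declared in a Content-Type meta tag, if any."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "content-type":
            return
        # The content attribute looks like "text/html;charset=UTF-8".
        for part in (attributes.get("content") or "").split(";"):
            name, _, value = part.strip().partition("=")
            if name.lower() == "charset":
                self.charset = value.strip()


def declared_charset(html_text: str):
    """Return the charset named in the document's meta tag, or None."""
    finder = CharsetFinder()
    finder.feed(html_text)
    return finder.charset


# Made-up sample outputs: one with the meta tag, one without a head at all.
with_meta = ('<html><head><meta http-equiv="Content-Type" '
             'content="text/html;charset=UTF-8"><title>t</title></head>'
             '<body>x</body></html>')
without_meta = "<html><body>x</body></html>"

print(declared_charset(with_meta))     # UTF-8
print(declared_charset(without_meta))  # None
```

A result of None means the browser is left to guess the encoding.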

I suspect that your XSLT processor did not do this because you did not put a
<head> in your document, which is an HTML error anyway. Fix that. All HTML 
documents require a head, title and body:

<html>
  <head>
    <title>...</title>
  </head>
  <body>
    ...
  </body>
</html>


Mike

-- 
  Mike J. Brown   |  http://skew.org/~mike/resume/
  Denver, CO, USA |  http://skew.org/xml/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


