The page http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
explains how UTF-8 characters are built.
From it you can see that the character 0xE0 (U+00E0) is encoded as the
two bytes 0xC3 0xA0, which is where %C3%A0 comes from.
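
A minimal sketch of that construction in Python, assuming only the two-byte case (code points U+0080 to U+07FF). The function name utf8_two_byte is made up for illustration; it is not from any library.

```python
def utf8_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes."""
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("this sketch only handles the two-byte range")
    lead = 0xC0 | (code_point >> 6)      # 110xxxxx carries the upper 5 bits
    trail = 0x80 | (code_point & 0x3F)   # 10xxxxxx carries the lower 6 bits
    return bytes([lead, trail])


encoded = utf8_two_byte(0xE0)
print(encoded.hex())                          # c3a0
print(encoded == "\u00e0".encode("utf-8"))    # True
```

For 0xE0 the upper bits are 00011 and the lower bits are 100000, so the two bytes are 0xC3 and 0xA0.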
regards
Kaarle
----- Original Message -----
From: "Peter Hollingsworth" <peter(_at_)hollingsworth(_dot_)net>
To: <xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com>
Sent: Wednesday, January 14, 2004 8:05 AM
Subject: Re: [xsl] escaping an accented character
OK, sorry to be so dense, but your reference says the following:
We recommend that user agents adopt the following convention
for handling non-ASCII characters in such cases:
1. Represent each character in UTF-8 (see [RFC2279]) as one or more bytes.
2. Escape these bytes with the URI escaping mechanism (i.e., by converting
   each byte to %HH, where HH is the hexadecimal notation of the byte value).
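
The two steps quoted above can be sketched in Python roughly as follows. The helper escape_non_ascii is a hypothetical name used only for illustration, and it assumes ASCII characters are passed through unchanged; urllib.parse.quote from the standard library is shown next to it for comparison.

```python
from urllib.parse import quote


def escape_non_ascii(text: str) -> str:
    """Percent-escape the UTF-8 bytes of every non-ASCII character."""
    pieces = []
    for ch in text:
        if ord(ch) < 0x80:
            pieces.append(ch)  # assumption: ASCII is left as-is in this sketch
        else:
            # Step 1: represent the character as UTF-8 bytes.
            # Step 2: write each byte as %HH.
            pieces.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(pieces)


print(escape_non_ascii("voil\u00e0"))        # voil%C3%A0
print(quote("\u00e0"))                        # %C3%A0 (UTF-8 is the default)
print(quote("\u00e0", encoding="latin-1"))    # %E0
```

The last line shows where %E0 comes from: it is what you get when step 1 uses ISO-8859-1 instead of UTF-8, so the code point 224 is written directly as a single byte.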
So following the instructions, I take the character for à, which according
to everything I can find is 224. I convert this to hex, E0. Voila, the
escaped value is %E0. Where do you get %C3%A0 out of this? Obviously the
xsl parser agrees with you, but I don't see where the value is coming
from.
Thanks.
--Peter
At 10:09 PM 1/13/2004 +0100, you wrote:
Peter Hollingsworth wrote:
The character à ('a' with a grave accent) appears in a node in my XML.
When I use an XSLT to output the node in the href of a link in an HTML
page, the character gets escaped as %C3%A0, which is completely wrong
(it should be escaped as %E0). Similar problems occur with all accented
characters.
It's exactly the right thing. See:
<http://www.w3.org/TR/html401/appendix/notes.html#h-B.2.1>
Both the XSL and the XML file have encoding="UTF-8" (unicode, I believe).
That's irrelevant here.
Any suggestions? Thanks.
Fix the server, if you can. The URI is just fine.
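
As a rough illustration of what fixing the server could mean, assuming the server currently decodes escaped bytes as ISO-8859-1 (the thread does not say what the server actually does), compare the two decodings with Python's urllib.parse.unquote:

```python
from urllib.parse import unquote

escaped = "%C3%A0"

# Decoding the escaped bytes as UTF-8 recovers the original character.
as_utf8 = unquote(escaped, encoding="utf-8")
# Decoding the same bytes as ISO-8859-1 yields two unrelated characters.
as_latin1 = unquote(escaped, encoding="latin-1")

print(as_utf8 == "\u00e0")   # True
print(len(as_latin1))        # 2
```

A server that decodes the path or query as UTF-8 gets à back; one that assumes ISO-8859-1 sees two characters (U+00C3 followed by U+00A0) instead.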
Julian
--
<green/>bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list