David Carlisle wrote:
The exception is if you are doing the html output method, and writing an
html attribute that is a URI (<a href= or <img src=); then that attribute
is URL encoded (using %ab hex quotes).
Specifically, the XSLT spec suggests that, when using the HTML output method,
the XSLT processor do some escaping of *non-ASCII* characters in href, src,
codebase, and other URI-type attribute values. The XSLT processor is not
required to do so. In my opinion, it shouldn't bother, because it is the
author's responsibility to ensure that the value is a URI reference, not an
IRI.
Non-ASCII characters are not the only characters that need to be escaped in a
URI, but they comprise a set that can never appear unescaped, so the
suggestion to escape them is not harmful. The escape mechanism recommended is
UTF-8 based, so for a character such as Å (Latin capital letter A with ring
above, U+00C5) it would be %C3%85.
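
To make the UTF-8 based escaping concrete, here is a minimal Python sketch of
the idea, not any particular XSLT processor's implementation. The function
name escape_non_ascii and the example.org URI are made up for illustration.
It percent-encodes each non-ASCII character as its UTF-8 bytes and leaves
every ASCII character alone, including ones such as the space that would
still need escaping in a real URI.

```python
def escape_non_ascii(uri_ref: str) -> str:
    """Percent-encode only the non-ASCII characters, using their UTF-8 bytes.

    Hypothetical helper for illustration; ASCII characters pass through
    untouched, even those that are not legal in a URI.
    """
    out = []
    for ch in uri_ref:
        if ord(ch) < 128:
            # ASCII: left as-is (this sketch does no further URI escaping).
            out.append(ch)
        else:
            # Non-ASCII: one %HH triplet per UTF-8 byte of the character.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)


print(escape_non_ascii("Å"))
print(escape_non_ascii("http://example.org/Ångström?q=a b"))
```

The first call prints %C3%85, matching the value above. In the second, the
space in "a b" comes through unescaped, which is why escaping non-ASCII
characters alone does not guarantee a valid URI reference.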
I think this is not the first time I've seen a report of an XSLT processor
applying a bit more escaping than is called for, or doing non-UTF-8 based
escaping, but it is the first time I've heard of it happening in a non-URI
attribute value.
Mike
--
Mike J. Brown | http://skew.org/~mike/resume/
Denver, CO, USA | http://skew.org/xml/
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list