xsl-list

RE: How to URL encode?

2005-04-23 01:36:47
Well, maybe I'm misstating the type of encoding I need. Here's the code
again (without any extraneous HTML):

<xsl:variable name="ItemLink" select="concat($MediaPlexURL, Link)"/>
{$ItemLink}

Here's what it outputs (I see that it encoded the ampersands, nice, but
I need more than just that):

http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&amp;item=7151201092&amp;category=47351

Here's what I need it to output (note the : and ?):

http%3A//cgi.ebay.com/ws/eBayISAPI.dll%3FViewItem&amp;item=7151201092&amp;category=47351


Firstly, the URL encoding should be done by the HTML serializer if the URL
is used as the value of an HTML URL attribute, such as <a href="">. We can't
see from this snippet whether you are using HTML serialization: it depends
on the <xsl:output>. The references that MDP pointed to suggest that
Sablotron does do URL encoding as the spec requires - the complaint was from
a user who didn't like the behaviour defined in the XSLT spec.

Secondly, the URL encoding defined in the XSLT specification does not %HH
encode any ASCII characters. The encoding rules defined in RFC 2396 require
escaping only for characters that aren't being used with their special
meaning in URLs, and the rules for the HTML output method restrict it
further to non-ASCII characters. This means it won't escape the ":" or "?"
in your example. If you escape the ":" in "http://", then the resulting
string is no longer parseable as a URL, so it's not clear why you want to do
it. (There are use cases for this, e.g. when you want the value of a query
parameter in a URL to be itself a URL, but you need to explain why you want
this unusual behaviour.)
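
To make the distinction concrete, here is a rough sketch in Python (not
XSLT) of the kinds of escaping discussed above. It only illustrates the
behaviour, it is not what any XSLT processor runs internally.
escape_non_ascii approximates the HTML output method by leaving every ASCII
character alone; escape_as_parameter_value escapes everything reserved, which
is what you would want when a URL is embedded as a query parameter value.
The function names and the tracker.example prefix are made up for the
example; the real value of $MediaPlexURL is not shown in this thread.

```python
from urllib.parse import quote, unquote

# The item URL from the question, with "&" written literally
# (the "&amp;" in the output is markup escaping, not URL escaping).
ITEM_URL = "http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=7151201092&category=47351"

# Hypothetical redirect prefix standing in for $MediaPlexURL.
TRACKER_PREFIX = "http://tracker.example/redirect?url="


def escape_non_ascii(uri: str) -> str:
    """Approximation of the HTML output method's URI escaping:
    %HH-escape only non-ASCII characters, using their UTF-8 octets."""
    return "".join(ch if ord(ch) < 128 else quote(ch, safe="") for ch in uri)


def escape_as_parameter_value(uri: str) -> str:
    """Escape every character except unreserved ones, so the whole URI
    can travel as the value of a single query parameter."""
    return quote(uri, safe="")


# ASCII-only URL: the HTML-style escaping changes nothing, ":" and "?" included.
unchanged = escape_non_ascii(ITEM_URL)

# The output asked for in the question: ":" and "?" escaped, "/", "&", "=" kept.
requested = quote(ITEM_URL, safe="/&=")

# A URL carried inside another URL, and recovered again on the other side.
embedded = TRACKER_PREFIX + escape_as_parameter_value(ITEM_URL)
recovered = unquote(embedded[len(TRACKER_PREFIX):])
```

Note that the string in "requested" no longer starts with a recognisable
"http:" scheme, which is why it only makes sense as part of a larger URL.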

Finally, the message that MDP pointed to discusses one of the problems with
URL escaping, namely the question of what encoding is assumed. RFC 2396
allows you to encode the special characters using any encoding you like, and
then %HH escape the resulting octets. Many more recent standards, including
XSLT, require the encoding to be UTF-8, and I believe that most modern
browsers will handle this; for safety, however, it's probably best to use
UTF-8 as the HTML document encoding at the same time, so that there's no
confusion. This encoding question isn't really relevant to your problem as
the characters you want to escape are ASCII.
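
For what it's worth, the encoding ambiguity is easy to demonstrate. The
following Python sketch (again only an illustration, with a sample string of
my own choosing) escapes the same non-ASCII character under two encodings
and shows that a receiver assuming the wrong one does not get the original
text back.

```python
from urllib.parse import quote, unquote

text = "caf\u00e9"  # "café": one non-ASCII character, U+00E9

# Same character, different octets, different %HH escapes.
utf8_escaped = quote(text, encoding="utf-8")      # UTF-8 octets C3 A9
latin1_escaped = quote(text, encoding="latin-1")  # ISO-8859-1 octet E9

# Decoding with the encoding that was used for escaping round-trips cleanly.
utf8_roundtrip = unquote(utf8_escaped, encoding="utf-8")
latin1_roundtrip = unquote(latin1_escaped, encoding="latin-1")

# Decoding Latin-1 escapes as UTF-8 (the default) yields a replacement
# character instead of the original letter.
mismatched = unquote(latin1_escaped)
```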

I think the question of which processor you are using is a red herring here
- there's no evidence so far that your processor isn't behaving
conformantly.

Michael Kay
http://www.saxonica.com/ 


