xsl-list

RE: [xsl] How to select for ' in XPATH?

2009-08-05 13:57:13
> I don't really know anything about the shell that you are using and any
> escaping or unescaping that it is doing, so it's a bit hard to tell.

I used this one:
http://www.xmlsh.org

> The general rule in XPath 2.0 is that if a string literal is enclosed
> in single quotes, an apostrophe should be represented as a pair of
> adjacent apostrophes.

I tried that hint as it was given by Martin, too.
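
For reference, the XPath 2.0 rule quoted above can be sketched in
JavaScript as follows. The function name xpath2Literal is made up for
illustration. The result is only meaningful to an XPath 2.0 processor
such as Saxon; an XPath 1.0 engine would not accept the doubled
apostrophe.

```javascript
// Hypothetical helper: wrap a string in single quotes for XPath 2.0,
// doubling every apostrophe as the XPath 2.0 rule requires.
function xpath2Literal(s) {
  return "'" + s.replace(/'/g, "''") + "'";
}

// Example: the literal for a lone apostrophe is four apostrophes.
var expr = "/*/*/*[contains(normalize-space(.)," + xpath2Literal("'") + ")]";
```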

In xmlsh this works:
$ xpath '/*/*/*[contains(normalize-space(.),"""")]' <tst.html
<p>apos and quot: ' " </p>
$ xpath '/*/*/*[contains(normalize-space(.),"''")]' <tst.html
<p>lt and gt: &lt; &gt; </p>
<p>apos and quot: ' " </p>
$

You are right, it is not clear what escaping or unescaping the shell does;
at least I do not see why the second XPath matches both <p> elements.


My real problem seems to be that I need an XPath 1.0 solution, since
I want to do this in a browser environment, right?


The real problem is as follows:
- open an arbitrary web page in Firefox browser

- with a bookmarklet do an arbitrary selection in that page
  (http://en.wikipedia.org/wiki/Bookmarklet)

- then the bookmarklet generates e.g. the following XPath:
  "//*[contains(normalize-space(.),'xyz')]"
  where xyz is replaced by the actual selection data

- then Mozilla's document.evaluate() is used to determine the
  corresponding node in the DOM
  (https://developer.mozilla.org/en/Introduction_to_using_XPath_in_JavaScript)
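
The steps above can be sketched roughly like this. The helper names
buildNaiveXPath and findFirstNode are made up for illustration; only
document.evaluate() and the result type constant (9 is the value of
XPathResult.FIRST_ORDERED_NODE_TYPE) come from the DOM API. The
selection text is placed between apostrophes without any treatment, so
this version only works for selections that contain no apostrophe.

```javascript
// Hypothetical helper: build the expression from the selected text.
// Whitespace is collapsed the way normalize-space() does on the node side.
function buildNaiveXPath(selectionText) {
  var needle = selectionText.replace(/[ \t\r\n]+/g, " ").replace(/^ | $/g, "");
  return "//*[contains(normalize-space(.),'" + needle + "')]";
}

// Hypothetical helper: return the first matching node, or null if none.
function findFirstNode(doc, xpath) {
  var FIRST_ORDERED_NODE_TYPE = 9; // XPathResult.FIRST_ORDERED_NODE_TYPE
  var result = doc.evaluate(xpath, doc, null, FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}

// In a bookmarklet this might be called as:
//   findFirstNode(document, buildNaiveXPath(String(window.getSelection())));
```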

This all works really fine as long as there is no &apos; character in
the selection ...

It is just this case where I need to figure out how to pass the &apos;
character to document.evaluate(). For simplicity, let us assume that
the selection contains only the &apos; character.

The XPATH "//*[contains(normalize-space(.),''')]" is definitely wrong,
but what would be right?

Neither "//*[contains(normalize-space(.),'''')]" nor
"//*[contains(normalize-space(.),'\')]" works.


Interestingly "//*[contains(normalize-space(.),'%20')]"
matches for &quot;

Sadly "//*[contains(normalize-space(.),'%27')]"
does not match for &apos;

This is the JavaScript statement for the evaluation:
e = document.evaluate(unescape(s),document,null,
                      XPathResult.FIRST_ORDERED_NODE_TYPE, null);

Any hint what can be done to make this work?
(I have no control over the webpage nor control over user selection)
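
One possible direction, as an untested sketch only: XPath 1.0 string
literals have no escape mechanism, but a literal may be delimited by
either apostrophes or double quotes. So a text containing &apos; can be
wrapped in double quotes, and a text containing both kinds of quote can
be assembled with concat(). The function names xpathLiteral and
containsXPath are made up for illustration.

```javascript
// Hypothetical helper: turn arbitrary text into an XPath 1.0 string
// expression, choosing a delimiter that does not occur in the text.
function xpathLiteral(s) {
  if (s.indexOf("'") === -1) return "'" + s + "'";   // no apostrophe
  if (s.indexOf('"') === -1) return '"' + s + '"';   // no double quote
  // Both occur: split at each apostrophe and glue the pieces back
  // together with a double-quoted apostrophe inside concat().
  return "concat('" + s.split("'").join("',\"'\",'") + "')";
}

// Hypothetical helper: the expression the bookmarklet would generate.
function containsXPath(selectionText) {
  return "//*[contains(normalize-space(.)," + xpathLiteral(selectionText) + ")]";
}
```

With this, a selection consisting of a single &apos; would give
//*[contains(normalize-space(.),"'")] as the expression to pass to
document.evaluate().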


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Developer, XML Compiler
WebSphere DataPower SOA Appliances
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschaeftsfuehrung: Erich Baier
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


From: "Michael Kay" <mike(_at_)saxonica(_dot_)com>
Sent: 08/05/2009 07:20 PM
To: <xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com>
Subject: RE: [xsl] How to select for &apos; in XPATH?
Please respond to: xsl-list(_at_)lists(_dot_)mulberrytech.com

I don't really know anything about the shell that you are using and any
escaping or unescaping that it is doing, so it's a bit hard to tell. The
general rule in XPath 2.0 is that if a string literal is enclosed in single
quotes, an apostrophe should be represented as a pair of adjacent
apostrophes.

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay



-----Original Message-----
From: Hermann Stamm-Wilbrandt [mailto:STAMMW(_at_)de(_dot_)ibm(_dot_)com]
Sent: 05 August 2009 18:04
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] How to select for &apos; in XPATH?


Hello,

I tried to select special characters with the XPath expressions below.
While I succeeded for some, I am unable to select the
&apos; character (') and get an error message.

Any hint how this can be done?

$ xmlsh
$ cat tst.html
<html><body>
<p>lt and gt: &lt; &gt; </p>
<p>apos and quot: &apos; &quot; </p>
</body></html>
$ tidy -q -xml tst.html;
<html>
  <body>
    <p>lt and gt: &lt; &gt;</p>
    <p>apos and quot: ' "</p>
  </body>
</html>

$ xpath "/*/*/*[contains(normalize-space(.),'<')]" <tst.html
<p>lt and gt: &lt; &gt; </p>
$ xpath "/*/*/*[contains(normalize-space(.),'>')]" <tst.html
<p>lt and gt: &lt; &gt; </p>
$ xpath "/*/*/*[contains(normalize-space(.),'\"')]" <tst.html
<p>apos and quot: ' " </p>
$ xpath "/*/*/*[contains(normalize-space(.),'\'')]" <tst.html
Exception running: xpath
net.sf.saxon.s9api.SaxonApiException: XPath syntax error at char 34 in
{...ontains(normalize-space(.),...}:
    Unmatched quote in expression
$




--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--