xsl-list

Re: [xsl] Need XPath 2.0 expression which returns a non-empty paragraph element that is preceded by a long uninterrupted series of empty paragraph elements

2019-11-25 15:14:14
Roger,

Try this:

 //p[. != '&#160;'][exists(preceding-sibling::p[20])]
 [empty(preceding-sibling::p[position() le 20][not(. = '&#160;')])]

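... because the preceding-sibling axis is a reverse axis, position() counts 
backwards from the context node, so positions 1 to 20 are the 20 nearest 
preceding p siblings and we can apply a position test to them.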
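
To cross-check the logic outside an XPath 2.0 processor, here is a minimal 
Python sketch of the same idea using only the standard library. The names 
after_long_blank_run, text_of and run_length are made up for illustration, and 
the sketch assumes that "long" means a full window of 20 preceding p siblings 
must exist and that all of them contain exactly one character 160.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # character 160, the non-breaking space


def text_of(p):
    """String value of an element, roughly what XPath's string(.) returns."""
    return "".join(p.itertext())


def after_long_blank_run(xml_text, run_length=20):
    """Return non-blank p elements whose nearest `run_length` preceding
    p siblings each contain exactly one NBSP (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    hits = []
    for parent in root.iter():
        # Only p children count, mirroring preceding-sibling::p.
        paras = [child for child in parent if child.tag == "p"]
        for i, p in enumerate(paras):
            if text_of(p) == NBSP:
                continue  # the candidate itself must not be a blank paragraph
            # The slice plays the role of [position() le 20] on the reverse axis.
            nearest = paras[max(0, i - run_length):i]
            if len(nearest) == run_length and all(
                text_of(q) == NBSP for q in nearest
            ):
                hits.append(p)
    return hits
```

Run against the two sample documents quoted below, this should pick out only 
"Text at bottom" in the first and nothing in the second.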

Note that for the purposes of this XPath, an empty <p/> does not count as 
empty, because its string value is not character 160. So adjustments are 
probably called for.
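
One possible adjustment, sketched in Python with a hypothetical is_blank 
helper: treat a paragraph as blank when nothing is left after removing 
character 160 and ordinary whitespace. Whether <p/> and whitespace-only 
paragraphs should count as blank is an assumption here, not something the 
original question settles. In XPath, a test along the lines of 
not(normalize-space(translate(., '&#160;', ' '))) expresses a similar idea when 
the expression is embedded in XML.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # character 160, the non-breaking space


def is_blank(p):
    """True for <p/>, NBSP-only and whitespace-only paragraphs
    (hypothetical helper; the definition of "blank" is an assumption)."""
    text = "".join(p.itertext())
    # Drop NBSP explicitly, then strip ordinary whitespace.
    return text.replace(NBSP, "").strip() == ""
```

Swapping a test like this in for the equality comparison with character 160 
would make <p/> count as part of the blank run.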

Cheers, Wendell

-----Original Message-----
From: Costello, Roger L. costello(_at_)mitre(_dot_)org 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> 
Sent: Monday, November 25, 2019 2:39 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Need XPath 2.0 expression which returns a non-empty paragraph 
element that is preceded by a long uninterrupted series of empty paragraph 
elements

Hi Folks,

I want to know if an XHTML document contains a non-empty paragraph (p) element 
that is preceded by a long, uninterrupted series of paragraph elements, each 
containing just a non-breaking space character (decimal 160). Let's assume that 
"long" means 20. For example, here is an excerpt of an XHTML document:

<body>
    <p>Text at top</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>Text at bottom</p>
</body>

The query should return this:

<p>Text at bottom</p>

because it is preceded by a long, uninterrupted series of paragraph elements, 
each containing just a non-breaking space character.

Here's a query that returns the desired paragraph element:

//p[string-length(.) gt 1][count(preceding-sibling::p[. eq '&#160;']) ge 20]

However, if I insert a non-empty paragraph element in the middle of that long 
series:

<body>
    <p>Text at top</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>Other text</p>  <------------
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>&#160;</p>
    <p>Text at bottom</p>
</body>

then my query erroneously returns the same paragraph element. That is, my XPath 
query does not account for the requirement that the long series of paragraph 
elements be uninterrupted. How can I write an XPath 2.0 query for this?

/Roger