Well, a predicate using name()='p' is bad news because it depends on namespace
prefixes, which are arbitrary. Use self::p, assuming it's a no-namespace
element, or self::xhtml:p if it's in the XHTML namespace.
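As a rough sketch, here is the expression from the original question with only that substitution made. It assumes the p elements are in no namespace, as in the sample below, and that the prefix o is bound to urn:schemas-microsoft-com:office:office in the static context.

```xpath
(: Same logic as the original, but the element test no longer
   depends on how the name happens to be written in the source. :)
//p[o:p eq ' ']
   [count(following-sibling::*[position() le 10]
                              [self::p]
                              [o:p eq ' ']) ge 10][1]
```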
You could also do something like
//p[o:p eq ' '][every $p in following-sibling::*[position() le 10]
satisfies $p[self::p/o:p eq ' ']]
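One caveat with the quantified form, offered as a sketch rather than a tested answer: every ... satisfies is true over an empty or short sequence, so a blank paragraph with fewer than ten following siblings, all of them blank, would also be selected. Under the same namespace assumptions (p in no namespace, o bound to the Office namespace), an explicit length check restores the "at least 10" requirement:

```xpath
(: First predicate: the paragraph itself is blank.
   Second predicate: there really are ten following siblings to inspect.
   Third predicate: each of those ten is a blank p. :)
//p[o:p eq ' ']
   [count(following-sibling::*[position() le 10]) eq 10]
   [every $p in following-sibling::*[position() le 10]
    satisfies $p[self::p/o:p eq ' ']]
```

If the quoted character is meant to be U+00A0, codepoints-to-string(160) can be used in place of the literal to make that explicit.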
Michael Kay
On 14 Oct 2019, at 18:09, Costello, Roger L. costello(_at_)mitre(_dot_)org
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Hi Folks,
As you may know, when a formatted email message is created in Outlook,
Outlook generates HTML under the hood.
I am trying to determine if a formatted email message has text at the bottom
of the email message that is separated from the rest of the email by a lot of
space. In other words, the text at the bottom of the underlying HTML is
preceded by a bunch of non-breaking space characters (&nbsp;).
Assume the HTML has been converted to XHTML.
I need an XPath 2.0 expression that identifies a long block of non-breaking
space characters.
Outlook generates HTML like that shown below. The non-breaking space
character is nested inside an <o:p> element, which is nested inside a <p>
element.
I came up with this XPath expression:
//p[o:p eq ' '][count(following-sibling::*[position() le 10][name() eq
'p'][o:p eq ' ']) ge 10][1]
It says, "Give me the first <p> element containing a non-breaking space
character such that there are at least 10 <p> elements that immediately
follow it, each containing a non-breaking space character." At least, that's
what I think it says. Note: 10 is an arbitrary number.
Questions:
1. Do you see any problems with the XPath expression?
2. Is there a better XPath expression?
<html xmlns:o="urn:schemas-microsoft-com:office:office">
<p class="MsoNormal">top text<o:p/></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">bottom text<o:p/></p>
</html>