xsl-list

Re: [xsl] Need an XPath 2.0 expression that identifies a long block of uninterrupted non-blocking space characters in an XHTML document

2019-10-14 13:01:12
There was a problem when pasting. One correct expression is this:

for $nb in ' '
      return
        (p[. eq $nb and not(following-sibling::*[position() le 10][not(self::p)])
            and following-sibling::*[position() le 10][. eq $nb]
          ])[1]



On Mon, Oct 14, 2019 at 10:55 AM Dimitre Novatchev 
dnovatchev(_at_)gmail(_dot_)com <
xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

for $nb in '&#160;'
      return
        (p[. eq $nb and following-sibling::*[position() le 10][self::p]][. eq $nb])[1]

On Mon, Oct 14, 2019 at 10:08 AM Costello, Roger L. 
costello(_at_)mitre(_dot_)org <
xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Hi Folks,

As you may know, when a formatted email message is created in Outlook,
Outlook generates HTML under the hood.

I am trying to determine if a formatted email message has text at the
bottom of the email message that is separated from the rest of the email by
a lot of space. In other words, the text at the bottom of the underlying
HTML is preceded by a bunch of non-breaking space characters (&#160;).

Assume the HTML has been converted to XHTML.

I need an XPath 2.0 expression that identifies a long block of
non-breaking space characters.

Outlook generates HTML like that shown below. The non-breaking space
character is nested inside an <o:p> element, which is nested inside a <p>
element.

I came up with this XPath expression:

//p[o:p eq '&#160;'][count(following-sibling::*[position() le 10][name() eq 'p'][o:p eq '&#160;']) ge 10][1]

It says, "Give me the first <p> element containing a non-breaking space
character such that there are at least 10 <p> elements that immediately
follow it, each containing a non-breaking space character." At least,
that's what I think it says. Note: 10 is an arbitrary number.

Questions:
1. Do you see any problems with the XPath expression?
2. Is there a better XPath expression?

<html xmlns:o="urn:schemas-microsoft-com:office:office">
    <p class="MsoNormal">top text<o:p/></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal"><o:p>&#160;</o:p></p>
    <p class="MsoNormal">bottom text<o:p/></p>
</html>
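
As a cross-check on what the expression is meant to select, the same rule can be sketched procedurally. This is only a minimal sketch in Python's standard xml.etree.ElementTree, not the XPath solution itself. The function names first_nbsp_run_start and is_nbsp_p and the parameter min_following are made up for illustration, and the test compares the whole string value of each <p> (like ". eq $nb") rather than the <o:p> child alone.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # the character written as &#160; in the XHTML

# Same shape as the sample above: 11 NBSP-only paragraphs between two text ones.
SAMPLE = (
    '<html xmlns:o="urn:schemas-microsoft-com:office:office">'
    '<p class="MsoNormal">top text<o:p/></p>'
    + '<p class="MsoNormal"><o:p>&#160;</o:p></p>' * 11
    + '<p class="MsoNormal">bottom text<o:p/></p>'
    '</html>'
)


def is_nbsp_p(el):
    """True if el is a <p> whose entire string value is one NBSP."""
    return el.tag == "p" and "".join(el.itertext()) == NBSP


def first_nbsp_run_start(root, min_following=10):
    """Return the first <p> (document order) that is NBSP-only and whose next
    min_following sibling elements are all NBSP-only <p> elements, else None."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for el in root.iter("p"):
        parent = parent_of.get(el)
        if parent is None:  # a root <p> has no siblings
            continue
        siblings = list(parent)
        start = siblings.index(el)
        window = siblings[start:start + 1 + min_following]
        if len(window) == min_following + 1 and all(is_nbsp_p(s) for s in window):
            return el
    return None


root = ET.fromstring(SAMPLE)
hit = first_nbsp_run_start(root)
# Position of the match among the children of <html> (0-based).
print(list(root).index(hit))  # prints 1
```

With the sample above, the match is the first NBSP paragraph, directly after "top text". Asking for 11 following paragraphs instead of 10 finds nothing, because only 10 NBSP paragraphs follow the first one.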



--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
To achieve the impossible dream, try going to sleep.
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs.Will they
write all patents, too? :)
-------------------------------------
Sanity is madness put to good use.
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.

XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/782854>



-- 
Cheers,
Dimitre Novatchev