Re: [xsl] Parsing XPath in XSLT?

2020-03-26 06:34:19
On 25/03/2020 17:31, Wendell Piez wapiez(_at_)wendellpiez(_dot_)com wrote:
I am currently having to interpret XPath or (more likely) an XPath
subset into an abstract representation that can be rewritten into
various forms. Naturally I would like to do this out of a parse tree
or the functional equivalent, represented in some sort of XML, since
serializing that back out is easy enough. It is producing that tree
that is a problem. I need a parser for XPath or if not for all of
XPath, then at least for my subset -- which includes namespaces. So
even if partial the model must expose names and namespaces to the
extent that a path rewriter can (for example) map into a new set of
namespace prefixes --

Any thoughts? Open source projects I should take a look at? Have the
community-standards initiatives captured any good work in this area?

The simplest approach is to use Gunther Rademacher's REx parser generator (https://www.bottlecaps.de/rex/) to generate an XSLT parser for XPath 3.1. Download the sample grammar for XPath31 (left-hand column on the page), then generate the parser with XSLT as the target, backtracking on, and the parse tree option checked under Generate Code. Hitting 'Generate' should produce a download of a file xpath-31.xslt, which contains a function p:parse-XPath($expression as xs:string) as element() [xmlns:p="xpath-31"].
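
A minimal sketch of calling the generated parser from another stylesheet might look like the following. It assumes xpath-31.xslt sits next to this stylesheet; the parameter name "expression" and the use of an initial template are choices made for illustration and are not part of the generated code.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:p="xpath-31"
    exclude-result-prefixes="#all">

  <!-- Assumption: the REx-generated parser is in the same directory. -->
  <xsl:import href="xpath-31.xslt"/>

  <!-- Hypothetical parameter: the XPath expression to parse. -->
  <xsl:param name="expression" as="xs:string" select="'1 to 5'"/>

  <xsl:output indent="yes"/>

  <!-- Emit the parse tree for the expression as the result document. -->
  <xsl:template name="xsl:initial-template">
    <xsl:sequence select="p:parse-XPath($expression)"/>
  </xsl:template>

</xsl:stylesheet>
```

Run with the initial template as the entry point (for example, -it in Saxon), this should write out the parse tree for inspection.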

Evaluating this function produces the parse tree (assuming of course that the syntax of the expression is correct) as an XML tree whose element names correspond to the recursive grammar productions, with literals, names, etc. and tokens as the leaves. So for example, 1 to 5 parses as the deep tree:

    <RangeExpr> .......
       <IntegerLiteral>1</IntegerLiteral> .........

Namespace-bearing terms such as charlie and bar:fred generate <QName>charlie</> and <QName>bar:fred</> leaves, so all the information is preserved.
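
Because the prefixes survive in the QName leaves, they can be pulled straight out of the tree. A small sketch, assuming the leaves are named QName as described above; the function name f:prefixes and its namespace URI are hypothetical:

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:p="xpath-31"
    xmlns:f="urn:example:xpath-tools"
    exclude-result-prefixes="#all">

  <!-- Assumption: the REx-generated parser is in the same directory. -->
  <xsl:import href="xpath-31.xslt"/>

  <!-- Returns the distinct namespace prefixes used by prefixed QName
       leaves in the parse tree of the given expression. -->
  <xsl:function name="f:prefixes" as="xs:string*">
    <xsl:param name="expression" as="xs:string"/>
    <xsl:sequence select="distinct-values(
        p:parse-XPath($expression)//QName[contains(., ':')]
          ! substring-before(., ':'))"/>
  </xsl:function>

</xsl:stylesheet>
```

This only looks at QName leaves, so prefixes that appear in other kinds of token (wildcards such as bar:*, for instance) would need their own handling.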

The tree is easily manipulated with XSLT, and converting it back to a valid XPath expression string is pretty simple, with something along the lines of:

    <xsl:mode name="parse2text" on-no-match="shallow-skip"/>
    <xsl:template match="TOKEN[. = ('to')]" mode="parse2text"> {.} </xsl:template>
    <xsl:template match="TOKEN[. = (',')]" mode="parse2text">{.} </xsl:template>
    <xsl:template match="TOKEN|Literal|QName" mode="parse2text">{.}</xsl:template>

Using this it is pretty simple to write a small stylesheet that processes:

   <samples xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:bar="BARBER" xmlns:charlie="CHARLIE" xmlns:delta="DELTA" >
        <remap from="xsl" to="charlie"/>
        <remap from="bar" to="delta"/>
        <remap from="delta" to="bar"/>
        <xpath>charlie, xsl:foo, $bar, xsl, $bar:fred</xpath>
        <xpath>map{'a': 1 to 5, $b : delta:X}</xpath>

and produces a result:

   <samples xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
       <remap from="xsl" to="charlie"/>
       <remap from="bar" to="delta"/>
       <remap from="delta" to="bar"/>
          <source>charlie, xsl:foo, $bar, xsl, $bar:fred</source>
          <textFromParse>charlie, xsl:foo, $bar, xsl,
          <modified>charlie, charlie:foo, $bar, xsl, $delta:fred</modified>
          <source>map{'a': 1 to 5, $b : delta:X}</source>
          <textFromParse>map{'a': 1 to 5, $b: delta:X}</textFromParse>
          <modified>map{'a': 1 to 5, $b: bar:X}</modified>
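
A stylesheet producing output of this shape could be sketched roughly as follows. This is not the stylesheet that generated the result above, just one way to do it under some assumptions: xpath-31.xslt is in the same directory, the result element names are taken from the sample output, the tunnel parameter "remaps" is a name chosen here, and the rule for ':' is a guess based on the spacing in the sample result. Only prefixed QName leaves are remapped; wildcards such as bar:* and keyword operators such as "and" or "div" would need further rules.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:p="xpath-31"
    exclude-result-prefixes="#all"
    expand-text="yes">

  <!-- Assumption: the REx-generated parser is in the same directory. -->
  <xsl:import href="xpath-31.xslt"/>

  <!-- Copy <samples> and <remap> through unchanged. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Parse tree back to text: the rules quoted in the message. -->
  <xsl:mode name="parse2text" on-no-match="shallow-skip"/>
  <xsl:template match="TOKEN[. = ('to')]" mode="parse2text"> {.} </xsl:template>
  <xsl:template match="TOKEN[. = (',')]" mode="parse2text">{.} </xsl:template>
  <xsl:template match="TOKEN|Literal|QName" mode="parse2text">{.}</xsl:template>
  <!-- Assumption: a space after ':' as well, inferred from the sample result. -->
  <xsl:template match="TOKEN[. = ':']" mode="parse2text">{.} </xsl:template>

  <!-- Tree-to-tree pass that rewrites prefixes on QName leaves. -->
  <xsl:mode name="remap" on-no-match="shallow-copy"/>
  <xsl:template match="QName[contains(., ':')]" mode="remap">
    <xsl:param name="remaps" as="element(remap)*" tunnel="yes"/>
    <xsl:variable name="prefix" select="substring-before(., ':')"/>
    <!-- Single lookup per name, so bar->delta and delta->bar swap cleanly. -->
    <xsl:variable name="target" select="$remaps[@from = $prefix][1]/@to"/>
    <QName>{($target, $prefix)[1]}:{substring-after(., ':')}</QName>
  </xsl:template>

  <!-- Parse each expression, then emit source, round trip and remapped text. -->
  <xsl:template match="xpath">
    <xsl:variable name="tree" select="p:parse-XPath(string(.))"/>
    <xsl:variable name="remapped" as="element()">
      <xsl:apply-templates select="$tree" mode="remap">
        <xsl:with-param name="remaps" select="../remap" tunnel="yes"/>
      </xsl:apply-templates>
    </xsl:variable>
    <source>{.}</source>
    <textFromParse>
      <xsl:apply-templates select="$tree" mode="parse2text"/>
    </textFromParse>
    <modified>
      <xsl:apply-templates select="$remapped" mode="parse2text"/>
    </modified>
  </xsl:template>

</xsl:stylesheet>
```

Doing the remapping as a separate pass over the tree, rather than during serialization, keeps the parse2text rules independent of any rewriting and leaves the remapped tree available for further processing.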

(I've sent you the appropriate files privately, so you can run them yourself and look at the parse trees - works fine in Oxygen)

*John Lumley* MA PhD CEng FIEE
john(_at_)saxonica(_dot_)com
on behalf of Saxonica Ltd