The XPath parser used in Saxon is a hand-written hybrid of a recursive-descent
parser and a precedence parser.
The very first version was actually derived from James Clark's xt parser, and
over the years it evolved out of all recognition, but without ever being
redesigned from scratch. I have no idea if throwing it away and starting again
with a generated parser would give any benefits. I suspect that having a
hand-written parser gives us more control over diagnostics and error recovery,
and it also enables us to support multiple grammars (different versions of
XPath and XQuery, plus XSLT patterns) within a single parsing framework.
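To illustrate what a hybrid of recursive descent and a precedence parser can look like, here is a minimal sketch in Python. It is not Saxon's code (Saxon is written in Java), and the toy grammar, the PRECEDENCE table and the Parser class are all assumptions made up for the example. Primaries (numbers, names, parenthesized expressions) are handled by ordinary recursive descent, while binary operators are handled by a precedence-climbing loop driven by a table.

```python
import re

# Hypothetical operator table for a toy grammar: higher number binds tighter.
PRECEDENCE = {"or": 1, "and": 2, "=": 3, "+": 4, "-": 4, "*": 5, "div": 5}

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[()=+*-])")


def tokenize(text):
    """Split the input into numbers, names and single-character symbols."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        tree = self.parse_expr(1)
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def parse_expr(self, min_prec):
        # Precedence part: loop over binary operators, consulting the table.
        left = self.parse_primary()
        while self.peek() in PRECEDENCE and PRECEDENCE[self.peek()] >= min_prec:
            op = self.next()
            # Parsing the right operand at a higher level gives left-associativity.
            right = self.parse_expr(PRECEDENCE[op] + 1)
            left = (op, left, right)
        return left

    def parse_primary(self):
        # Recursive-descent part: one branch per kind of primary expression.
        tok = self.next()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            inner = self.parse_expr(1)
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if tok.isdigit():
            return ("num", int(tok))
        if tok[0].isalpha() or tok[0] == "_":
            # Names are not reserved: in operand position "or" is just a name.
            return ("name", tok)
        raise SyntaxError(f"unexpected token {tok!r}")
```

For example, Parser("1 + 2 * 3").parse() builds a tree in which the multiplication is nested under the addition. Adding a new operator to this sketch means adding a table entry, and adding a new kind of primary means adding a branch to parse_primary, which is one reason a single hand-written framework can be adapted to several related grammars.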
XPath (and even more so XQuery) has a lot of ad-hoc rules for resolving
ambiguities, for example the rule that in the expression (/ or /*), "or" parses
as an element name, not as a binary operator (I don't think this ambiguity was
even discovered for many years after XPath 1.0 was published). Again, I think
it's probably easier to implement such ad-hoc rules with a hand-written parser.
But someone who understands a particular parser generator well could probably
find a way to do it.
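As a rough illustration of how such an ad-hoc rule can be coded by hand, here is a small Python sketch of a context-sensitive token classifier. It is loosely modelled on the lexical disambiguation rule in XPath 1.0 (a name or "*" is an operator only if the preceding token could end an operand); it is not Saxon's implementation, it covers only a tiny subset of the token set, and the names classify, OPERATOR_NAMES, OPERATOR_SYMBOLS and OPENERS are invented for the example.

```python
import re

# Assumed, deliberately incomplete token sets for the sketch.
OPERATOR_NAMES = {"or", "and", "div", "mod"}
OPERATOR_SYMBOLS = {"/", "//", "|", "=", "+", "-"}
OPENERS = {"@", "::", "(", "[", ","}  # tokens after which an operand is expected

TOKEN = re.compile(r"\s*(//|::|[A-Za-z_]\w*|\d+|[/()\[\],@|=+*-])")


def classify(text):
    """Return (kind, token) pairs, deciding name vs. operator from context."""
    result, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tok = m.group(1)
        pos = m.end()
        prev_kind, prev_tok = result[-1] if result else (None, None)
        # True only when the previous token could be the end of an operand.
        after_operand = bool(result) and prev_kind != "operator" and prev_tok not in OPENERS
        if tok in OPERATOR_NAMES:
            kind = "operator" if after_operand else "name"
        elif tok == "*":
            kind = "operator" if after_operand else "wildcard"
        elif tok in OPERATOR_SYMBOLS:
            kind = "operator"
        elif tok.isdigit():
            kind = "number"
        elif tok[0].isalpha() or tok[0] == "_":
            kind = "name"
        else:
            kind = "punct"
        result.append((kind, tok))
    return result
```

With this rule, classify("/ or /*") treats "or" as a name and "*" as a wildcard, because each follows "/", whereas classify("a or b") treats "or" as an operator because it follows a name.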
On 9 May 2022, at 21:35, Roger L Costello costello(_at_)mitre(_dot_)org wrote:
Are XPath expressions parsed using compiler parsing algorithms?
Michael Kay responded:
Yes, of course.
Hi Michael, does that mean each time Saxon encounters a place in an XSLT
program where an XPath expression is expected, Saxon sends the expression
into an XPath parser which tokenizes the expression, parses it into a syntax
tree, and then traverses the tree to evaluate the expression? Did you use a
parser generator to auto-generate the parser? If yes, which parser generator
did you use? If you didn't use a parser generator, why not?
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list