xsl-list
[Top] [All Lists]

Re: [xsl] NFA to DFA conversion using XSLT

2007-06-01 10:59:17
Thanks Dimitre for your insightful remarks.

On 6/1/07, Dimitre Novatchev <dnovatchev(_at_)gmail(_dot_)com> wrote:
 I am
> following some examples and algorithms from the book, "Principles of
> compiler design - by Aho, Ullman).
>
> One of the algorithms for NFA to DFA conversion states, that we
> essentially follow the below two steps:
> 1) Compute e-CLOSURE (e stands for epsilon transitions) - which
> involves pushing & popping states from a stack.
> 2) The subset construction - which uses e-CLOSURE function to construct a DFA.
>
> The book also describes an algorithm to minimize the number of states
> of a DFA (whose output is a DFA accepting the same language as
> original DFA, but having as few states as possible).
>
> Do you think, we could do this with XSLT? As I defined the NFA above,
> I can specify the meaning of a DFA similarly.
>
> I recently read on Dimitre's blog, that he has written a framework for
> LR-Parsing using XSLT 2.0. My interest is along the same lines..
>

We can do everything in XSLT.

However, a system's developer may decide how to allocate the available
resources in an optimal way -- this most probably will exclude the
writing in XSLT of parser generators or lexical scanner generators.

What I did implement recently in FXSL is a general table-driven LR
parser. This is small, compact, easy to develop and quite efficient.

However, I decided *not* to develop in XSLT another YACC-like tool,
which given an
LR (1) grammar produces the tables to be used by the parser. Taking
into account that suitable tools exist (like YACC and/or BYSON), it
would be an enormous waste of time for me to write this in XSLT.

Instead, I just "touched" an available open source version of YACC, so
that it now generates the parsing tables in XML format. These are then
used by the table-driven LR parser, which (only) is written in XSLT.
The modified YACC is called YACCX and is available in the Tools folder
of the CVS of the project.

Another reason for my decision is the fact that parsing tables are
produced only once and then they are used many times. As the
production of the parsing tables will be done very rarely (once for a
given language), we can do this offline with whatever tool we already
have available.

As for the lexical analyzer, one could try the same approach and
modify an appropriate tool, such as lex, to produce tables in XML. I
didn't do this on purpose, as I am now having fun learning to use
RegEx expressions and their support in XSLT 2.0. Doing so helps better
understand some intrinsic lexical properties of a given language, and
is therefore quite beneficial.




--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play



On 5/31/07, Mukul Gandhi <gandhi(_dot_)mukul(_at_)gmail(_dot_)com> wrote:
> Hi Mike,
>   Thanks for your reply.
>
> I am toying with the following idea (a sample state table) to
> represent state table of NFA:
>
> <nfastates>
>  <state name="0">
>    <input symbol="a">
>      <movetostate name="0" />
>      <movetostate name="1" />
>    </input>
>    <input symbol="b">
>      <movetostate name="0" />
>    </input>
>  </state>
>  <state name="1">
>    <input symbol="a">
>      <movetostate name="none" />
>    </input>
>    <input symbol="b">
>      <movetostate name="2" />
>    </input>
>  </state>
>  <state name="2">
>    <input symbol="a">
>      <movetostate name="none" />
>    </input>
>    <input symbol="b">
>      <movetostate name="3" />
>    </input>
>  </state>
>  <state name="3" isfinal="yes" />
> </nfastates>
>
> (this is the NFA recognizing the language (a | b)*abb ). I am
> following some examples and algorithms from the book, "Principles of
> compiler design - by Aho, Ullman).
>
> One of the algorithms for NFA to DFA conversion states, that we
> essentially follow the below two steps:
> 1) Compute e-CLOSURE (e stands for epsilon transitions) - which
> involves pushing & popping states from a stack.
> 2) The subset construction - which uses e-CLOSURE function to construct a DFA.
>
> The book also describes an algorithm to minimize the number of states
> of a DFA (whose output is a DFA accepting the same language as
> original DFA, but having as few states as possible).
>
> Do you think, we could do this with XSLT? As I defined the NFA above,
> I can specify the meaning of a DFA similarly.
>
> I recently read on Dimitre's blog, that he has written a framework for
> LR-Parsing using XSLT 2.0. My interest is along the same lines..
>
> On 5/31/07, Michael Kay <mike(_at_)saxonica(_dot_)com> wrote:
> > > I am looking for a suitable W3C Schema to represent state
> > > table of a NFA.
> >
> > I think there are such things in the context of workflow modelling (e.g.
> > BPEL) but it might carry a lot more baggage than you actually want.
> > >
> > > My second question is: is it easily possible to write a XSLT
> > > program (I can prefer 2.0 language if you wish) to convert a
> > > NFA to an equivalent (DFA, with minimal states).
> > >
> > Possible yes. Easy, I doubt it. I found it challenging enough in Java. But I
> > expect someone will prove me wrong!
> >
> > Michael Kay
> > http://www.saxonica.com/

--
Regards,
Mukul Gandhi

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--

<Prev in Thread] Current Thread [Next in Thread>