Re: Why the 822bis grammar is so painful

1999-02-06 23:08:45
Pete Resnick, over the objections of several implementors on DRUMS,
threw away the RFC 822 tokenizer. 

For better or worse, the RFC 2234 ABNF was designed to be usable in
protocol specifications other than those describing email headers.
Many non-email specifications use 822's ABNF, and most of those
have ambiguities or errors, because few of these protocols are
intended to allow the liberal use of white space and comments
that RFC 822 permits.

Indeed, DRUMS decided that 822 was too liberal in allowing white
space and comments, and that the revised mail specification needed
to discourage some of those uses. The grammar therefore had to be
able to specify explicitly where white space and comments are legal.
And since 822's "#" notation implicitly allows white space and
comments, that had to go as well.

One might say that both 822's ABNF and 2234's ABNF share the
same deficiency: the failure to separate lexical analysis
from parsing.  (To be fair, 822 does have a prose description
of its lexical analysis step, but most of the specifications
that borrowed 822's ABNF notation did not describe their own
lexical analysis.)
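
To make the lexer/parser split concrete, here is a minimal Python
sketch. It is an illustration only, not the RFC 822 rules in full:
the names tokenize and parse_atom_list are hypothetical, the set of
specials is taken as an assumption, and atoms are not checked for
control characters. The lexer drops white space and nested comments,
so the parser's list rule (roughly 822's "#atom") never mentions them.

```python
# Simplified sketch; assumes this set of "specials" and ignores CTL checks.
SPECIALS = set('()<>@,;:\\".[]')
WHITESPACE = " \t\r\n"

def tokenize(text):
    """Lexical step: return tokens, dropping white space and comments."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch in WHITESPACE:
            i += 1                      # white space only separates tokens
        elif ch == "(":
            depth = 0                   # comments nest, so count parentheses
            while i < n:
                if text[i] == "\\":
                    i += 1              # step onto the backslash-quoted char
                elif text[i] == "(":
                    depth += 1
                elif text[i] == ")":
                    depth -= 1
                i += 1
                if depth == 0:
                    break
            if depth != 0:
                raise ValueError("unterminated comment")
        elif ch == '"':
            j = i + 1
            while j < n and text[j] != '"':
                j += 2 if text[j] == "\\" else 1
            if j >= n:
                raise ValueError("unterminated quoted string")
            tokens.append(("quoted", text[i + 1:j]))  # backslashes kept raw
            i = j + 1
        elif ch in SPECIALS:
            tokens.append(("special", ch))
            i += 1
        else:
            j = i
            while j < n and text[j] not in SPECIALS and text[j] not in WHITESPACE:
                j += 1
            tokens.append(("atom", text[i:j]))
            i = j
    return tokens

def parse_atom_list(tokens):
    """Parsing step: comma-separated atoms; empty (null) elements are skipped."""
    items = []
    expect_atom = True
    for kind, value in tokens:
        if kind == "atom" and expect_atom:
            items.append(value)
            expect_atom = False
        elif kind == "special" and value == ",":
            expect_atom = True
        else:
            raise ValueError("unexpected token: %r" % (value,))
    return items
```

With this split, an input such as "foo (a (nested) comment) , bar"
reaches the parser as just atom, comma, atom. A grammar without a
separate lexical step has to name the optional white space and
comments at every point where they may occur.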

And while 2234's ABNF has been used successfully to describe
other protocols, I don't like the result for 822bis.  I agree
that it makes the grammar unnecessarily difficult to read
or to verify for correctness, and there are bound to be a few
bugs left.  (And I'm not looking forward to doing a detailed
review for the IESG.)

However, Dan's mistaken to blame Pete for this.  A large number
of DRUMS participants - probably a rough consensus - argued for
changing the grammar notation and for deprecating 822's
liberal acceptance of comments and white space.  This opinion
held despite objections from some members with more experience
in language design, who warned that it would make the grammar
too complicated and introduce errors.