On Sun, Jun 27, 2010 at 16:07, Andriy Gerasika
<andriy(_dot_)gerasika(_at_)gmail(_dot_)com> wrote:
For a language as rich as RTF, regular expressions are not going to get
you all that far: they are probably only suitable for writing the
lexical analyzer (or tokenizer).
RTF syntax is not complex enough to require a BNF parser.
Assuming the following RTF:
{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard
This is some {\b bold} text.\par
}
it can easily be converted with regular expressions to something like:
<g><rtf>1</rtf><ansi/><g><fonttbl/><f>0</f><fswiss/>Helvetica<sc/></g><f>0</f><pard/>
This is some <g><b/>bold</g> text.<par/>
</g>
where "g" corresponds to RTF's curly braces (a group) and "sc" to the semicolon in RTF.
I am not sure a BNF parser would produce anything better...
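
For what it is worth, the conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the poster's actual code: the names TOKEN, PUNCT and rtf_to_xml are made up here, and the pattern covers only lowercase control words with an optional numeric parameter plus the three punctuation characters that occur in the sample.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical pattern: a control word (\name, optional numeric parameter,
# optional single space delimiter), or one of the characters { } ;
TOKEN = re.compile(r"\\([a-z]+)(-?\d+)? ?|([{};])")
PUNCT = {"{": "<g>", "}": "</g>", ";": "<sc/>"}

def rtf_to_xml(rtf: str) -> str:
    """Rewrite the RTF subset used in the sample as XML-like markup."""
    def repl(m):
        name, param, punct = m.groups()
        if punct:
            return PUNCT[punct]
        if param is not None:
            return f"<{name}>{param}</{name}>"
        return f"<{name}/>"
    # Escape &, < and > in the text first; RTF syntax does not use them.
    return TOKEN.sub(repl, escape(rtf))

SAMPLE = r"""{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard
This is some {\b bold} text.\par
}"""

print(rtf_to_xml(SAMPLE))
```

The sketch makes no attempt to check that groups are balanced, and it does not handle control symbols, hex escapes or binary data.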
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--
This seems about as useful as a regex C compiler that compiles
main() { printf ("Hello world!\n"); }
and _nothing_ else.
Just because you can write a regex for _one instance_ of a grammar
does not mean that you can (easily) use regexes to parse a generic
format. RTF is generic: there are MANY valid ways to say similar
things in RTF.
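
To make the objection concrete, here is a small Python sketch. It assumes a naive substitution in the spirit of the regex approach under discussion; the names TOKEN, PUNCT and naive are hypothetical and are not taken from anyone's code. The same bold run written two valid ways (as a group, or switched off with \b0) comes out as different markup, and an escaped literal brace (\{) is mistaken for the start of a group.

```python
import re

# Hypothetical naive converter: control words plus the characters { } ;
TOKEN = re.compile(r"\\([a-z]+)(-?\d+)? ?|([{};])")
PUNCT = {"{": "<g>", "}": "</g>", ";": "<sc/>"}

def naive(rtf: str) -> str:
    def repl(m):
        name, param, punct = m.groups()
        if punct:
            return PUNCT[punct]
        return f"<{name}>{param}</{name}>" if param else f"<{name}/>"
    return TOKEN.sub(repl, rtf)

# Two valid spellings of the same bold run.
grouped = naive(r"some {\b bold} text")
toggled = naive(r"some \b bold\b0  text")

# A literal brace, which RTF escapes as \{
escaped = naive(r"{a \{ b}")

print(grouped)   # bold scoped by a group
print(toggled)   # bold switched on and off, no group at all
print(escaped)   # stray backslash and an unclosed <g>
```

Anything downstream of this output would have to know that both shapes mean "bold", and would have to cope with unbalanced elements, which is work a real parser would otherwise do.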