Vyacheslav Sedov wrote:
An XSLT-based web crawler is a great idea.
What about a distributed search system? Since eXist includes Lucene, it
could be implemented easily.
Most pages on the WWW are not well-formed XML? Tidy can help. Is there a
possibility of implementing Tidy as a built-in function in XSLT?
Try TagSoup from John Cowan.
It integrates with Saxon as a SAX parser, giving a stylesheet direct XML access to malformed HTML.
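
A minimal sketch of the idea in Java, assuming the TagSoup jar is on the classpath at run time. TagSoup's parser class, org.ccil.cowan.tagsoup.Parser, is a SAX XMLReader, so it can be handed to any JAXP transformer (Saxon included) through a SAXSource. The class name TagSoupDemo and the helper names are made up for illustration, and an identity transform stands in for a real stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class TagSoupDemo {

    // Load TagSoup's SAX parser by name, so this file compiles without the jar.
    // Throws ClassNotFoundException if TagSoup is not on the classpath.
    static XMLReader tagSoupReader() throws Exception {
        return (XMLReader) Class.forName("org.ccil.cowan.tagsoup.Parser")
                .getDeclaredConstructor().newInstance();
    }

    // Parse markup with the given reader and serialize the resulting SAX
    // events back to text using an identity transform. A real crawler would
    // call newTransformer(new StreamSource("extract.xsl")) instead.
    static String toXml(XMLReader reader, String markup) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        SAXSource source = new SAXSource(reader, new InputSource(new StringReader(markup)));
        identity.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Deliberately broken HTML: unclosed <b> and <p> elements.
        System.out.println(toXml(tagSoupReader(), "<p>unclosed <b>bold<p>next"));
    }
}
```

From the command line, Saxon can also be told which parser class to use for the source document with its -x option; check the option syntax for your Saxon version. As far as I know, TagSoup reports elements in the XHTML namespace by default, so match patterns in the stylesheet will usually need that namespace declared.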
regards
--
Dave Pawson
XSLT XSL-FO FAQ.
http://www.dpawson.co.uk