Yes, that's the plan. I've been told there are four variations (so
far) on the format, so test each string against each variation and mark it
up if one matches. Any that drop out the bottom can be done by hand
(by someone else!) or, if that list gets too large, try to infer some
more rules. If they are based on a known format then it
should be simple enough to work backwards.
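
A rough sketch of that match-or-fall-through loop, in Python rather than XSLT just to keep it short. It assumes the strings are bibliographic references. The four real format variations aren't shown in this thread, so the two patterns below, their names, and the classify() function are made up for illustration only:

```python
import re

# Hypothetical patterns: the real format variations are not given in the
# thread, so these two are stand-ins for illustration only.
PATTERNS = [
    # e.g. "Smith, J. (2001). A title. Journal, 12, 34-56."
    ("author-year-parens", re.compile(
        r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\.\s+"
        r"(?P<title>.+?)\.\s+(?P<rest>.+)$")),
    # e.g. "Smith J. A title. Journal 2001;12:34-56."
    ("year-semicolon", re.compile(
        r"^(?P<authors>[^.]+)\.\s+(?P<title>[^.]+)\.\s+"
        r"(?P<journal>.+?)\s+(?P<year>\d{4});(?P<volume>\d+):"
        r"(?P<pages>[\d-]+)\.?$")),
]


def classify(strings):
    """Try each string against each pattern in order.

    Returns (matched, leftovers): matched holds (pattern name, fields)
    pairs for the first pattern that fits; leftovers holds the strings
    that dropped out the bottom and need doing by hand.
    """
    matched, leftovers = [], []
    for s in strings:
        for name, pattern in PATTERNS:
            m = pattern.match(s.strip())
            if m:
                matched.append((name, m.groupdict()))
                break
        else:
            # No pattern matched this string.
            leftovers.append(s)
    return matched, leftovers
```

If the leftovers list grows, that is the cue to add another entry to PATTERNS instead of marking those strings up by hand.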
If you decide to do it yourself, there are some resources at Inera that might
provide some insight:
http://www.inera.com/resources.shtml
especially "Automated Quality Assurance for Heuristic-Based XML Creation
Systems" and "E-Journal Archive DTD Feasibility Study, Section 5.6.
References". They'll give you an idea of some of the subtleties involved, and
why it's probably not as simple as it seems.
Mike Waters
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
--~--