At 12:10 PM 10/20/2006, Mike Waters wrote:
CrossRef, a scholarly linking service, offers a "Simple Text Query" form at:
http://www.crossref.org/freeTextQuery/
which is an implementation of a commercial tool by Inera, a company at
the cutting edge of processing unstructured citations into XML.
Cutting and pasting your example into the form doesn't return a
marked-up version, but it does return a DOI link (the point of the
form). Underneath, though, the tool is doing its magic; they've
been at it for years, so they have worked out just about everything
in this area, including the handling of various citation
styles and the use of "fuzzy" matching.
It might be worth a look if you need to process a large volume of
citations and can't devote time to coming up with your own solution.
Inera's eXtyles is an excellent product. It uses heavy-duty
heuristics and pattern-matching smarts, and it benefits from having
taken a good long look at a wide range of real-world input data and
from having gone through many iterations.
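
To give a feel for what "heuristics and pattern matching" plus "fuzzy" matching mean at the smallest possible scale, here is a toy sketch in Python. It is not how eXtyles or the CrossRef form works (both are proprietary, and I have no knowledge of their internals); the citation pattern, the function names and the 0.85 threshold are all assumptions made up for illustration. It handles exactly one citation style, and that limitation is the point: real input needs many such rules.

```python
import re
from difflib import SequenceMatcher

# Hypothetical pattern for ONE citation style only:
#   Authors (Year). Title. Journal, Volume, Pages.
CITATION_RE = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\.\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>[^,]+),\s*(?P<volume>\d+),\s*"
    r"(?P<pages>\d+-\d+)\.?$"
)


def parse_citation(text):
    """Return a dict of fields, or None if the text does not fit the pattern."""
    m = CITATION_RE.match(text.strip())
    return m.groupdict() if m else None


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(title, known_titles, threshold=0.85):
    """Return the closest known title, or None if nothing clears the threshold."""
    scored = [(similarity(title, k), k) for k in known_titles]
    score, match = max(scored, default=(0.0, None))
    return match if score >= threshold else None


if __name__ == "__main__":
    raw = "Smith, J. (2001). Parsing citatons. Journal of Examples, 12, 34-56."
    fields = parse_citation(raw)
    print(fields)
    # The misspelled title can still be matched against a list of known titles.
    print(best_match(fields["title"], ["Parsing citations", "Unrelated work"]))
```

A single regular expression like this breaks as soon as the author list, punctuation or field order changes, which is why a production tool needs a large rule set tuned against real-world data.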
The fact that a commercial venture can succeed in the marketplace
with a proprietary toolkit to do citation upconversion (and other
editorial cleanup on the way to upconversion) is an indication of how
difficult the problem is, I think.
Cheers,
Wendell