Surely no one writes that stuff by hand? Didn't it always
start out life
marked up in a citation database like BibTeX or EndNote or something?
So if you can get hold of the original source, life is much easier...
You would've thought so...
I'll ask around, but I think the authors cut and pasted the citations
from other locations, and they were subsequently stored in a single
<div> or <p> - the creator of the CMS was very short cited ;-)
If you think it's not really feasible to parse a plain text citation
into a marked-up version then that's good feedback - it could well be
that a percentage need to be done by hand.
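To make the "parse what you can, do the rest by hand" idea concrete, here is a
minimal sketch in Python. It assumes one hypothetical citation style
(Authors (Year). Title. Journal, Volume(Issue), First-Last.); the pattern, the
element names and the function name citation_to_xml are all made up for
illustration and are not taken from the CMS in question. Anything that does
not fit the pattern comes back as None, which is the pile to fix by hand.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern for ONE citation style only:
#   Authors (Year). Title. Journal, Volume(Issue), First-Last.
CITATION = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\.\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>.+?),\s+"
    r"(?P<volume>\d+)(?:\((?P<issue>\d+)\))?,\s+"
    r"(?P<first>\d+)-(?P<last>\d+)\.?$"
)


def citation_to_xml(text):
    """Return the citation as an XML string, or None if it does not match."""
    match = CITATION.match(text.strip())
    if match is None:
        return None  # leave this one for manual markup
    cite = ET.Element("citation")
    for name, value in match.groupdict().items():
        if value is not None:  # the issue number is optional
            ET.SubElement(cite, name).text = value
    return ET.tostring(cite, encoding="unicode")


# Made-up sample input, not a real reference.
sample = "Smith, J. (2005). Parsing citations. Journal of Examples, 12(3), 45-67."
print(citation_to_xml(sample))
print(citation_to_xml("some free text that is not a citation"))
```

In practice you would probably need one pattern per citation style, and
counting the None results gives a rough idea of how much manual work is left.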
CrossRef, a scholarly linking service, offers a "Simple Text Query" form at:
http://www.crossref.org/freeTextQuery/
which is an implementation of a commercial tool by Inera, a company at the cutting
edge of processing unstructured citations into XML.
Cutting and pasting your example into the form doesn't return a marked-up
version, but it does return a DOI link (the point of the form). Under the hood,
though, the tool is doing its magic; they've been at it for years, so they have
worked out just about everything in this area, including the handling of
various citation styles and the use of "fuzzy" matching.
It might be worth a look if your need is for a large volume of processing, and
you can't devote time to coming up with your own solution.
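
For anyone who does roll their own, "fuzzy" matching can be approximated with
Python's standard difflib module. This is only a sketch of the general idea and
says nothing about how the Inera or CrossRef tools actually work; the function
names, the sample titles and the 0.85 threshold are assumptions chosen for
illustration.

```python
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def best_match(candidate, known_titles, threshold=0.85):
    """Return the known title most similar to candidate, or None if too far off."""
    best_title, best_score = None, 0.0
    for title in known_titles:
        score = SequenceMatcher(None, normalise(candidate), normalise(title)).ratio()
        if score > best_score:
            best_title, best_score = title, score
    return best_title if best_score >= threshold else None


# Made-up titles standing in for a list of known references.
known = ["Parsing unstructured citations", "A history of markup"]
print(best_match("Parsing unstructred citatons", known))
print(best_match("Completely different text", known))
```

The threshold is the knob to tune: set it too low and you get wrong matches,
too high and small typos send a citation back to the by-hand pile.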
Mike Waters
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
--~--