[xsl] Matching string values across element boundaries

2013-04-08 13:13:47
I expect this has been discussed here before, but I can't locate any relevant 
discussion, so here goes.

We have input data with many unmarked short-title citations that look like this:

   Sprague, <hi rend="italic">Braintree Families</hi>

We want to wrap them inside another element, in our case a <ref> to the 
bibliographic expansion. We have a venerable chain of XSLT 2.0 transforms that 
does this, and pretty well, by preprocessing the data to convert all those <hi> 
tags into a pair of otherwise unused characters, so that we can do string-matching 
operations within a single text node that now includes something like

   Sprague, ¢Braintree Families¥

which is easy to handle with xsl:analyze-string. Then, once we've wrapped all 
the strings we need to, we post-process with xsl:analyze-string to put the <hi> 
elements back in.
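
A minimal sketch of that three-pass round trip might look like the following. 
It is an illustration only, not the actual transform chain: the mode names 
(flatten, wrap, restore), the citation regex, the bare <ref> wrapper, and the 
assumption that the elements are in no namespace are all hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Chain the three passes, holding intermediate results in temporary trees. -->
  <xsl:template match="/">
    <xsl:variable name="flat">
      <xsl:apply-templates mode="flatten"/>
    </xsl:variable>
    <xsl:variable name="wrapped">
      <xsl:apply-templates select="$flat" mode="wrap"/>
    </xsl:variable>
    <xsl:apply-templates select="$wrapped" mode="restore"/>
  </xsl:template>

  <!-- Identity rule shared by all three modes. -->
  <xsl:template match="@*|node()" mode="flatten wrap restore">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 1: replace <hi rend="italic"> tags with placeholder characters.
       Adjacent text nodes merge when the temporary tree is built. -->
  <xsl:template match="hi[@rend='italic']" mode="flatten">
    <xsl:text>¢</xsl:text>
    <xsl:apply-templates mode="#current"/>
    <xsl:text>¥</xsl:text>
  </xsl:template>

  <!-- Pass 2: wrap citation-shaped strings. The regex is a hypothetical
       stand-in ("Surname, ¢Title¥"); real citations need richer patterns.
       priority="1" avoids a conflict with the identity rule. -->
  <xsl:template match="text()" mode="wrap" priority="1">
    <xsl:analyze-string select="." regex="[A-Z][a-z]+, ¢[^¥]+¥">
      <xsl:matching-substring>
        <ref><xsl:value-of select="."/></ref>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

  <!-- Pass 3: turn each placeholder pair back into an <hi> element. -->
  <xsl:template match="text()" mode="restore" priority="1">
    <xsl:analyze-string select="." regex="¢([^¥]*)¥">
      <xsl:matching-substring>
        <hi rend="italic"><xsl:value-of select="regex-group(1)"/></hi>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Run with an XSLT 2.0 processor, an input such as 
<p>See Sprague, <hi rend="italic">Braintree Families</hi>, 12.</p> should come 
out as <p>See <ref>Sprague, <hi rend="italic">Braintree Families</hi></ref>, 12.</p>. 
The sketch only works if the placeholder characters never occur in the source 
data, and any other inline element inside a citation still splits the text node 
and defeats the match.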

In practice, given the proper regexes, this works quite well and provides the 
desired output, but I always feel a bit guilty about the hackishness of the 
approach. Given that the citations are quite variable in structure (usually but 
not always containing <hi> elements, with various combinations of text nodes at 
start and end), I've never come up with a good general-purpose way to operate 
purely on elements and text nodes without the convert-tags-to-characters step. 
Is there one (or more)?

David S.

-- 
David Sewell, Editorial and Technical Manager
ROTUNDA, The University of Virginia Press
PO Box 400314, Charlottesville, VA 22904-4314 USA
Email: dsewell(_at_)virginia(_dot_)edu   Tel: +1 434 924 9973
Web: http://rotunda.upress.virginia.edu/


