Re: [xsl] text replacement with mixed content

This isn't a trivial task, so you may or may not get someone to give
you a working solution for free...

One way to tackle this is to:

- tokenize the search string into individual words

- mark up those individual words in the document

- identify sequences of that markup

- replace the sequences with the replacement markup
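
As a rough illustration of those four steps, here is a minimal Python sketch. It assumes the document has already been reduced to a plain list of words in document order, so element boundaries are ignored entirely; a real stylesheet would wrap each word in an element instead. All function names here are hypothetical and are not taken from anyone's posted solution.

```python
def mark_words(doc_words, phrase_words):
    """Step 2: flag each document word that is one of the search words."""
    wanted = set(phrase_words)
    return [word in wanted for word in doc_words]


def find_runs(doc_words, phrase_words, marks):
    """Step 3: start positions where marked words spell the whole phrase."""
    n = len(phrase_words)
    starts = []
    i = 0
    while i + n <= len(doc_words):
        if marks[i] and doc_words[i:i + n] == phrase_words:
            starts.append(i)
            i += n  # skip past the match so runs do not overlap
        else:
            i += 1
    return starts


def replace_phrase(doc_words, phrase, replacement):
    """Steps 1 to 4 combined; returns a new list of words."""
    phrase_words = phrase.split()  # step 1: tokenize the search string
    if not phrase_words:
        return list(doc_words)
    marks = mark_words(doc_words, phrase_words)
    starts = set(find_runs(doc_words, phrase_words, marks))
    out = []
    i = 0
    while i < len(doc_words):
        if i in starts:
            out.append(replacement)  # step 4: swap in the replacement
            i += len(phrase_words)
        else:
            out.append(doc_words[i])
            i += 1
    return out
```

For example, replace_phrase("the quick brown fox".split(), "quick brown", "[X]") gives ['the', '[X]', 'fox']. The marking pass is redundant for a plain list, but it mirrors the intermediate markup that the stylesheet approach relies on.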

Yes, it's definitely challenging. Reading the problem and Andrew's solution makes me realise that this is an example of the class of problems which Michael Jackson (of Jackson Structured Programming fame) calls "boundary clash" problems. In the markup field these tend to be described as "overlap" problems. You have two hierarchies in the document - the element hierarchy and the sentence/word/character hierarchy, and they overlap in the sense that the boundaries in one hierarchy don't coincide with those in the other. The technique, at a very high level of abstraction, is to rearrange the document into the hierarchy that you want to process, while retaining sufficient information to reconstitute the other hierarchy when you are done. This retained information can either be inline (perhaps in the form of "milestone" tags), or out-of-line (an index of pointers into the text).
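
To make the out-of-line variant concrete, here is a small Python sketch using the standard xml.etree.ElementTree module. It flattens mixed content into one string, keeps an index of which text node each character range came from, and uses that index to map a match back onto the original text nodes. The function names flatten and locate are made up for this illustration, and the sketch stops at locating the pieces; rebuilding the tree with the replacement in place is left out.

```python
import xml.etree.ElementTree as ET


def flatten(root):
    """Return (flat_text, index).

    index lists (start, end, element, slot) for every non-empty text node,
    where slot is "text" (before the first child) or "tail" (after the element).
    """
    parts, index = [], []
    pos = 0

    def add(s, elem, slot):
        nonlocal pos
        if s:
            index.append((pos, pos + len(s), elem, slot))
            parts.append(s)
            pos += len(s)

    def walk(elem):
        add(elem.text, elem, "text")
        for child in elem:
            walk(child)
            add(child.tail, child, "tail")

    walk(root)
    return "".join(parts), index


def locate(root, phrase):
    """Map the first occurrence of phrase back to pieces of the text nodes."""
    flat, index = flatten(root)
    hit = flat.find(phrase)
    if hit < 0:
        return []
    end = hit + len(phrase)
    pieces = []
    for start, stop, elem, slot in index:
        lo, hi = max(hit, start), min(end, stop)
        if lo < hi:  # this text node overlaps the match
            pieces.append((elem.tag, slot, flat[lo:hi]))
    return pieces
```

With the input <p>The <b>quick brown</b> fox jumps</p>, searching for "brown fox" yields two pieces: "brown" from the text of the b element and " fox" from its tail. That is exactly the boundary clash: the phrase is contiguous in the text hierarchy but split across the element hierarchy.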


Michael Kay
Saxonica

