xsl-list

Re: [xsl] Find/replace algorithm

2021-03-24 18:47:57
My instinct would be to:

(a) build a map containing the replacements

(b) for each text node, tokenize the content, then scan the tokens, looking 
each one up in the map.

The big advantage of this approach is that the cost of handling each token 
stays roughly constant regardless of how many substitutions there are, whereas 
most other approaches have a cost that increases linearly with the number of 
substitutions.
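
Steps (a) and (b) could be sketched in XSLT 3.0 roughly as follows. This is an 
illustrative sketch rather than code from this thread: the variable name 
"replacements", the inline map, and the choice of \w+ as the definition of a 
token are all assumptions. The map keys are plain words rather than regular 
expressions, because a map lookup needs an exact key.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Identity transform for every node not matched by a template below -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- (a) The replacement map (name and inline form are assumptions).
       Keys are plain words taken from the example in the question. -->
  <xsl:variable name="replacements" as="map(xs:string, xs:string)"
      select="map {
        'Wid': 'Widget',
        'Assbly': 'Assembly',
        'Eng': 'Engine'
      }"/>

  <!-- (b) Split each text node into word runs and non-word runs,
       and look each word up in the map. -->
  <xsl:template match="text()">
    <xsl:analyze-string select="." regex="\w+">
      <xsl:matching-substring>
        <!-- One map lookup per token; words not in the map pass through -->
        <xsl:value-of select="($replacements(.), .)[1]"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Whitespace and punctuation are copied unchanged -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample input, this should turn "ACME Wid Assbly" into "ACME 
Widget Assembly" and "Ford Eng Rebuild Kit" into "Ford Engine Rebuild Kit". 
Tokenizing on \w+ also gives whole-word matching without needing \b, which 
the standard XPath regular expression syntax does not define.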

Michael Kay
Saxonica

On 24 Mar 2021, at 20:28, rick(_at_)rickquatro(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Hello All,
 
I have a fairly large XML file similar to this:
 
<?xml version="1.0" encoding="UTF-8"?>
<products>
    <product>ACME Wid Assbly</product>
    <product>Ford Eng Rebuild Kit</product>
</products>
 
I want to do an identity transform except that I want to do some find and 
replace on some of the words. For example:
 
Wid = Widget
Assbly = Assembly
Eng = Engine
 
I am thinking of creating a lookup XML file to drive the find/replace actions:
 
<?xml version="1.0" encoding="UTF-8"?>
<lookup>
    <entry find="\bWid\b" replace="Widget"/>
    <entry find="\bAssbly\b" replace="Assembly"/>
    <entry find="\bEng\b" replace="Engine"/>
</lookup>
 
I am having trouble figuring out a good XSLT 2 or 3 algorithm for actually 
doing the replacements. Any suggestions or pointers would be appreciated. 
Thank you very much.
 
Rick
 
Rick Quatro
Carmen Publishing Inc.
585-729-6746
rick(_at_)frameexpert(_dot_)com
http://www.frameexpert.com/store
 
 
XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
EasyUnsubscribe <http://lists.mulberrytech.com/unsub/xsl-list/293509>