I'm sure this is easy to do in XSLT 2.0, but I haven't got my head
wrapped around how to compare things properly and efficiently.
Let's say I have a wordlist, automatically generated from another
file, that records every instance of how each word was used. In many
cases these instances are identical in spelling, and what I want to do
is merge them and store links between the original file and the
wordlist using stand-off markup.
Say the wordlist has one entry per word, like this:
=====
<entry xml:id="let22-w27">
<form>
<orth type="hw">the</orth>
<form type="orthVar">
<orth xml:id="w72">The</orth>
<orth xml:id="w3955">The</orth>
<orth xml:id="w4513">The</orth>
<orth xml:id="w4578">The</orth>
<orth xml:id="w4650">The</orth>
<orth xml:id="w4672">The</orth>
<orth xml:id="w4703">The</orth>
<orth xml:id="w4824">The</orth>
<orth xml:id="w4830">The</orth>
<orth xml:id="w2045">the</orth>
<orth xml:id="w2079">the</orth>
<orth xml:id="w2101">the</orth>
<orth xml:id="w2112">the</orth>
<orth xml:id="w2333">the</orth>
<orth xml:id="w2400">the</orth>
<orth xml:id="w2442">the</orth>
<orth xml:id="w1402">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2422">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w6458">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w7822">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2097">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2155">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2482">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w5887">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w5642">T<ex>h</ex>e</orth>
<orth xml:id="w5378">t<ex>h</ex>e</orth>
</form>
</form>
</entry>
=====
What I want to end up with, for each form[@type='orthVar'], is only
the distinct orth elements therein (compared including their nested
markup), each with a new @xml:id value. The old IDs should be preserved
at the bottom of the file, in links connecting each new value to the
current ones (which are copies from a different file).
So something like:
=====
<div>
<entry xml:id="let22-w27">
<form>
<orth type="hw">the</orth>
<form type="orthVar" n="6"> <!-- n= num of diff variants-->
<orth xml:id="let22-w27-vA">The</orth>
<orth xml:id="let22-w27-vB">the</orth>
<orth xml:id="let22-w27-vC">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="let22-w27-vD">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="let22-w27-vE">T<ex>h</ex>e</orth>
<orth xml:id="let22-w27-vF">t<ex>h</ex>e</orth>
</form>
</form>
</entry>
<!-- more entries -->
<!-- at bottom of file -->
<div type="links">
<linkGrp xml:id="let22-w27-lg">
<!-- links between the orth form above with its instance in file.xml -->
<link targets="#let22-w27-vA file.xml#w72 file.xml#w3955
file.xml#w4513 file.xml#w4578 file.xml#w4650 file.xml#w4672
file.xml#w4703 file.xml#w4824 file.xml#w4830"/>
<link targets="#let22-w27-vB file.xml#w2045 file.xml#w2079
file.xml#w2101 file.xml#w2112 file.xml#w2333 file.xml#w2400
file.xml#w2442"/>
<link targets="#let22-w27-vC file.xml#w1402 file.xml#w2422
file.xml#w6458 file.xml#w7822"/>
<link targets="#let22-w27-vD file.xml#w2097 file.xml#w2155
file.xml#w2482 file.xml#w5887"/>
<link targets="#let22-w27-vE file.xml#w5642"/>
<link targets="#let22-w27-vF file.xml#w5378"/>
</linkGrp>
<!-- more linkGrps -->
</div>
</div>
=====
XSLT 2.0 is certainly usable in this case, but all of my attempts have
either been hideously inefficient or failed to compare the nested
children accurately.
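For what it's worth, the rough shape I imagine is below: an untested
sketch, not a known-good solution. It groups the orth elements with
xsl:for-each-group on a string "signature" of their content, so that
nested markup counts in the comparison and no pairwise deep-equal()
is needed. The names here are my own inventions for illustration: the
f:sig function, its urn:x-local:functions namespace, and the source
parameter. It also assumes that the elements are in no namespace, that
each entry has a single form[@type='orthVar'], and that attribute order
is stable within one run (which the spec does not guarantee).

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:x-local:functions"
    exclude-result-prefixes="xs f">

  <!-- Assumed name of the file the old IDs point into. -->
  <xsl:param name="source" as="xs:string" select="'file.xml'"/>

  <!-- Hypothetical helper: flatten nodes into a string that records
       element names, attributes and text, so that T<ex>h</ex>e and
       T<ex>h</ex><hi rend="sup">e</hi> get different keys. -->
  <xsl:function name="f:sig" as="xs:string">
    <xsl:param name="nodes" as="node()*"/>
    <xsl:sequence select="string-join(
        for $n in $nodes return
          if ($n instance of element())
          then concat('[', name($n),
                 string-join(for $a in $n/@*
                             return concat(' ', name($a), '=', $a), ''),
                 ']', f:sig($n/node()), '[/]')
          else if ($n instance of text()) then string($n)
          else '',
        '')"/>
  </xsl:function>

  <!-- Entries first, then one linkGrp per entry at the bottom. -->
  <xsl:template match="/">
    <div>
      <xsl:apply-templates select="//entry"/>
      <div type="links">
        <xsl:apply-templates select="//entry" mode="links"/>
      </div>
    </div>
  </xsl:template>

  <!-- Identity copy for everything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Keep one orth per distinct signature, with a new xml:id. -->
  <xsl:template match="form[@type='orthVar']">
    <xsl:variable name="entryId" select="ancestor::entry[1]/@xml:id"/>
    <form type="orthVar"
          n="{count(distinct-values(orth/f:sig(node())))}">
      <xsl:for-each-group select="orth" group-by="f:sig(node())">
        <orth>
          <xsl:attribute name="xml:id">
            <xsl:value-of select="$entryId"/>
            <xsl:text>-v</xsl:text>
            <xsl:number value="position()" format="A"/>
          </xsl:attribute>
          <xsl:copy-of select="node()"/>
        </orth>
      </xsl:for-each-group>
    </form>
  </xsl:template>

  <!-- Same grouping again, so the letters line up with those above. -->
  <xsl:template match="entry" mode="links">
    <xsl:variable name="entryId" select="@xml:id"/>
    <linkGrp xml:id="{$entryId}-lg">
      <xsl:for-each-group select="form/form[@type='orthVar']/orth"
                          group-by="f:sig(node())">
        <xsl:variable name="letter">
          <xsl:number value="position()" format="A"/>
        </xsl:variable>
        <link targets="{string-join(
            (concat('#', $entryId, '-v', $letter),
             for $o in current-group()
             return concat($source, '#', $o/@xml:id)), ' ')}"/>
      </xsl:for-each-group>
    </linkGrp>
  </xsl:template>

</xsl:stylesheet>
```

With this signature, T<ex>h</ex><hi rend="sup">e</hi> would become
"T[ex]h[/][hi rend=sup]e[/]" and T<ex>h</ex>e would become
"T[ex]h[/]e", so the two no longer collapse into the same string value
"The". Since group-by returns groups in order of first appearance, the
vA, vB, ... suffixes should follow the order in which each variant
first occurs in the entry.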
Suggestions?
Thanks,
-James
--
James Cummings, Cummings dot James at GMail dot com
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list