xsl-list

Re: [xsl] Expensive XSLT2 - suggestions for improving?

2008-10-16 15:34:33
Wendell,

Thanks for your pointers and for reminding me that key() has a third argument. It does not help here, though, because the structure is completely flat: the parent of every <value> is the root element.

And combining key('oid-by-value',.)[1] with "except" is a nice piece of Boolean logic :-)

Thanks,

- Michael

On 16.10.2008 at 18:34, Wendell Piez wrote:

Michael,

This is an interesting problem, and you may want to try a few things.

Part of what makes it interesting is the question of how widely you wish to scope your examination for similar values. In XSLT 2.0, key() accepts a third argument, a node, that restricts retrieval to the subtree rooted at that node.

You could try something like this:

<xsl:key name="oid-by-value" match="@oid" use="string(..)"/>
<!-- retrieves an @oid attribute using the string value of its parent element -->

and then

<xsl:template match="value">
 <xsl:copy>
   <xsl:apply-templates select="@*"/>
   <xsl:for-each select="key('oid-by-value',.)[1] except @oid">
     <!-- traverse to the @oid of the first element with the
          same value, unless this element's own @oid is it -->
     <xsl:attribute name="refoid" select="string()"/>
   </xsl:for-each>
   <xsl:if test="key('oid-by-value',.)[1] is @oid">
     <!-- first occurrence: keep content; duplicates skip it -->
     <xsl:apply-templates/>
   </xsl:if>
 </xsl:copy>
</xsl:template>
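
For anyone who wants something to run as-is, here is a minimal sketch that combines the key with the pass-through template from the original post. It assumes an XSLT 2.0 processor and that every <value> carries an @oid; the variable name first-oid is only illustrative. Because the key returns @oid attributes, the test compares attribute against attribute (with "is") to decide whether the current element is the first occurrence.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- index every @oid by the string value of its parent element -->
  <xsl:key name="oid-by-value" match="@oid" use="string(..)"/>

  <!-- Condenser: keep the first occurrence, empty the duplicates -->
  <xsl:template match="value">
    <!-- @oid of the first element (document order) with the same text;
         'first-oid' is an illustrative name, not from the thread -->
    <xsl:variable name="first-oid"
                  select="key('oid-by-value', string(.))[1]"/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:choose>
        <xsl:when test="$first-oid is @oid">
          <!-- first occurrence: content stays -->
          <xsl:apply-templates select="node()"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- duplicate: record the source @oid, skip content -->
          <xsl:attribute name="refoid" select="$first-oid"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:copy>
  </xsl:template>

  <!-- pass-through all nodes and attributes -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Run against the three-element <Doc> from the original post, this should leave the first two <value> elements untouched and turn the third into an empty element pointing at oid 40068. The key is built once per document, so each lookup no longer has to walk the preceding axis.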

If you wanted to scope only within the parent element, you could use key('oid-by-value',.,..)[1] -- the '..' as the third argument restricts retrieval to the subtree rooted at the parent.

Note: untested. (But if it won't work, surely some sharp-eyed XSLTer will notice.)
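
The scoped lookup could be sketched as follows. This is only an illustration: it assumes a hypothetical input in which the <value> elements are grouped under wrapper elements (say, <Section>), which is not the flat structure of the original post, and the variable name first-oid is made up. With '..' as the third argument, a value is treated as a duplicate only if the same text occurs earlier under the same parent.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- index every @oid by the string value of its parent element -->
  <xsl:key name="oid-by-value" match="@oid" use="string(..)"/>

  <xsl:template match="value">
    <!-- third argument '..': search only the subtree rooted at
         this element's parent (e.g. a hypothetical <Section>) -->
    <xsl:variable name="first-oid"
                  select="key('oid-by-value', string(.), ..)[1]"/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- duplicate within this parent: add the pointer -->
      <xsl:for-each select="$first-oid except @oid">
        <xsl:attribute name="refoid" select="string()"/>
      </xsl:for-each>
      <!-- first occurrence within this parent: keep the content -->
      <xsl:if test="$first-oid is @oid">
        <xsl:apply-templates select="node()"/>
      </xsl:if>
    </xsl:copy>
  </xsl:template>

  <!-- pass-through all nodes and attributes -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With a flat document, where the parent of every <value> is the root element, the scoped and unscoped lookups return the same nodes, so the third argument changes nothing there.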

Cheers,
Wendell


At 11:54 AM 10/16/2008, you wrote:
Hello experts,

The task is to remove duplicate text content before moving an XML file
into translation. After the translation, the former duplicate content
should be recreated.

Assume this input XML (I dropped a lot of attributes):

<Doc>
<value oid="40068">Lasttrennschalter</value>
<value oid="40069">Umbau von N12 auf N4</value>
<value oid="4006a">Lasttrennschalter</value>
</Doc>

The third <value> should be empty because its content is identical to
the first, but we need a pointer to that first element to be able to
recreate the content after translation. Also, all original attributes
must stay unchanged. Therefore in each duplicate I insert an extra
attribute @refoid with the @oid of the source element. So I get this:

<Doc>
<value oid="40068">Lasttrennschalter</value>
<value oid="40069">Umbau von N12 auf N4</value>
<value oid="4006a" refoid="40068"/>
</Doc>

My XSL is very simple and works as intended, but it does not scale
very well, I guess because I look at preceding::value so many times:

<!-- Condenser: modify all duplicates -->
<xsl:template match="value[.=preceding::value]">
 <xsl:copy>
   <xsl:apply-templates select="@*"/>
   <xsl:attribute name="refoid"
     select="preceding::value[.=current()][last()]/@oid"/>
   <!-- skip content -->
 </xsl:copy>
</xsl:template>

<!-- pass-through all nodes and attributes -->
<xsl:template match="@*|node()">
 <xsl:copy>
   <xsl:apply-templates select="@*|node()"/>
 </xsl:copy>
</xsl:template>

I guess a cleverly constructed key could help a lot... any pointers are
very welcome!
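
For the second half of the task, recreating the duplicate content after translation, a key in the other direction might do. The following is a sketch only, under the assumptions that @oid values are unique, that translation leaves @oid and @refoid untouched, and that the helper attribute @refoid should be dropped again; the key name value-by-oid is made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- index every <value> by its @oid ('value-by-oid' is a
       hypothetical name chosen for this sketch) -->
  <xsl:key name="value-by-oid" match="value" use="@oid"/>

  <!-- Expander: refill each condensed duplicate -->
  <xsl:template match="value[@refoid]">
    <xsl:copy>
      <!-- keep the original attributes, drop the helper @refoid -->
      <xsl:apply-templates select="@* except @refoid"/>
      <!-- copy the (translated) content of the element pointed to -->
      <xsl:apply-templates
          select="key('value-by-oid', @refoid)[1]/node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- pass-through all nodes and attributes -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Since each @refoid points at the first occurrence, which always keeps its content, a single lookup per duplicate is enough; no chains of references have to be followed.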


--
_______________________________________________________________
Michael Müller-Hillebrand: Dokumentations-Technologie
Adobe Certified Expert, FrameMaker
Lösungen und Training, FrameScript, XML/XSL, Unicode
<http://cap-studio.de/> -- Tel. +49 (9131) 28747




