xsl-list

RE: Processing Efficiently

2005-06-10 06:25:02
I haven't looked at this in detail, but I think you can almost certainly
solve your performance problems using keys. Look for constructs like
//thing[property=value] and replace them with calls on the key() function.
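
As a rough sketch of that substitution: the names thing, property, the
key name thing-by-property, and the $value parameter are placeholders
for illustration, not taken from the stylesheet under discussion.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical lookup value; a caller could supply it instead. -->
  <xsl:param name="value" select="'abc'"/>

  <!-- Index every <thing> by the string value of its <property> child. -->
  <xsl:key name="thing-by-property" match="thing" use="property"/>

  <xsl:template match="/">
    <found>
      <!-- Unkeyed form: //thing[property = $value] scans the whole tree
           each time it is evaluated. -->
      <!-- Keyed form: most processors answer this from an index that is
           built once per document. -->
      <xsl:copy-of select="key('thing-by-property', $value)"/>
    </found>
  </xsl:template>

</xsl:stylesheet>
```

The gain comes from evaluating the lookup many times against the same
document; for a single lookup the cost of building the index may not
pay off.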

Michael Kay
http://www.saxonica.com/ 

-----Original Message-----
From: Karl Stubsjoen [mailto:kstubs(_at_)gmail(_dot_)com] 
Sent: 08 June 2005 20:34
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] Processing Efficiently

I already had to reduce the size of the XML quite a bit by sheer
element renaming and elimination of unused elements.  $s used to be 25MB,
but by eliminating unused elements (really needed 2) and by renaming
"xlsRow" to "R" and "xlsColumn" to "C" and by renaming the attribute
"column" to "c" I was able to reduce the size by 1/3.

The thing is this:  $s is my master doc, which contains the lookup records.
I have many individual docs that will be compared against $s, and these
files range in size from 20KB to 5MB (approx.).  I don't mind a
different approach (for example reducing $s source).  I'm just curious
how others would approach something like this.  How would you arrange
such documentation for this sort of processing?

The scenario is:
Large data file for lookups / validation (10 to 20MB)
Individual data files (up to 5MB) 
As individual data files refresh, identify those items that exist in
the master list.  Again, this is a topic of "Performance" and "Best
Practice" for performing frequent validations of documents this size.



On 6/8/05, tomas(_dot_)vanek(_at_)accenture(_dot_)com 
<tomas(_dot_)vanek(_at_)accenture(_dot_)com> wrote:
Using keys could help to speed up the transformation (here is just the
idea):

...
       <!-- index each summary row R by its invoice number -->
       <xsl:key name="summaryInvoice"
match="R" use="C[@c='I']"/>

...
       <xsl:template match="xlsRow">
               <xsl:variable name="current_invoice"
select="xlsColumn[@column='Invoice_#']"/>
               <xsl:variable name="current_balance"
select="key('summaryInvoice', $current_invoice)/C[@c='B']"/>
               <xsl:variable name="diff_balance"
select="$current_balance - xlsColumn[@column='Balance']"/>
...
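
One caveat with this idea: in XSLT 1.0, key() only searches the document
that contains the context node, so a lookup against the summary file
needs a change of context first. A minimal sketch of one way to do that,
assuming the summary lives in summary.xml and has the xls/R/C layout
shown in the original post (the variable name $summary is made up here,
and only the current_balance attribute is emitted):

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index each summary row <R> by its invoice number (the C with c="I"). -->
  <xsl:key name="summaryInvoice" match="R" use="C[@c='I']"/>

  <!-- Assumed file name; loaded once and reused for every lookup. -->
  <xsl:variable name="summary" select="document('summary.xml')"/>

  <xsl:template match="xlsRow">
    <xsl:variable name="current_invoice"
        select="string(xlsColumn[@column='Invoice_#'])"/>
    <xsl:variable name="current_balance">
      <!-- for-each over the single root node does no real iteration; it
           only switches the context so key() searches summary.xml. -->
      <xsl:for-each select="$summary">
        <xsl:value-of
            select="key('summaryInvoice', $current_invoice)/C[@c='B']"/>
      </xsl:for-each>
    </xsl:variable>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:attribute name="current_balance">
        <xsl:value-of select="$current_balance"/>
      </xsl:attribute>
      <xsl:copy-of select="xlsColumn"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

If an invoice has no row in the summary, $current_balance comes out as
an empty string, which a caller could test to skip rows that are not in
the master list.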

tomi


-----Original Message-----
From: Karl Stubsjoen [mailto:kstubs(_at_)gmail(_dot_)com]
Sent: Wednesday, June 08, 2005 10:08 AM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Processing Efficiently

Hello,
I would like to optimize the following:

Where $s is a 5MB document and the source document is approx. 2-5MB.
The goal:  copy everything in the source that exists in $s.
Catch:  need to know the value of the balance in $s.

$s looks like:
<xls>
<R row="2">
 <C c="I">2AA9379</C><!-- match value "invoice" -->
 <C c="B">-127.5</C><!-- this is the balance -->
</R>
...
</xls>

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" encoding="utf-8"/>

<xsl:variable name="s"
select="document('summarydata/summaryreduced.xml')//xls/R"/>

<xsl:template match="/">
<result>
<xsl:apply-templates
select="xls/xlsRow[xlsColumn[@column='Invoice_#']=$s/C[@c='I'] |
xlsColumn[@column='Balance'][not(.= $s/C[@c='B'])]]"/>
</result>
</xsl:template>

<xsl:template match="xlsRow">
<xsl:variable name="current_invoice"
select="xlsColumn[@column='Invoice_#']"/>
<xsl:variable name="current_balance"
select="$s[C[@c='I']=$current_invoice]/C[@c='B']"/>
<xsl:variable name="diff_balance"
select="$current_balance - xlsColumn[@column='Balance']"/>
<xsl:copy>
 <xsl:apply-templates select="@*"/>
 <xsl:attribute name="current_balance"><xsl:value-of
select="$current_balance"/></xsl:attribute>
 <xsl:attribute name="diff_balance"><xsl:value-of
select="$diff_balance"/></xsl:attribute>
 <xsl:apply-templates select="xlsColumn"/>
</xsl:copy>
</xsl:template>

<xsl:template match="@*">
<xsl:copy>
 <xsl:apply-templates select="@*"/>
</xsl:copy>
</xsl:template>

<xsl:template match="xlsColumn">
<xsl:copy-of select="."/>
</xsl:template>

</xsl:stylesheet>


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


