At 2007-10-25 07:03 +0100, Arthur Maloney wrote:
> I'm trying to detect which row elements contain corrupt data.
> Once I've detected that a row is corrupt I'm OK printing it,
> depending on user choice. The XML file contains 500-50,000 row elements.
> In the XML file each row element contains between 2 and 15 agent elements
> (there is always more than 1). The agent name should be the same in
> all agent elements.
XSLT allows for easy comparison of the members of node sets, so in
this case, one can just check the set against itself. A node-set
comparison is initialized to false, and the processor checks the
possible combinations of operands until either one comparison is
true or the combinations are exhausted (which can be slow for large
data sets).
But there is no way to instruct the processor to do the comparisons
in any particular order ... so when both operands are node sets, one
cannot ensure that the minimum number of checks is done. For example,
the following works:
<xsl:if test="agent != agent">corrupt</xsl:if>
... but there is a possibility the processor will happen to walk
through the combinations of operands in an order that takes a very
long time to produce a result. What if the processor first chose to
compare each member of the first operand against the corresponding
member of the second operand? The first pass through the entire set
would be false for every comparison, since each node would be compared
with itself. While that may not be likely, the stylesheet writer has
no control over it.
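A related point that is easy to trip over: the test has to be written
with "!=" and not as a negated "=". As a sketch of the difference,
assuming a row element is the context node:

  <!-- true when at least one pair of agent values differs -->
  <xsl:if test="agent != agent">corrupt</xsl:if>

  <!-- not the same test: "agent = agent" is true as soon as any agent
       is compared with itself, so the negation is false for every row
       that has at least one agent child -->
  <xsl:if test="not(agent = agent)">no agent children</xsl:if>

So the "!=" form shown first remains the one that detects differing names.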
The equivalent result can be obtained with:
<xsl:if test="agent[1] != agent[ position()>1 ]">corrupt</xsl:if>
... where I am comparing one node against a set of nodes, and I think
this second way has no chance of a combination of operands "taking a
long time" to come up with a result.
So I used the second approach in the answer below, rather than the
obvious first answer above.
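If the differing names themselves are wanted and not just a flag, the
same one-against-many idea can be turned around. This is only a sketch,
assuming a row element is the context node and that "corrupt" means
"differs from the first agent in the row"; the space-separated output
layout is my own assumption, not something from the question:

  <!-- each agent whose value differs from the first agent in its row -->
  <xsl:for-each select="agent[ . != ../agent[1] ]">
    <xsl:text> </xsl:text>
    <xsl:value-of select="."/>
  </xsl:for-each>

Here each comparison is one node against one node, so again there is no
combination of operands for the processor to get lost in. Note that a
stray name that occurs more than once is listed once per occurrence.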
> Example of Xml file
> 1. applicantNumber is always unique for each row element
> 2. row 1 is not corrupt. All agent names are the same
> 3. rows 2 and 3 are corrupt. They contain more than one agent name
> (row 2 contains 4 names, row 3 contains 2 names).
I hope the explanation and answer below helps.
. . . . . . . . . Ken
t:\ftemp>type arthur.xml
<table>
  ...
  <row>
    <applicantNumber>56789</applicantNumber>
    <agent>John1</agent>
    <agent>John1</agent>
    <agent>John1</agent>
    <agent>John1</agent>
  </row>
  ...
  <row>
    <applicantNumber>127789</applicantNumber>
    <agent>John27</agent>
    <agent>John1</agent>
    <agent>Fred13</agent>
    <agent>John27</agent>
    <agent>John27</agent>
    <agent>John27</agent>
    <agent>Paul8</agent>
    <agent>John27</agent>
  </row>
  ...
  <row>
    <applicantNumber>16789345</applicantNumber>
    <agent>Fred9</agent>
    <agent>Fred9</agent>
    <agent>Fred9</agent>
    <agent>John1</agent>
    <agent>Fred9</agent>
    <agent>Fred9</agent>
    <agent>Fred9</agent>
    <agent>Fred9</agent>
    <agent>Fred9</agent>
  </row>
  ...
</table>
t:\ftemp>type arthur.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
<xsl:output method="text"/>
<xsl:template match="/">
  <xsl:for-each select="table/row">
    <xsl:value-of select="applicantNumber"/>: <xsl:text/>
    <xsl:if test="agent[1] != agent[ position()>1 ]">corrupt</xsl:if>
    <xsl:text>
</xsl:text>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
t:\ftemp>xslt arthur.xml arthur.xsl con
56789:
127789: corrupt
16789345: corrupt
t:\ftemp>
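Since printing is to depend on user choice, one possibility is to move
the test into the select so that only the corrupt rows are visited at
all. A sketch along those lines, assuming the same table/row/agent
structure as in the example input (I have not timed it against a large
file):

<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
<xsl:output method="text"/>
<xsl:template match="/">
  <!-- visit only rows whose first agent differs from some later agent -->
  <xsl:for-each select="table/row[ agent[1] != agent[ position()>1 ] ]">
    <xsl:value-of select="applicantNumber"/>
    <xsl:text>
</xsl:text>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>

For the three rows shown in the example this would list only 127789
and 16789345, one per line.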
--
G. Ken Holman mailto:gkholman(_at_)CraneSoftwrights(_dot_)com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/
Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (F:-0995)