xsl-list

Re: [xsl] XSLT2, collection(), and xsl:key

2008-02-01 10:38:39
Create a variable that contains the element counts for each document,
something like:

<xsl:variable name="foo">
  <xsl:for-each select="collection()">
    <doc name="{document-uri(.)}">
      <xsl:for-each-group select="//*" group-by="name()">
        <elem name="{current-grouping-key()}" count="{count(current-group())}"/>
      </xsl:for-each-group>
    </doc>
  </xsl:for-each>
</xsl:variable>

That will give you:

<doc name="doc1.xml">
  <elem name="foo" count="20"/>
  <elem name="bar" count="44"/>
</doc>
<doc name="doc2.xml">
  <elem name="baz" count="1"/>
   ...


Then just use grouping again to generate the report.
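
As a rough sketch of that second step (not taken from the original message), the stylesheet below groups the elem entries by name to get the column headings, then writes one row per doc. The inline $foo is a hypothetical stand-in for the variable built above, and the table/row/cell markup is assumed from the question; a missing count is reported as 0.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical stand-in for the $foo variable built from collection() -->
  <xsl:variable name="foo">
    <doc name="doc1.xml">
      <elem name="foo" count="20"/>
      <elem name="bar" count="44"/>
    </doc>
    <doc name="doc2.xml">
      <elem name="baz" count="1"/>
    </doc>
  </xsl:variable>

  <xsl:template name="main">
    <!-- Group again: one sorted column heading per distinct element name -->
    <xsl:variable name="names" as="xs:string*">
      <xsl:for-each-group select="$foo/doc/elem" group-by="@name">
        <xsl:sort select="current-grouping-key()"/>
        <xsl:sequence select="string(current-grouping-key())"/>
      </xsl:for-each-group>
    </xsl:variable>
    <table>
      <row rend="label">
        <cell>document</cell>
        <xsl:for-each select="$names">
          <cell><xsl:value-of select="."/></cell>
        </xsl:for-each>
      </row>
      <!-- One row per document; fall back to 0 when a name is absent -->
      <xsl:for-each select="$foo/doc">
        <xsl:variable name="doc" select="."/>
        <row>
          <cell><xsl:value-of select="@name"/></cell>
          <xsl:for-each select="$names">
            <cell>
              <xsl:value-of select="($doc/elem[@name = current()]/@count, 0)[1]"/>
            </cell>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

With Saxon this can be run from the named template main, so no source document is needed.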

cheers
andrew


On 01/02/2008, James Cummings <cummings(_dot_)james(_at_)gmail(_dot_)com> wrote:
Hiya,

I'm using the collection() function and Saxon to produce some
statistics about how many elements of each type occur in a
particular set of documents.

Let's say that document one has something like:

<p xml:id="doc1" type="hypothetical">
There is some text with <seg type="foo">some foo</seg> and
occasionally <seg type="blort">blort</seg> and <other
type="wibble">wibble</other></p>


and document two (and up to some really large number) is like:

<p xml:id="doc2">
There is another doc with <seg type="foo">some foo</seg> and
occasionally <seg type="notBlort">notBlort</seg> and <other
type="fluffy">fluffy other</other> and <some
  name="thing">someThing</some></p>

What I want to produce are tables of counts of specific elements, by
document and type. So something like the following (though using
table/row/cell xml markup):


table: other
document | fluffy | wibble | stuff
doc1 | 0 | 1 | 0
doc2 | 1 | 0 | 0
doc3 | 20 | 12 | 54

table: seg
document | blort | foo | notBlort
doc1 | 1 | 1 | 0
doc2 | 0 | 1 | 1
doc3 | 23 | 44 | 58

table: some
document | thing | else | now
doc1 | 0 | 0 | 0
doc2 | 1 | 0 | 0
doc3 | 12 | 5 | 24

I can build this manually (and for one element I have done so) by doing:

<xsl:variable name="docs" select="collection('../../working/xml/docs.xml')"/>
<xsl:template name="main">
<table><head>seg by type</head>
<row rend="label">
<cell>document</cell>
<cell>blort</cell>
<cell>foo</cell>
<cell>notBlort</cell>
</row>
<xsl:for-each select="$docs//p"> <!-- let's pretend p is the root element -->
<row>
<xsl:variable name="doc" select="@xml:id"/>
<cell><xsl:value-of select="$doc"/></cell>
<cell><xsl:value-of select="count(.//seg[@type='blort'])"/></cell>
<cell><xsl:value-of select="count(.//seg[@type='foo'])"/></cell>
<cell><xsl:value-of select="count(.//seg[@type='notBlort'])"/></cell>
</row>
</xsl:for-each>
</table>
</xsl:template>

But that isn't really the point now is it?  I tried to use <xsl:key>
but I ran into the problem of it not liking the collection() function
as part of the match.
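
For what it's worth, a key does not need collection() in its match pattern: the pattern names only the nodes to index, and in XSLT 2.0 the three-argument form of key() restricts the lookup to one tree. The sketch below assumes that is roughly what was being attempted; the key name seg-by-type is made up for illustration, and p is again treated as the root element.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:variable name="docs" select="collection('../../working/xml/docs.xml')"/>

  <!-- The pattern matches seg elements in whichever document is searched -->
  <xsl:key name="seg-by-type" match="seg" use="@type"/>

  <xsl:template name="main">
    <table>
      <xsl:for-each select="$docs">
        <row>
          <cell><xsl:value-of select="p/@xml:id"/></cell>
          <!-- Third argument: search only the tree of the current document -->
          <cell><xsl:value-of select="count(key('seg-by-type', 'foo', .))"/></cell>
        </row>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

This still needs the attribute values to be known in advance, so it only addresses the key problem, not the table-building one.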

What I want to do is be able to say for-each doc, build me a table of
all the (let's pretend unknown) values of this attribute on this
element.  So something like:

<xsl:for-each select="$docs//p">
<xsl:value-of select="my:function(other/@type, seg/@type, thing/@name,
new/@type)"/>
</xsl:for-each>

and, without knowing the values of @type in advance, it makes a table
of them like the ones above (using distinct-values()?) and counts
their occurrences.
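
One way this could look, as a sketch only: a stylesheet function that takes the per-document root elements, an element name, and an attribute name, uses distinct-values() for the column headings, and counts matches per document. The my: namespace URI, the function name my:table, and its three-parameter signature are assumptions made for illustration, not an existing API.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.org/my"
    exclude-result-prefixes="xs my">

  <xsl:variable name="docs" select="collection('../../working/xml/docs.xml')"/>

  <!-- Hypothetical helper: $roots are the root elements of the documents,
       $elem is the element name to count, $attr the attribute to split by -->
  <xsl:function name="my:table" as="element(table)">
    <xsl:param name="roots" as="element()*"/>
    <xsl:param name="elem" as="xs:string"/>
    <xsl:param name="attr" as="xs:string"/>
    <!-- Column headings: the sorted distinct attribute values, not known in advance -->
    <xsl:variable name="values" as="xs:string*">
      <xsl:for-each
          select="distinct-values($roots//*[name() = $elem]/@*[name() = $attr])">
        <xsl:sort select="."/>
        <xsl:sequence select="string(.)"/>
      </xsl:for-each>
    </xsl:variable>
    <table>
      <head><xsl:value-of select="$elem, 'by', $attr"/></head>
      <row rend="label">
        <cell>document</cell>
        <xsl:for-each select="$values">
          <cell><xsl:value-of select="."/></cell>
        </xsl:for-each>
      </row>
      <!-- One row per document, one count per distinct value -->
      <xsl:for-each select="$roots">
        <xsl:variable name="root" select="."/>
        <row>
          <cell><xsl:value-of select="@xml:id"/></cell>
          <xsl:for-each select="$values">
            <cell>
              <xsl:value-of
                  select="count($root//*[name() = $elem][@*[name() = $attr] = current()])"/>
            </cell>
          </xsl:for-each>
        </row>
      </xsl:for-each>
    </table>
  </xsl:function>

  <xsl:template name="main">
    <!-- As in the post, p is assumed to be the root element of each document -->
    <xsl:sequence select="my:table($docs/p, 'seg', 'type')"/>
    <xsl:sequence select="my:table($docs/p, 'other', 'type')"/>
    <xsl:sequence select="my:table($docs/p, 'some', 'name')"/>
  </xsl:template>
</xsl:stylesheet>
```

Each call rescans every document, which is simple but may be slow for a very large collection; building an intermediate variable of counts first would avoid the repeated scans.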

This is a case where I know it must be possible, and I could just go
and do it manually, (in reality there are about 10 elements with a
number of attributes, with around 20 values each), but it just seems
*wrong* to do it that way. ;-)

Suggestions?

Thanks,

-James

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




-- 
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/

