
RE: [xsl] building a hierarchical classification out of flat and redundant data

2006-07-24 04:43:59
The key to this is to do a depth-first recursive traversal of the tree by
starting with the root, and for each node processing its children using
xsl:apply-templates, but with one key difference: you need to select the
logical root and the logical children, rather than the XML root and the XML
children.

I'm not sure how you identify the logical root in your structure. The
logical children of a node N are those nodes that have a child element equal
(in name and value) to every child element of N. In XSLT 2.0, that is:

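<xsl:template match="document">
  <xsl:variable name="this" select="."/>
  <!-- count(*) test: keep only nodes one level deeper (one more code/name
       pair), so $this and its deeper descendants are not selected again -->
  <xsl:apply-templates select="//document[
     count(*) = count($this/*) + 2 and
     (every $c in $this/* satisfies
        (some $d in * satisfies deep-equal($c, $d)))]"/>
</xsl:template>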


It's a bit more difficult in 1.0, but I hope you get the idea. 
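
For XSLT 1.0, where quantified expressions and deep-equal() are not
available, one possible shape is sketched below. It is only an illustration
and rests on several assumptions that are not in the original post: the
documents sit in a wrapper element, here called documents; every level
contributes exactly one code element followed by one name element; the codes
are unique across the classification; and every ancestor has a document of
its own. The key name by-parent and the classification output wrapper are
likewise made up for the example. Under those assumptions, the logical roots
are the documents with a single code/name pair, and the logical children of a
node are the documents whose parent-level code equals that node's own code.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Index every document by the code of its logical parent.
       With children laid out as code, name, code, name, ... the parent's
       code is the fourth element from the end. Root documents have only
       two children, so they produce no key value and are not indexed. -->
  <xsl:key name="by-parent" match="document" use="*[last() - 3]"/>

  <!-- "documents" is an assumed wrapper element around the flat list. -->
  <xsl:template match="/documents">
    <classification>
      <!-- Logical roots: documents with a single code/name pair. -->
      <xsl:apply-templates select="document[count(*) = 2]"/>
    </classification>
  </xsl:template>

  <!-- Depth-first traversal over logical children, not XML children. -->
  <xsl:template match="document">
    <!-- Own code is the second-to-last child, own name is the last. -->
    <node id="{*[last() - 1]}" name="{*[last()]}">
      <xsl:apply-templates select="key('by-parent', *[last() - 1])"/>
    </node>
  </xsl:template>

</xsl:stylesheet>
```

Run against the three sample documents wrapped in a documents element, this
should produce the nested node structure asked for in the question. If a
level has no document of its own, its subtree is never reached, so that case
would need extra handling.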

Michael Kay
http://www.saxonica.com/  



-----Original Message-----
From: Georg Hohmann [mailto:georg(_dot_)hohmann(_at_)gmail(_dot_)com] 
Sent: 24 July 2006 11:43
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] building a hierarchical classification out of 
flat and redundant data

Dear XSLT-Community,

I have a problem with a "strange" type of data which I have 
to convert to a hierarchical XML structure.

My source is a huge XML file which represents a decimal 
classification. It contains so-called documents, where each 
document represents one node of the classification. 
Furthermore, each document also lists the ancestors of its 
node. It's a structure like this (example taken from 
http://www.udcc.org):
...
<document>
      <tag1>3</tag1>
      <tag1a>Social Sciences</tag1a>
</document>
<document>
      <tag1>3</tag1>
      <tag1a>Social Sciences</tag1a>
      <tag2>32</tag2>
      <tag2a>Politics</tag2a>
</document>
<document>
      <tag1>3</tag1>
      <tag1a>Social Sciences</tag1a>
      <tag2>32</tag2>
      <tag2a>Politics</tag2a>
      <tag3>326</tag3>
      <tag3a>Slavery</tag3a>
</document>
...
As you can see, there is no hierarchical information in it 
apart from the names and the sequence of the tags. In my real 
data I have up to 9 levels, but not in every case. My result 
should look like this (or something similar):
...
<node id="3" name="Social Sciences">
   <node id="32" name="Politics">
      <node id="326" name="Slavery"/>
   </node>
</node>
...
I simply have no idea where to start in order to achieve this 
result. I guess the first step would be to get rid of all 
the redundant content, but I don't know how. And I can't 
even figure out how to build the hierarchical structure at 
the same time.

Has anyone a good starting point for this?

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


