xsl-list

RE: Converting a hierarchical to a pseudo relational xml format -or - adding to the middle of a result tree

2003-03-26 09:22:56
Hi Hugh,
How about going about it this way...

In the match="/" template, first apply-templates to "//coord" elements;
then to "//line" elements; then to "//chain" elements.
Have one template each for these three kinds of elements.

In the match="coord" template, produce the <Coords><Coord ...>...</Coords>
output, using generate-id() for the num= attribute.
Oops, generate-id() can only be applied to nodes the stylesheet can
select, i.e. nodes in the source document, and the coords split out of a
multi-coord will exist only in the result tree.  I see what you were
getting at.
Hmm.  Possible solution: have a two-step transformation.  The first step
transforms multi-coords (e.g. val="11 12 13 14 15 16") into a series of
single coords, so that the resulting data is simpler to process.  (For the
two steps, either use two XSL stylesheets, or use an extension node-set()
function to turn the result of the first transformation into a form the
second step can use.  But as you note below, the intermediate result may
be huge.)
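
For what it's worth, here is a rough, untested sketch of what the first
step might look like in XSLT 1.0.  It assumes the numbers in val= are
separated by single spaces and always come in pairs (a trailing odd
number would be silently dropped), and "split-pairs" is just a name I
made up for the recursive helper template:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not a coord unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace each coord with one coord per pair of numbers. -->
  <xsl:template match="coord">
    <xsl:call-template name="split-pairs">
      <xsl:with-param name="rest" select="normalize-space(@val)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: takes "x y" off the front of $rest,
       emits a coord for it, then recurses on what is left. -->
  <xsl:template name="split-pairs">
    <xsl:param name="rest"/>
    <xsl:if test="contains($rest, ' ')">
      <xsl:variable name="x" select="substring-before($rest, ' ')"/>
      <xsl:variable name="after-x" select="substring-after($rest, ' ')"/>
      <!-- Appending a space lets substring-before find the last number too. -->
      <xsl:variable name="y"
          select="substring-before(concat($after-x, ' '), ' ')"/>
      <coord val="{$x} {$y}"/>
      <xsl:call-template name="split-pairs">
        <xsl:with-param name="rest" select="substring-after($after-x, ' ')"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

With this, <coord val="11 12 13 14 15 16"/> should come out as three
coords with val="11 12", val="13 14" and val="15 16", and a coord that
already holds a single pair is copied through as one coord.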

Assuming the first step has been done, then in the second step, the
match="coord" template can use generate-id() for the num= attribute.
(Or it could use <xsl:number> if you want numbers, though that is slower
for large documents.  Either way, the first-step transformation has to
be done first.)

Then the match="line" template can simply use the same technique,
generate-id() or <xsl:number>, to generate references to the
constituent coords of each line, and to generate num= attributes
for the lines.
Similarly for the match="chain" template.
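
To make that a bit more concrete, here is an untested sketch of the
second step using <xsl:number>.  It assumes the input is the output of
the first step (one pair per coord) wrapped in some single root element,
and the <Output> wrapper is my own invention so that the result is
well-formed.  Note that it numbers every coord occurrence, so a value
that appears twice (like "3 4" in your sample) would get two numbers
rather than sharing one; sharing would take something more, such as an
<xsl:key> on @val.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- "Output" is a made-up wrapper so the result has a single root. -->
    <Output>
      <Coords><xsl:apply-templates select="//coord"/></Coords>
      <Lines><xsl:apply-templates select="//line"/></Lines>
      <Chains><xsl:apply-templates select="//chain"/></Chains>
    </Output>
  </xsl:template>

  <!-- level="any" counts every coord up to and including this one,
       in document order, regardless of nesting depth. -->
  <xsl:template match="coord">
    <Coord>
      <xsl:attribute name="num">
        <xsl:number level="any" count="coord"/>
      </xsl:attribute>
      <xsl:attribute name="val">
        <xsl:value-of select="@val"/>
      </xsl:attribute>
    </Coord>
  </xsl:template>

  <!-- A line lists the numbers of its own coords, space-separated. -->
  <xsl:template match="line">
    <Line>
      <xsl:attribute name="num">
        <xsl:number level="any" count="line"/>
      </xsl:attribute>
      <xsl:attribute name="coordVals">
        <xsl:for-each select="coord">
          <xsl:if test="position() &gt; 1"><xsl:text> </xsl:text></xsl:if>
          <xsl:number level="any" count="coord"/>
        </xsl:for-each>
      </xsl:attribute>
    </Line>
  </xsl:template>

  <!-- A chain lists the numbers of its own lines the same way. -->
  <xsl:template match="chain">
    <Chain>
      <xsl:attribute name="num">
        <xsl:number level="any" count="chain"/>
      </xsl:attribute>
      <xsl:attribute name="lineVals">
        <xsl:for-each select="line">
          <xsl:if test="position() &gt; 1"><xsl:text> </xsl:text></xsl:if>
          <xsl:number level="any" count="line"/>
        </xsl:for-each>
      </xsl:attribute>
    </Chain>
  </xsl:template>

</xsl:stylesheet>
```

If you don't care about small integers, replacing each <xsl:number .../>
with <xsl:value-of select="generate-id()"/> should give the same
cross-references with opaque string ids instead.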

Does that get you going?

More comments below...

-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com]On Behalf Of 
Hugh Dixon
Sent: Saturday, March 22, 2003 12:39 AM
To: XSL-List(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Converting a hierarchical to a pseudo relational xml
format -or - adding to the middle of a result tree


I am wanting to convert this:

  <chain>
    <line>
      <coord val="1 2"/>
      <coord val="3 4"/>
    </line>
    <line>
      <coord val="3 4"/>
      <coord val="5 6"/>
      <coord val="7 8"/>
    </line>
  </chain>
  <coord val="9 10"/>
  <coord val="11 12 13 14 15 16"/>
  <line>
    <coord val="17 18"/>
    <coord val="19 20 21 22 23 24"/>
  </line>


Into this:

<Coords>
  <Coord num = "1" val ="1 2"/>
  <Coord num = "2" val ="3 4"/>
  <Coord num = "3" val ="5 6"/>
  <Coord num = "4" val ="7 8"/>
  <Coord num = "5" val ="9 10"/>
  <Coord num = "6" val ="11 12"/>
  <Coord num = "7" val ="13 14"/>
  <Coord num = "8" val ="15 16"/>
  <Coord num = "9" val ="17 18"/>
  <Coord num = "10" val ="19 20"/>
  <Coord num = "11" val ="21 22"/>
  <Coord num = "12" val ="23 24"/>
</Coords>
<Lines>
  <Line num="1" coordVals="1 2"/>
  <Line num="2" coordVals="2 3 4"/>
  <Line num="3" coordVals="9 10 11 12"/>
</Lines>
<Chains>
  <Chain num="1" lineVals="1 2"/>
</Chains>

Sorry it's so long......
The numbers in coord/@val can be any floating point values.  The @num
values in the output document are integers that are generated by
the style sheet.  They are effectively primary keys, which I would like
to keep small, although this is not essential.

As mentioned above, <xsl:number> will give you numbers, though on large
documents it will take longer to compute than generate-id() (which will
give you strings, typically about 7 characters long).

In the result, each coord, line and chain is given a unique number and a
list of references to its constituent elements (i.e. Chains to Lines, and
Lines to Coords).

To perform this transformation, I'd like to be able to walk the source
tree, writing Coord, Line and Chain elements as required, also doing
counts to get my next number.  Is this possible?  I suspect not.

As you know, the "doing counts" part is problematic, if you mean
"incrementing variables."  But generate-id() or <xsl:number> can help here.

I think trying to do all this with a single walk of the
source tree makes it hard; but fortunately in XSLT you don't have
to tell it how to traverse the tree.  You just think in terms of
how each type of element in the tree should be treated.

My other idea was that I write all the Coords first (as I understand you
should with XSLT!), but leave an identifier(s) on each coord node as I
process it, from which I could retrieve the Coord/@num values when the
<Line> elements are written, and similarly for the next level up.
I cannot see how to do either of these easily.

Hopefully what I wrote above addresses this.  The primary difficulty,
I think, would be retrieving id's from the result tree.  That's why I
would find it easier to do it with a two-step transformation.
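
If you would rather keep both steps in one stylesheet, the node-set()
route might look roughly like the untested sketch below.  It assumes
your processor supports the EXSLT exsl:node-set() function (some
processors offer an equivalent only under their own vendor namespace),
and the mode names "pass1" and "pass2" are just labels I picked.  The
pass-1 template here is only a plain copy standing in for the real
multi-coord splitting.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <xsl:template match="/">
    <!-- Pass 1: build the simplified tree as a result tree fragment. -->
    <xsl:variable name="pass1">
      <xsl:apply-templates select="node()" mode="pass1"/>
    </xsl:variable>
    <!-- Pass 2: convert the fragment to a node-set so its nodes can be
         selected, matched and given ids like source nodes. -->
    <Coords>
      <xsl:apply-templates select="exsl:node-set($pass1)//coord"
                           mode="pass2"/>
    </Coords>
  </xsl:template>

  <!-- Pass 1 placeholder: copies the input unchanged. -->
  <xsl:template match="@*|node()" mode="pass1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="pass1"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 2: generate-id() now applies to the intermediate nodes. -->
  <xsl:template match="coord" mode="pass2">
    <Coord num="{generate-id()}" val="{@val}"/>
  </xsl:template>

</xsl:stylesheet>
```

The catch is the one you raised: the whole intermediate tree sits in the
$pass1 variable, so for very large inputs two separate stylesheets with
a file in between may be the safer choice.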

Using the first of the two ideas, I could write my output into a large
string/variable, with which I am not restricted to writing to the end.

I'm not sure I understand that last clause.

As the output files may end up being hundreds of megabytes in 
size, I am not sure this is really practical.

With a two-step process, the output of the first step goes into a file
instead of having to hold it in memory, which may help.
Though it may all end up in memory during the second step anyway...
:-/

Could someone please offer me an alternative solution?

Thanks,
Hugh Dixon

Please let me know if this helps.

Lars


Blind unbelief is sure to err,
And scan his work in vain;
God is his own interpreter,
And he will make it plain.
        - William Cowper


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


