Re: [xsl] Building and re-using an index gradually as multiple inter-related files get transformed

2011-05-09 10:21:44
You haven't said how the new identifiers are generated (where do 434 and 2526 come from?).

The functional solution to this is to recognize that there is a function f(oldID) -> newID that translates old identifiers to new identifiers. You just need to call this function every time you want to do the translation (not just the first time), and ensure of course that the function always returns the same newID when given the same oldID.

Now, how do you implement this function efficiently? I can't tell you, because you haven't told us anything about it.
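
Purely to illustrate the shape such a function could take, here is a minimal XSLT 2.0 sketch. It rests on assumptions that are not in the original question: that the old/new identifier pairs produced by the separate system are available in a lookup document (hypothetically named id-map.xml, with entry elements carrying old and new attributes), and that identifiers appear in the source files as ID, PersonID and FruitID attributes. The f:new-id function, the key name and the out/ directory are likewise invented for the example.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:id-map"
    exclude-result-prefixes="xs f">

  <!-- ASSUMPTION: a lookup document of the form
       <map><entry old="A" new="434"/> ... </map>
       supplied by whatever system generates the new identifiers. -->
  <xsl:variable name="id-map" select="doc('id-map.xml')"/>

  <!-- Index the lookup document by old identifier. -->
  <xsl:key name="by-old-id" match="entry" use="@old"/>

  <!-- f(oldID) -> newID: returns the same result for the same input on
       every call. Returns '' if the old identifier is not in the map. -->
  <xsl:function name="f:new-id" as="xs:string">
    <xsl:param name="old-id" as="xs:string"/>
    <xsl:sequence
        select="string(key('by-old-id', $old-id, $id-map)/@new)"/>
  </xsl:function>

  <!-- Entry point: one output file per input file, in a single run.
       The file names are the ones from the example in the question. -->
  <xsl:template name="main">
    <xsl:for-each select="('Fruits.xml', 'People.xml', 'Preferences.xml')">
      <xsl:result-document href="out/{.}">
        <xsl:apply-templates select="doc(.)"/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

  <!-- Copy everything unchanged by default. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- ASSUMPTION: identifiers live in these attributes. Definitions
       and references go through the same function. -->
  <xsl:template match="@ID | @PersonID | @FruitID">
    <xsl:attribute name="{name()}" select="f:new-id(string(.))"/>
  </xsl:template>

</xsl:stylesheet>
```

Because every identifier, whether it defines an object or refers to one, is translated by the same function, the stylesheet never needs to read back a file it has already written, and the order in which the files are processed does not matter. Whether a lookup document like this is realistic depends entirely on how the new identifiers are actually generated.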

Michael Kay
Saxonica


On 09/05/2011 15:35, Fabre Lambeau wrote:
Hi!
I'm after advice on how to build an "indexing" solution using XSLT 2.0.

Here is my use case (simplified a bit).
I have a number of XML files to "translate"/"re-map" into a second set
of XML files. For each input file, there will be a single output file
(1-to-1 relationship).
Each document lists a series of objects and their properties. This
"translation" consists of changing the identifier (GUID) of each
object in the source file.
However, some of the documents list objects that reference other
objects (dependencies). Whilst "translating" therefore, I need to keep
an index/dictionary of the old-vs-new identifiers, so that all
dependencies remain valid in the new set of files, but that there is
no overlap between original and new identifiers for any object.

Example (simplified, assume an XML representation)

SOURCE FILES
Fruits.xml
   Name=Apple, ID=1
   Name=Orange, ID=2
People.xml
   Name=Bob, ID=A
   Name=Marie, ID=B
Preferences.xml
   ID=Y, PersonID=A, FruitID=1
   ID=Z, PersonID=B, FruitID=1

TARGET FILES
Fruits.xml
   Name=Apple, ID=R
   Name=Orange, ID=T
People.xml
   Name=Bob, ID=434
   Name=Marie, ID=2526
Preferences.xml
   ID=G67, PersonID=434, FruitID=R
   ID=E43, PersonID=2526, FruitID=R

The example is obviously far more complex, with dozens of files and
complex dependencies. I know however the object model, and therefore
what objects have dependencies, and the direction of all dependencies.
I can therefore order the file transformations so that no file is
processed until every object it depends on has already been
translated. BTW, I have no control over the identifiers themselves
(they are generated by a separate system).

I could obviously process each transformation one at a time, and every
time load the relevant source and target files already processed to
create the mapping index. However, I'm after a way to do this in one
single transformation.
The reason I'm stuck (mentally) is the following:
- Using XSLT 2.0, I could use xsl:result-document to create the
target files. However, I believe I would not be able to load them in
the same transformation again (in order to do a lookup in them as
necessary when treating dependencies)
- A variable, once defined, cannot be modified. I would therefore not
be able to create a global "index" of sorts and keep adding to it as I
would in a procedural language.

What would be the best way to go about this?  A recursive template
that after each step passes the index generated at the previous step
and augments it?  Would I not run into performance problems when
treating hundreds of large source files?

--
Fabre Lambeau

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



