xsl-list

Re: [xsl] Building and re-using an index gradually as multiple inter-related files get transformed

2011-05-09 10:37:51
That was indeed the simplification.
Instead of reading XML documents, I call a REST web service (not my
own) with the EXPath HTTP client.
The workflow is:
- I send a GET request to retrieve a list of objects of one type
- I modify the XML response payload to remove the identifiers (and
change some values)
- I send a PUT request with the modified payload
- The web service responds with a new XML payload containing the
submitted objects with their new identifiers (which are GUIDs assigned
randomly, i.e. they cannot be "guessed" from the properties of the
object).

The mapping index is therefore created by matching the first response
to the second one and extracting the identifiers from both.
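For what it's worth, matching the two responses could be sketched in XSLT 2.0 roughly as follows. This assumes both payloads list the objects in the same order, and uses purely illustrative names (the `object` element, its `id` attribute, and the `entry` index format are placeholders, not the real schema):

```xml
<!-- Sketch: build an old-ID -> new-ID index by positional matching.
     $first-response / $second-response hold the two payloads;
     element/attribute names here are assumptions, not the real schema. -->
<xsl:variable name="index" as="element(entry)*">
  <xsl:for-each select="$first-response//object">
    <xsl:variable name="pos" select="position()"/>
    <entry old="{@id}" new="{($second-response//object)[$pos]/@id}"/>
  </xsl:for-each>
</xsl:variable>
```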

Ideally, I would like to avoid using anything but XSLT to solve this,
if possible.

Fabre Lambeau


On 9 May 2011 16:21, Michael Kay <mike(_at_)saxonica(_dot_)com> wrote:
You haven't said how the new identifiers are generated (where do 434 and
2526 come from?).

The functional solution to this is to recognize that there is a function
f(oldID) -> newID that translates old identifiers to new identifiers. You
just need to call this function every time you want to do the translation
(not just the first time), and ensure of course that the function always
returns the same newID when given the same oldID.
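A minimal sketch of such a function, assuming the old/new pairs have already been collected into a global `$index` variable of hypothetical `<entry old="..." new="..."/>` elements (the `f` prefix and the index format are assumptions, not part of the original problem statement):

```xml
<!-- Sketch: f(oldID) -> newID as a pure lookup over a pre-built index.
     Returns the same newID for the same oldID every time. -->
<xsl:function name="f:new-id" as="xs:string">
  <xsl:param name="old-id" as="xs:string"/>
  <xsl:sequence select="$index[@old eq $old-id]/@new/string()"/>
</xsl:function>
```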

Now, how do you implement this function efficiently? I can't tell you,
because you haven't told us anything about it.

Michael Kay
Saxonica


On 09/05/2011 15:35, Fabre Lambeau wrote:

Hi!
I'm after advice on how to build an "indexing" solution using XSLT 2.0.

Here is my use case (simplified a bit).
I have a number of XML files to "translate"/"re-map" into a second set
of XML files. For each input file, there will be a single output file
(1-to-1 relationship).
Each document lists a series of objects and their properties. This
"translation" consists of changing the identifier (GUID) of each
object in the source file.
However, some of the documents list objects that reference other
objects (dependencies). While "translating", therefore, I need to keep
an index/dictionary of old-vs-new identifiers, so that all
dependencies remain valid in the new set of files, and so that there
is no overlap between original and new identifiers for any object.

Example (simplified, assume an XML representation)

SOURCE FILES
Fruits.xml
  Name=Apple, ID=1
  Name=Orange, ID=2
People.xml
  Name=Bob, ID=A
  Name=Marie, ID=B
Preferences.xml
  ID=Y, PersonID=A, FruitID=1
  ID=Z, PersonID=B, FruitID=1

TARGET FILES
Fruits.xml
  Name=Apple, ID=R
  Name=Orange, ID=T
People.xml
  Name=Bob, ID=434
  Name=Marie, ID=2526
Preferences.xml
  ID=G67, PersonID=434, FruitID=R
  ID=E43, PersonID=2526, FruitID=R

The real data is obviously far more complex than this example, with
dozens of files and intricate dependencies. I do know the object
model, however, and therefore which objects have dependencies and the
direction of every dependency. I can thus order the file
transformations so that no file is processed before all the objects it
depends on have been translated. BTW, I have no control over the
identifiers themselves (they are generated by a separate system).

I could obviously run each transformation one at a time, and each time
load the already-processed source and target files to build the
mapping index. However, I'm after a way to do this in one single
transformation.
The reason I'm stuck (mentally) is the following:
- Using XSLT 2.0, I could use xsl:result-document to create the
target files. However, I believe I would not be able to read them back
in the same transformation (in order to look up identifiers in them
when processing dependencies).
- A variable, once defined, cannot be modified. I would therefore not
be able to create a global "index" of sorts and keep adding to it as I
would in a procedural language.

What would be the best way to go about this?  A recursive template
that after each step passes the index generated at the previous step
and augments it?  Would I not run into performance problems when
treating hundreds of large source files?
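The recursive approach could be sketched along these lines, assuming a pre-computed sequence of file URIs already sorted in dependency order. All names here ($files, f:process-file, the `<entry>` index format) are hypothetical, and the actual writing of each output file (via xsl:result-document, which cannot appear inside an xsl:function in XSLT 2.0) is omitted:

```xml
<!-- Sketch: process files one by one, threading the growing index
     through the recursion as a parameter. f:process-file is assumed
     to translate one file and return the new old/new mapping entries. -->
<xsl:template name="process-files">
  <xsl:param name="files" as="xs:string*"/>
  <xsl:param name="index" as="element(entry)*"/>
  <xsl:if test="exists($files)">
    <xsl:variable name="new-entries" as="element(entry)*"
                  select="f:process-file($files[1], $index)"/>
    <xsl:call-template name="process-files">
      <xsl:with-param name="files" select="$files[position() gt 1]"/>
      <xsl:with-param name="index" select="($index, $new-entries)"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
```

Since each step only appends a few entries to a sequence, the main performance question is the cost of the linear lookups in `$index`, not the recursion itself.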

--
Fabre Lambeau

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or 
e-mail:<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



