Re: [xsl] How to design XPath queries and XSLT code that can be readily repurposed?

2019-08-03 02:46:13
Designing software with potential for change is pretty much what 90% of 
software engineering is all about. There will always be some changes in 
requirements that can be accommodated easily and some that can't.

I don't have time today to look at this specific example, but one of the best 
techniques for this in the XML world is pipelining: if you split the work into 
a series of transformations, each doing one small task, then it's very often 
possible to accommodate new requirements by adding extra steps to the pipeline, 
or by confining the changes to one of the steps.
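
A minimal sketch of what such a pipeline can look like inside a single
stylesheet, assuming XSLT 3.0 and rest-stops.xml (quoted below) as the source
document. The mode names stage1 and stage2 and the rest-stop element name are
illustrative, not taken from the original code. Each stage does one small job,
and the second stage reads the first stage's output rather than the raw input:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="3.0">

   <!-- Both stages copy everything unchanged unless a template overrides. -->
   <xsl:mode name="stage1" on-no-match="shallow-copy"/>
   <xsl:mode name="stage2" on-no-match="shallow-copy"/>

   <xsl:template match="/">
      <!-- Stage 1: capture the result as a temporary document. -->
      <xsl:variable name="after-stage1">
         <xsl:apply-templates select="." mode="stage1"/>
      </xsl:variable>
      <!-- Stage 2: runs over the output of stage 1. -->
      <xsl:apply-templates select="$after-stage1" mode="stage2"/>
   </xsl:template>

   <!-- Stage 1 task: drop rest stops that have no gas station. -->
   <xsl:template match="row[not(Gas)]" mode="stage1"/>

   <!-- Stage 2 task: rename row to rest-stop (illustrative name). -->
   <xsl:template match="row" mode="stage2">
      <rest-stop>
         <xsl:apply-templates mode="#current"/>
      </rest-stop>
   </xsl:template>

</xsl:stylesheet>
```

A new requirement then tends to mean a new stage, or a changed template in one
stage, while the other stages stay as they are.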

Michael Kay
Saxonica

On 2 Aug 2019, at 19:08, Costello, Roger L. costello(_at_)mitre(_dot_)org 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Hi Folks,

How can I design XPath queries and XSLT code so that a complete rewrite isn't 
required every time there is a small change in requirements? Stated another 
way, how can I design XPath queries and XSLT code that can be readily 
repurposed? Allow me to explain what I mean with a concrete example.

I have three XML files:

1. routes.xml ... this file shows the routes that cars take to travel between 
two cities.
2. rest-stops.xml ... this file shows the rest stops on routes.
3. gas-stations.xml ... this file shows the location of gas stations. Some 
rest stops have gas stations, others don't.

My initial task was to display the gas stations at rest stops on the routes 
from Boston to NYC. I wrote some XPath queries and XSLT code. It worked - yea!

Then I was tasked to display, for each route, the gas stations at rest stops. 
For example, display the gas stations at the rest stops on route I-84, 
display the gas stations at the rest stops on route I-95, and so on.

I was shocked at the amount of re-coding that I had to do for the second 
task. After all, in the first task I had already figured out how to identify 
the rest stops along each route, which rest stops have gas stations, and so 
on. The second task should have been a simple matter of reusing the XPath 
query results from the first task to display the results in a slightly 
different way. But no, I had to do a complete rewrite.

And now I am getting tasked to display the data in still another way - show 
the data as triples (route, rest stop, gas station). Eek! I can already see 
that I will have to do a third rewrite.

Clearly my XPath queries and XSLT code are not suitable for repurposing. I 
remember reading a book long ago about writing code so that a new requirement 
doesn't force a complete rewrite. I wish I could remember what the book's 
solution to the problem was.

How can I design my XPath queries and XSLT code in a way that they can be 
readily repurposed?

Here is routes.xml

<Routes>
   <row>
       <IDENT>I-84</IDENT>
       <Start-End>BOS-NYC</Start-End>
   </row>
   <row>
       <IDENT>I-95</IDENT>
       <Start-End>BOS-NYC</Start-End>
   </row>
   <row>
       <IDENT>RTE-1A</IDENT>
       <Start-End>Bangor-Miami</Start-End>
   </row>
</Routes>

The IDENT element is a primary key and also a foreign key into the rest stop 
file. The rest stops for a route with IDENT i are those rows in 
rest-stops.xml with the Road element equal to i.
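
This foreign-key relationship can be declared once with xsl:key, so that every
later query reuses the same lookup. The following is only a sketch, assuming
XSLT 3.0 and the file names above; the key name rest-stops-by-road and the
output element names are made up for illustration:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="3.0">

   <xsl:output method="xml" indent="yes"/>

   <xsl:variable name="routes" select="doc('routes.xml')"/>
   <xsl:variable name="rest-stops" select="doc('rest-stops.xml')"/>

   <!-- Index the rest-stop rows by the route they lie on. -->
   <xsl:key name="rest-stops-by-road" match="Rest-Stops/row" use="Road"/>

   <xsl:template match="/">
      <Rest-stops-by-route>
         <xsl:for-each select="$routes/Routes/row">
            <Route ident="{IDENT}">
               <!-- The third argument tells key() to search rest-stops.xml
                    rather than the document containing the context node. -->
               <xsl:sequence
                     select="key('rest-stops-by-road', IDENT, $rest-stops)"/>
            </Route>
         </xsl:for-each>
      </Rest-stops-by-route>
   </xsl:template>

</xsl:stylesheet>
```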

Here is rest-stops.xml

<Rest-Stops>
   <row>
       <Name>Willington</Name>
       <Road>I-84</Road>
       <Direction>south</Direction>
       <Gas>BP-Willington-South</Gas>
   </row>
   <row>
       <Name>Willington</Name>
       <Road>I-84</Road>
       <Direction>north</Direction>
       <Gas>BP-Willington-North</Gas>
   </row>
   <row>
       <Name>Stormville</Name>
       <Road>I-84</Road>
       <Direction>north</Direction>
   </row>
   <row>
       <Name>Southington</Name>
       <Road>I-84</Road>
       <Direction>south</Direction>
       <Gas>Exxon-Southington</Gas>
   </row>
   <row>
       <Name>Branford</Name>
       <Road>I-95</Road>
       <Direction>south</Direction>
       <Gas>Mobil-Branford-South</Gas>
   </row>
   <row>
       <Name>Branford</Name>
       <Road>I-95</Road>
       <Direction>north</Direction>
       <Gas>Mobil-Branford-North</Gas>
   </row>
   <row>
       <Name>Darien</Name>
       <Road>I-95</Road>
       <Direction>north</Direction>
   </row>
</Rest-Stops>

The Gas element is a foreign key into the gas station file. The gas station 
at rest stop r is the row in gas-stations.xml whose IDENT element equals the 
Gas element of r.
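
The same lookup can be wrapped in a stylesheet function so that it is written
only once. This is a sketch under assumptions: XSLT 3.0, the file names above,
and a hypothetical function f:gas-station in a made-up namespace
(http://example.org/functions). A rest stop without a Gas element simply
yields an empty sequence:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:f="http://example.org/functions"
   exclude-result-prefixes="f"
   version="3.0">

   <xsl:output method="xml" indent="yes"/>

   <xsl:variable name="rest-stops" select="doc('rest-stops.xml')"/>
   <xsl:variable name="gas-stations" select="doc('gas-stations.xml')"/>

   <!-- Index the gas-station rows by their primary key. -->
   <xsl:key name="gas-station-by-ident" match="Gas-Stations/row" use="IDENT"/>

   <!-- Hypothetical helper: the gas-station row for one rest-stop row.
        Returns an empty sequence when the rest stop has no Gas child. -->
   <xsl:function name="f:gas-station" as="element(row)*">
      <xsl:param name="rest-stop" as="element(row)"/>
      <xsl:sequence
            select="key('gas-station-by-ident', $rest-stop/Gas, $gas-stations)"/>
   </xsl:function>

   <xsl:template match="/">
      <Gas-stations-at-rest-stops>
         <!-- Apply the helper to every rest stop, in document order. -->
         <xsl:sequence select="$rest-stops/Rest-Stops/row ! f:gas-station(.)"/>
      </Gas-stations-at-rest-stops>
   </xsl:template>

</xsl:stylesheet>
```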

Here is gas-stations.xml

<Gas-Stations>
   <row>
       <IDENT>BP-Miami</IDENT>
       <Gas>Shell</Gas>
       <Location>Miami</Location>
   </row>
   <row>
       <IDENT>BP-Washington</IDENT>
       <Gas>BP</Gas>
       <Location>Washington</Location>
   </row>
   <row>
       <IDENT>BP-Willington-North</IDENT>
       <Gas>BP</Gas>
       <Location>Willington</Location>
   </row>
   <row>
       <IDENT>BP-Willington-South</IDENT>
       <Gas>BP</Gas>
       <Location>Willington</Location>
   </row>
   <row>
       <IDENT>Shell-Willington</IDENT>
       <Gas>Shell</Gas>
       <Location>Willington</Location>
   </row>
   <row>
       <IDENT>Exxon-Southington</IDENT>
       <Gas>Exxon</Gas>
       <Location>Southington</Location>
   </row>
   <row>
       <IDENT>Mobil-Branford-North</IDENT>
       <Gas>Mobil</Gas>
       <Location>Branford</Location>
   </row>
   <row>
       <IDENT>Mobil-Branford-South</IDENT>
       <Gas>Mobil</Gas>
       <Location>Branford</Location>
   </row>
</Gas-Stations>

Here are the results for the first task (show the gas stations at rest stops 
on the routes from Boston to NYC):

<Gas-stations-at-rest-stops-on-routes-from-BOS-to-NYC>
   <row>
       <IDENT>BP-Willington-South</IDENT>
       <Gas>BP</Gas>
       <Location>Willington</Location>
   </row>
   <row>
       <IDENT>BP-Willington-North</IDENT>
       <Gas>BP</Gas>
       <Location>Willington</Location>
   </row>
   <row>
       <IDENT>Exxon-Southington</IDENT>
       <Gas>Exxon</Gas>
       <Location>Southington</Location>
   </row>
   <row>
       <IDENT>Mobil-Branford-South</IDENT>
       <Gas>Mobil</Gas>
       <Location>Branford</Location>
   </row>
   <row>
       <IDENT>Mobil-Branford-North</IDENT>
       <Gas>Mobil</Gas>
       <Location>Branford</Location>
   </row>
</Gas-stations-at-rest-stops-on-routes-from-BOS-to-NYC>

In my XSLT program, I stored the three files in three variables:

<xsl:variable name="routes" select="doc('routes.xml')" />
<xsl:variable name="rest-stops" select="doc('rest-stops.xml')" />
<xsl:variable name="gas-stations" select="doc('gas-stations.xml')" />

I selected the rows in $routes that are for BOS-NYC:

<xsl:variable name="BOS-NYC-routes" 
      select="$routes//row[Start-End eq 'BOS-NYC']" 
      as="element(row)*"/>

I queried for the rest stops on those routes:

<xsl:variable name="rest-stops-on-BOS-NYC-routes" as="element(row)*">
   <xsl:for-each select="$BOS-NYC-routes">
       <xsl:variable name="route" select="." as="element(row)" />
       <xsl:sequence select="$rest-stops//row[Road eq $route/IDENT]" />
   </xsl:for-each>
</xsl:variable>
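
As an aside, the same join can be written as one XPath expression, because the
general comparison operator "=" is true when any item on the left equals any
item on the right. A sketch assuming XSLT 3.0 and the file names above (the
wrapper element name is illustrative). One difference to be aware of: the rows
come back in the document order of rest-stops.xml, not grouped route by route,
although for the sample data the two orders coincide:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="3.0">

   <xsl:output method="xml" indent="yes"/>

   <xsl:variable name="routes" select="doc('routes.xml')"/>
   <xsl:variable name="rest-stops" select="doc('rest-stops.xml')"/>

   <xsl:variable name="BOS-NYC-routes"
         select="$routes//row[Start-End eq 'BOS-NYC']"
         as="element(row)*"/>

   <!-- "=" compares Road against every IDENT in $BOS-NYC-routes. -->
   <xsl:variable name="rest-stops-on-BOS-NYC-routes"
         select="$rest-stops//row[Road = $BOS-NYC-routes/IDENT]"
         as="element(row)*"/>

   <xsl:template match="/">
      <Rest-stops-on-BOS-NYC-routes>
         <xsl:sequence select="$rest-stops-on-BOS-NYC-routes"/>
      </Rest-stops-on-BOS-NYC-routes>
   </xsl:template>

</xsl:stylesheet>
```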

Not all the rest stops have gas stations; I queried for those with a Gas 
element:

<xsl:variable name="rest-stops-on-BOS-NYC-routes-with-gas-station" 
      select="$rest-stops-on-BOS-NYC-routes/self::row[Gas]" 
      as="element(row)*"/>

I then used the Gas element as a foreign key into gas-stations.xml:

<xsl:variable name="gas-stations-on-BOS-NYC-routes" as="element(row)*">
   <xsl:for-each select="$rest-stops-on-BOS-NYC-routes-with-gas-station">
       <xsl:variable name="rest-stop" select="." as="element(row)" />
       <xsl:sequence select="$gas-stations//row[IDENT eq $rest-stop/Gas]" />
   </xsl:for-each>
</xsl:variable>

Lastly, I displayed the results (gas stations at rest stops on routes from 
Boston to NYC):

<Gas-stations-at-rest-stops-on-routes-from-BOS-to-NYC>
   <xsl:sequence select="$gas-stations-on-BOS-NYC-routes" />
</Gas-stations-at-rest-stops-on-routes-from-BOS-to-NYC>

The complete XSLT program is shown below. I also show the code to implement 
the second task (display, for each route, the gas stations at rest stops); 
notice that the code is essentially a complete rewrite of the first task. How 
can I design the XPath queries and XSLT code so that I don't have to do a 
complete rewrite every time there is a requirement change?  /Roger

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   exclude-result-prefixes="xs"
   version="3.0">

   <xsl:output method="xml" />

   <xsl:variable name="routes" select="doc('routes.xml')" />
   <xsl:variable name="rest-stops" select="doc('rest-stops.xml')" />
   <xsl:variable name="gas-stations" select="doc('gas-stations.xml')" />

   <xsl:template match="/">
       <xsl:variable name="BOS-NYC-routes"
             select="$routes//row[Start-End eq 'BOS-NYC']"
             as="element(row)*"/>
       <xsl:variable name="rest-stops-on-BOS-NYC-routes" as="element(row)*">
           <xsl:for-each select="$BOS-NYC-routes">
               <xsl:variable name="route" select="." as="element(row)" />
               <xsl:sequence select="$rest-stops//row[Road eq $route/IDENT]" />
           </xsl:for-each>
       </xsl:variable>
       <xsl:variable name="rest-stops-on-BOS-NYC-routes-with-gas-station"
             select="$rest-stops-on-BOS-NYC-routes/self::row[Gas]"
             as="element(row)*"/>
       <xsl:variable name="gas-stations-on-BOS-NYC-routes" as="element(row)*">
           <xsl:for-each select="$rest-stops-on-BOS-NYC-routes-with-gas-station">
               <xsl:variable name="rest-stop" select="." as="element(row)" />
               <xsl:sequence select="$gas-stations//row[IDENT eq $rest-stop/Gas]" />
           </xsl:for-each>
       </xsl:variable>
       <Results>
           <Gas-stations-at-rest-stops-on-routes-from-BOS-to-NYC>
               <xsl:sequence select="$gas-stations-on-BOS-NYC-routes" />
           </Gas-stations-at-rest-stops-on-routes-from-BOS-to-NYC>
           <!-- Eek! Creating a new display of the data involves essentially a complete rewrite! -->
           <Gas-stations-on-each-BOS-NYC-route>
               <xsl:for-each select="$BOS-NYC-routes">
                   <xsl:variable name="route" select="." as="element(row)" />
                   <xsl:variable name="rest-stops-on-BOS-NYC-route"
                         select="$rest-stops//row[Road eq $route/IDENT]"
                         as="element(row)*" />
                   <xsl:variable name="rest-stops-on-BOS-NYC-route-with-gas-station"
                         select="$rest-stops-on-BOS-NYC-route/self::row[Gas]"
                         as="element(row)*"/>
                   <xsl:variable name="gas-stations-on-BOS-NYC-route" as="element(row)*">
                       <xsl:for-each select="$rest-stops-on-BOS-NYC-route-with-gas-station">
                           <xsl:variable name="rest-stop" select="." as="element(row)" />
                           <xsl:sequence select="$gas-stations//row[IDENT eq $rest-stop/Gas]" />
                       </xsl:for-each>
                   </xsl:variable>
                   <Gas-stations-on-one-route>
                       <Route>
                           <xsl:sequence select="$route" />
                       </Route>
                       <Gas-stations>
                           <xsl:sequence select="$gas-stations-on-BOS-NYC-route" />
                       </Gas-stations>
                   </Gas-stations-on-one-route>
               </xsl:for-each>
           </Gas-stations-on-each-BOS-NYC-route>
                       </Gas-stations>
                   </Gas-stations-on-one-route>
               </xsl:for-each>
           </Gas-stations-on-each-BOS-NYC-route>
       </Results>
   </xsl:template>

</xsl:stylesheet>
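
One possible restructuring, offered as a sketch rather than a definitive
answer: perform the two joins once, store the result as an intermediate
document of (route, rest stop, gas station) triples, and make each display a
small query over that document. It assumes XSLT 3.0 and the three files above;
the key names, the $joined variable, and the triple/route/rest-stop/gas-station
and Triples/Triple element names are all invented for this illustration:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="3.0">

   <xsl:output method="xml" indent="yes"/>

   <xsl:variable name="routes" select="doc('routes.xml')"/>
   <xsl:variable name="rest-stops" select="doc('rest-stops.xml')"/>
   <xsl:variable name="gas-stations" select="doc('gas-stations.xml')"/>

   <xsl:key name="rest-stops-by-road" match="Rest-Stops/row" use="Road"/>
   <xsl:key name="gas-station-by-ident" match="Gas-Stations/row" use="IDENT"/>

   <!-- Step 1: do both joins once. Each triple holds copies of one route
        row, one rest-stop row, and one gas-station row. -->
   <xsl:variable name="joined">
      <triples>
         <xsl:for-each select="$routes/Routes/row">
            <xsl:variable name="route" select="."/>
            <xsl:for-each
                  select="key('rest-stops-by-road', $route/IDENT, $rest-stops)">
               <xsl:variable name="rest-stop" select="."/>
               <xsl:for-each
                     select="key('gas-station-by-ident', $rest-stop/Gas, $gas-stations)">
                  <triple>
                     <route><xsl:sequence select="$route"/></route>
                     <rest-stop><xsl:sequence select="$rest-stop"/></rest-stop>
                     <gas-station><xsl:sequence select="."/></gas-station>
                  </triple>
               </xsl:for-each>
            </xsl:for-each>
         </xsl:for-each>
      </triples>
   </xsl:variable>

   <!-- Step 2: every display is a short query over the joined data. -->
   <xsl:template match="/">
      <xsl:variable name="BOS-NYC"
            select="$joined/triples/triple[route/row/Start-End eq 'BOS-NYC']"/>
      <Results>
         <!-- Task 1: flat list of gas stations. -->
         <Gas-stations-at-rest-stops-on-routes-from-BOS-to-NYC>
            <xsl:sequence select="$BOS-NYC/gas-station/row"/>
         </Gas-stations-at-rest-stops-on-routes-from-BOS-to-NYC>
         <!-- Task 2: the same triples, grouped by route. -->
         <Gas-stations-on-each-BOS-NYC-route>
            <xsl:for-each-group select="$BOS-NYC" group-by="route/row/IDENT">
               <Gas-stations-on-one-route>
                  <Route>
                     <xsl:sequence select="route/row"/>
                  </Route>
                  <Gas-stations>
                     <xsl:sequence select="current-group()/gas-station/row"/>
                  </Gas-stations>
               </Gas-stations-on-one-route>
            </xsl:for-each-group>
         </Gas-stations-on-each-BOS-NYC-route>
         <!-- Task 3: the triples themselves, reduced to identifiers. -->
         <Triples>
            <xsl:for-each select="$BOS-NYC">
               <Triple route="{route/row/IDENT}"
                       rest-stop="{rest-stop/row/Name}"
                       direction="{rest-stop/row/Direction}"
                       gas-station="{gas-station/row/IDENT}"/>
            </xsl:for-each>
         </Triples>
      </Results>
   </xsl:template>

</xsl:stylesheet>
```

One behavioural difference from the stylesheet above: a route with no gas
stations at all produces no triples, so it would not appear in the grouped
display. With the sample data both BOS-NYC routes have gas stations, so the
difference does not show; if empty routes must be listed, the grouped display
would need to iterate over the route rows instead.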
