xsl-list

Re: [xsl] Techniques for Sorting and Reducing Maps in XSLT 3/XPath 3?

2018-07-06 03:46:46
For quite different reasons, I needed to work with a directory structure.
One option is to dump the listing to a file first:
ls dir > x.txt
Then read that listing in XSLT with unparsed-text() and process it using
XSLT tools. That is a bit easier. Or are you limited to XSLT only?
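
A minimal sketch of the listing approach, assuming x.txt holds one path per
line and sits beside the stylesheet. The parameter name listing-uri and the
variable name paths are made up for illustration:

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all">

  <xsl:output method="text"/>

  <!-- Assumption: a relative URI here is resolved against the
       stylesheet's static base URI. -->
  <xsl:param name="listing-uri" as="xs:string" select="'x.txt'"/>

  <xsl:template name="xsl:initial-template">
    <!-- unparsed-text-lines() returns one string per line;
         the predicate drops blank lines. -->
    <xsl:variable name="paths" as="xs:string*"
      select="unparsed-text-lines($listing-uri)[normalize-space()]"/>
    <xsl:value-of select="$paths" separator="&#10;"/>
  </xsl:template>
</xsl:stylesheet>
```

Run with the named initial template (for example Saxon's -it option). The
resulting sequence of strings can then be tokenized, grouped, and sorted
like any other XPath data.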

HTH


On 5 July 2018 at 21:50, Eliot Kimber ekimber(_at_)contrext(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
I need to process a set of documents organized into directories where for a 
given parent directory there may be any number of subdirectories representing 
multiple versions of the same logical artifact, where the directory name 
reflects the versions, e.g.:

/A/B/C/en/1.0/foo.xml
/A/B/C/en/1.2/foo.xml
/A/B/C/fr/1.0/foo.xml
/A/B/C/fr/1.2/foo.xml
/A/B/C/fr/1.3/foo.xml
/A/B/D/en/1.0/foo.xml
/A/B/D/en/1.2/foo.xml
/A/B/D/en/1.3/foo.xml
/A/B/D/en/1.4/foo.xml
/A/B/D/fr/1.0/foo.xml
/A/B/D/fr/1.2/foo.xml
/A/B/D/fr/1.3/foo.xml

I need to process only those foo.xml files that are the latest version under 
a given common ancestor (i.e., the latest version for each language, where 
the /A/B/C path represents a single course in this case).

I'm doing this entirely within XSLT 3 (rather than using e.g., a bash shell 
to determine the set of files to process), mostly because I'm tasked with 
inserting an XSLT transform into an existing system where adding anything 
other than an XSLT is problematic.

But I think this also serves as a useful exercise in general XSLT/XPath map 
manipulation, at least as I've initially gone about trying to solve this 
problem.

Given the list of URLs for all of these foo.xml files I want to reduce it to 
just /A/B/C/en/1.2/foo.xml, /A/B/C/fr/1.3/foo.xml, /A/B/D/en/1.4/foo.xml, and 
/A/B/D/fr/1.3/foo.xml

That is, for each locale in each course, get the latest version.

In addition, I want to group the files by the 3rd directory ("C" or "D"), 
which serves as a "course ID".

Maps seem like an obvious way to do this:

1. Use Saxon's collection() function with the metadata=yes option to get a 
set of maps, one for each file, that includes the full path to the file (this 
avoids loading a bunch of files I don't actually want and gives me maps as a 
starting point).

2. Using these maps, add the version, locale, and 3rd-level directory name as 
separate entries in each map, creating a more complete set of "descriptor" 
maps that make it easy to access the relevant fields I care about.

3. Create a new map where the keys are the 3rd directory name ("course ID") 
and the values are the descriptor maps for a given course ID/locale pair with 
the highest version.

My question: How best to implement step 3?

Step 2 is simple data processing: pull apart each URL and create the maps.
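
As a rough sketch of that step, assuming every path ends in
.../{course-id}/{locale}/{version}/foo.xml: the function name
local:descriptor, its namespace URI, and the key 'path' are hypothetical, and
this version builds a fresh map rather than extending the maps returned by
collection():

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="urn:example:local"
    exclude-result-prefixes="#all">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: splits a path on "/" and picks the fields
       by counting back from the file name, so the length of the
       leading part of the path does not matter. -->
  <xsl:function name="local:descriptor" as="map(xs:string, xs:string)">
    <xsl:param name="path" as="xs:string"/>
    <xsl:variable name="segments" as="xs:string*"
      select="tokenize($path, '/')"/>
    <xsl:sequence select="map {
        'path':      $path,
        'course-id': $segments[last() - 3],
        'locale':    $segments[last() - 2],
        'version':   $segments[last() - 1]
      }"/>
  </xsl:function>

  <xsl:template name="xsl:initial-template">
    <xsl:variable name="d"
      select="local:descriptor('/A/B/C/fr/1.3/foo.xml')"/>
    <!-- Emits the three extracted fields separated by spaces. -->
    <xsl:value-of select="$d?course-id, $d?locale, $d?version"/>
  </xsl:template>
</xsl:stylesheet>
```

For the sample path this should print "C fr 1.3". A path with fewer than four
segments would fail the declared return type, which may or may not be the
behavior wanted.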

Step 3 is less obvious because you have to compare entries based on both the 
course ID and version values.

My initial solution for step 3 is to use xsl:iterate to construct a result 
map:

    <!-- Assumes the map: prefix is bound to
         http://www.w3.org/2005/xpath-functions/map -->
    <xsl:variable name="courses-by-id" as="map(xs:string, map(*)*)">
      <xsl:iterate select="$configs-to-use">
        <!-- Accumulator: course ID to the best descriptor seen so far -->
        <xsl:param name="result-map" as="map(xs:string, map(*))"
          select="map{}"/>
        <xsl:on-completion>
          <xsl:sequence select="$result-map"/>
        </xsl:on-completion>

        <xsl:variable name="this-version" as="xs:double"
          select="xs:double(.?version)"/>
        <xsl:variable name="previous-course-entry" as="map(*)?"
          select="map:get($result-map, .?course-id)"
        />
        <!-- Version already stored for this course, or 0.0 if none -->
        <xsl:variable name="test-version" as="xs:double"
          select="
            if (exists($previous-course-entry))
            then xs:double($previous-course-entry?version)
            else 0.0
          "
        />
        <!-- Keyed on course ID alone: at most one descriptor is kept
             per course. -->
        <xsl:next-iteration>
          <xsl:with-param name="result-map" as="map(xs:string, map(*))"
            select="
                    if ($this-version gt $test-version)
                    then map:put($result-map, .?course-id, .)
                    else $result-map"
          />
        </xsl:next-iteration>
      </xsl:iterate>
    </xsl:variable>

This works (or at least appears to in my initial small tests) but it feels 
like there ought to be a less verbose way to do this same kind of operation.

What is the better way to do this kind of "find the map entries that meet a 
specific requirement relative to other members of the map" processing?
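
One possible shape for this kind of processing, offered only as a sketch, is
nested xsl:for-each-group inside xsl:map: group by course ID, then by locale,
and keep the descriptor with the highest version in each inner group. The
sample data and the key names 'path', 'course-id', 'locale', and 'version'
are assumptions standing in for the descriptor maps from step 2:

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all">

  <xsl:output method="text"/>

  <!-- Sample descriptor maps (illustrative data only). -->
  <xsl:variable name="configs-to-use" as="map(*)*" select="
      for $p in ('/A/B/C/en/1.0/foo.xml', '/A/B/C/en/1.2/foo.xml',
                 '/A/B/C/fr/1.2/foo.xml', '/A/B/C/fr/1.3/foo.xml',
                 '/A/B/D/en/1.4/foo.xml', '/A/B/D/fr/1.3/foo.xml')
      return
        let $s := tokenize($p, '/')
        return map {
          'path':      $p,
          'course-id': $s[last() - 3],
          'locale':    $s[last() - 2],
          'version':   $s[last() - 1]
        }"/>

  <!-- Outer group: one map entry per course ID.
       Inner group: one descriptor per locale, the one whose version
       (compared as xs:double) is the highest in that group. -->
  <xsl:variable name="courses-by-id" as="map(xs:string, map(*)*)">
    <xsl:map>
      <xsl:for-each-group select="$configs-to-use"
          group-by=".?course-id">
        <xsl:map-entry key="current-grouping-key()">
          <xsl:for-each-group select="current-group()"
              group-by=".?locale">
            <xsl:sequence select="
                let $max := max(current-group() ! xs:double(?version))
                return current-group()[xs:double(?version) eq $max][1]"/>
          </xsl:for-each-group>
        </xsl:map-entry>
      </xsl:for-each-group>
    </xsl:map>
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <!-- Lists the path of every descriptor that survived. -->
    <xsl:value-of select="$courses-by-id?*?path" separator="&#10;"/>
  </xsl:template>
</xsl:stylesheet>
```

With the sample data this should leave four descriptors, one per
course/locale pair. Comparing versions as xs:double follows the code above,
so a version such as 1.10 would rank below 1.9; a multi-part version scheme
would need a different comparison.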

Thanks,

Eliot
--
Eliot Kimber
http://contrext.com





-- 
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.