What's wrong with
tokenize(.) => random-number-generator()?permute() => string-join(" ")
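
For anyone wanting to try that expression, here is a minimal sketch of it as a complete stylesheet. It assumes an XSLT 3.0 processor with XPath 3.1 support. Two details are assumptions added here rather than part of the expression above: the lookup of permute is wrapped in parentheses, since the XPath 3.1 arrow operator expects a name, a variable reference, or a parenthesized expression on its right-hand side, and generate-id(.) is used as the seed so that each text node gets its own generator.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- XSLT 3.0 shorthand for the identity transformation -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- shuffle the whitespace-separated tokens of each text node -->
  <xsl:template match="text()[not(ancestor::pre)]">
    <xsl:value-of select="tokenize(.)
        => (random-number-generator(generate-id(.))?permute)()
        => string-join(' ')"/>
  </xsl:template>

</xsl:stylesheet>
```

Note that tokenize(.) splits on whitespace only, so punctuation travels with the word it is attached to ("Hey," stays one token), and runs of whitespace collapse to single spaces in the output.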
Michael Kay
Saxonica
On 7 Sep 2021, at 20:20, Chris Papademetrious
christopher(_dot_)papademetrious(_at_)synopsys(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Hi everyone,
I recently needed to write a transformation to shuffle words in text content,
but still keep the overall element structure intact. For example, I might
want to transform
<p>Hey, here is some text!</p>
into
<p>is, text Hey some here!</p>
I didn't see anything exactly like this in the list archives or on
Stack Overflow, so I thought I'd share what I came up with:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="2.0">
<xsl:output indent="yes"/>
<!-- regex that defines what a "word" is -->
<xsl:param name="word-pattern" select="'(\w+)'"/>
<!-- identity transformation -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- shuffle words in each text() node -->
<xsl:template match="text()[not(ancestor::pre)]">
<!-- get the list of words in this block of text -->
<xsl:variable name="words" as="node()*">
<xsl:analyze-string select="." regex="{$word-pattern}">
<xsl:matching-substring>
<word><xsl:value-of select="."/></word>
</xsl:matching-substring>
</xsl:analyze-string>
</xsl:variable>
<!-- perturb the word order -->
<xsl:variable name="shuffled-words" as="xs:string*">
<xsl:call-template name="pick-random-item">
<xsl:with-param name="items" select="$words"/>
</xsl:call-template>
</xsl:variable>
<!-- rebuild the string with the reordered words -->
<xsl:analyze-string select="." regex="{$word-pattern}">
<xsl:matching-substring>
<xsl:variable name="this-position" select="position()"/>
<xsl:value-of select="$shuffled-words[floor(($this-position + 1) div 2)]"/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
<!-- XSLT item shuffler, borrowed from
https://stackoverflow.com/questions/21953336/randomize-node-order-xslt
-->
<xsl:param name="initial-seed" select="123"/>
<xsl:template name="pick-random-item">
<xsl:param name="items" />
<xsl:param name="seed" select="$initial-seed"/>
<xsl:if test="$items">
<!-- generate a random number using the "linear congruential generator"
algorithm -->
<xsl:variable name="a" select="1664525"/>
<xsl:variable name="c" select="1013904223"/>
<xsl:variable name="m" select="4294967296"/>
<xsl:variable name="random" select="($a * $seed + $c) mod $m"/>
<!-- scale random to integer 1..n -->
<xsl:variable name="i" select="floor($random div $m * count($items)) + 1"/>
<!-- write out the corresponding item -->
<xsl:copy-of select="$items[$i]"/>
<!-- recursive call with the remaining items -->
<xsl:call-template name="pick-random-item">
<xsl:with-param name="items" select="$items[position()!=$i]"/>
<xsl:with-param name="seed" select="$random"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
Link to XSLT Fiddle here: https://xsltfiddle.liberty-development.net/nbiE1aZ/1
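
To make the random-number step in pick-random-item concrete, here is a small standalone sketch that reproduces only the first draw, using the same constants and the default seed of 123. The item count of 5 is an assumption taken from the five words in the example sentence; nothing else is new.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- same linear congruential generator constants as the stylesheet above -->
    <xsl:variable name="seed" select="123"/>
    <xsl:variable name="a" select="1664525"/>
    <xsl:variable name="c" select="1013904223"/>
    <xsl:variable name="m" select="4294967296"/>
    <xsl:variable name="random" select="($a * $seed + $c) mod $m"/>
    <!-- assumed item count: the five words of "Hey, here is some text!" -->
    <xsl:variable name="n" select="5"/>
    <!-- scale random to integer 1..n -->
    <xsl:variable name="i" select="floor($random div $m * $n) + 1"/>
    <!-- prints: random=1218640798 index=2 -->
    <xsl:value-of select="concat('random=', $random, ' index=', $i)"/>
  </xsl:template>

</xsl:stylesheet>
```

With the word list (Hey, here, is, some, text), index 2 means the first word written out is "here". The value 1218640798 then becomes the seed for the recursive call on the remaining four words.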
The approach is:
1. Call <xsl:analyze-string> to extract the words from a text() node.
2. Call a template that shuffles the words.
3. Call <xsl:analyze-string> (again) to substitute the shuffled words in
place of the original words.
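
The same three steps can also be sketched in XSLT 3.0, where the analyze-string() function and random-number-generator() are built in. This is an alternative sketch, not the stylesheet above: it assumes an XSLT 3.0 processor with XPath 3.1 support, and seeding the generator with generate-id(.) is an assumption made here so that each text node gets its own word order.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fn="http://www.w3.org/2005/xpath-functions"
    exclude-result-prefixes="#all">

  <!-- regex that defines what a "word" is -->
  <xsl:param name="word-pattern" select="'\w+'"/>

  <!-- XSLT 3.0 shorthand for the identity transformation -->
  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="text()[not(ancestor::pre)]">
    <!-- step 1: split the text; fn:match elements hold the words,
         fn:non-match elements hold everything in between -->
    <xsl:variable name="parts" select="analyze-string(., $word-pattern)/*"/>

    <!-- step 2: shuffle the word strings -->
    <xsl:variable name="shuffled"
        select="random-number-generator(generate-id(.))
                ?permute($parts/self::fn:match/string())"/>

    <!-- step 3: walk the parts in order, substituting the nth shuffled
         word for the nth original word and copying the rest unchanged -->
    <xsl:value-of separator="">
      <xsl:for-each select="$parts">
        <xsl:choose>
          <xsl:when test="self::fn:match">
            <xsl:variable name="n"
                select="count(preceding-sibling::fn:match) + 1"/>
            <xsl:sequence select="$shuffled[$n]"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:sequence select="string(.)"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
    </xsl:value-of>
  </xsl:template>

</xsl:stylesheet>
```

Because only the fn:match parts are replaced, punctuation and whitespace stay exactly where they were, as in the <p>Hey, here is some text!</p> example.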
Hopefully this is helpful if someone needs to solve a similar problem in the
future!
-----
Chris Papademetrious
Tech Writer, Synopsys, Inc.