On 19.01.2017 22:23, C. Edward Porter cep(_at_)usp(_dot_)org wrote:
Hello all,
We have an XSL transformation that runs using some form of MSXML for the actual
transformation. Given that MSXML only supports XSLT 1.0, I am trying to code
around the fact that it lacks regular expression functions by writing a
JavaScript extension function that matches Greek characters and wraps them in a
span, so that the smallcap treatment on the surrounding text does not wrongly
turn them into capital letters. The JavaScript function I wrote appears to work
fine if I run it locally, but when run as part of the XSL, it's not recursing
properly. I don't have much of a way to debug this, so I'm hoping someone can
either recognize why I'm getting inconsistent recursion, or suggest an
alternative approach. Code/sample text below:
JavaScript Function:
<msxsl:script language="javascript" implements-prefix="uspc"><![CDATA[
//Recursive function to wrap greek characters in nosmallcaps span
function wrapGreek(str){
var regex = /(.*)([α-ωΑ-Ω])(.*)/g;
var m = regex.exec(str);
var rStr = "";
if(m != null){
rStr = wrapGreek(m[1]);
} else {
rStr = str;
}
if(m != null) {
rStr += '<span class="nosmallcaps">' + m[2] + '</span>'
+ m[3];
}
return "" + rStr;
}
]]></msxsl:script>
Text template:
<xsl:template match="text()" priority="20">
<xsl:variable name="textout">
<xsl:value-of select="."/>
</xsl:variable>
<xsl:value-of disable-output-escaping="yes"
select="uspc:wrapGreek($textout)"/>
</xsl:template>
Sample Content:
This head composed correctly:
<title cid="pttFA"
id="GUID-04193AE6-05B6-44C1-AB2A-E8A900F93CE3">DESαCRIPαTION</title>
For this one, only the second alpha symbol was wrapped, so it did not recurse.
Copying this text to a JavaScript interpreter and running the function on it in
isolation does appear to recurse correctly.
<title cid="1IMidE" id="GUID-D279E612-1350-4709-B662-8FFEF76CD0C0">Infrαared Absoαrption, <i
cid="2KvZ79">Spectrophotometric Identification Tests, Appendix IIIC</i></title>
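One possible cause, offered as an assumption rather than something I have verified against the script engine behind MSXML: the regex carries the g flag, so exec starts searching at the regex object's lastIndex instead of at 0. Under ECMAScript 3 rules a regex literal may evaluate to the same object on every call, in which case the recursive call inherits the lastIndex left by the outer call. The sketch below makes that sharing explicit in a current engine; sharedRegex, wrapGreekShared and wrapGreekNoFlag are hypothetical names used only for illustration.

```javascript
// Hypothetical reproduction: one global-flagged RegExp object shared by every call.
var sharedRegex = /(.*)([α-ωΑ-Ω])(.*)/g;

function wrapGreekShared(str) {
  // With the g flag, exec starts at sharedRegex.lastIndex, not at index 0.
  var m = sharedRegex.exec(str);
  if (m === null) {
    return str;
  }
  return wrapGreekShared(m[1]) +
    '<span class="nosmallcaps">' + m[2] + '</span>' + m[3];
}

// Same recursion, but without the g flag exec always starts at index 0.
function wrapGreekNoFlag(str) {
  var m = /(.*)([α-ωΑ-Ω])(.*)/.exec(str);
  if (m === null) {
    return str;
  }
  return wrapGreekNoFlag(m[1]) +
    '<span class="nosmallcaps">' + m[2] + '</span>' + m[3];
}

console.log(wrapGreekShared("DESαCRIPαTION"));
console.log(wrapGreekNoFlag("DESαCRIPαTION"));
```

In the shared variant the outer match ends at the end of the string, so lastIndex is already past the end of the shorter m[1] when the recursive call runs; that exec returns null and only the last Greek character is wrapped, which is the symptom described for the second title. If this is what happens in your environment, dropping the g flag should be enough to make the recursive version behave.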
It seems you simply want to do a string replacement, so I wonder why you
need the exec method at all. Can't you simply use the replace method on
strings, e.g.
function wrapGreek(str) {
  return str.replace(/[α-ωΑ-Ω]+/g, function(m) {
    return '<span class="nosmallcaps">' + m + '<\/span>';
  });
}
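To try the replace-based function outside the stylesheet, a minimal sketch along these lines should do. The function is restated so the snippet runs on its own, and the samples array is just a hypothetical set of test strings taken from the titles in the question.

```javascript
// Replace-based approach, restated so this snippet is self-contained.
function wrapGreek(str) {
  // The + quantifier wraps a whole run of adjacent Greek letters in one span.
  return str.replace(/[α-ωΑ-Ω]+/g, function (m) {
    return '<span class="nosmallcaps">' + m + '</span>';
  });
}

// Hypothetical sample strings based on the titles in the question.
var samples = ["DESαCRIPαTION", "Infrαared Absoαrption, ", "no Greek here"];
for (var i = 0; i < samples.length; i++) {
  console.log(wrapGreek(samples[i]));
}
```

Because the g flag is handled inside replace, there is no recursion and no lastIndex state to keep track of. Note that the two character ranges only cover the basic Greek letters; accented forms such as ά fall outside them and would need the class to be extended.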
A full stylesheet is
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:mf="http://example.com/mf"
xmlns:ms="urn:schemas-microsoft-com:xslt"
exclude-result-prefixes="mf ms"
version="1.0">
<ms:script language="JScript" implements-prefix="mf">
<![CDATA[
function wrapGreek(str) {
  return str.replace(/[α-ωΑ-Ω]+/g, function(m) {
    return '<span class="nosmallcaps">' + m + '<\/span>';
  });
}
]]>
</ms:script>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()" priority="20">
<xsl:value-of disable-output-escaping="yes"
select="mf:wrapGreek(string())"/>
</xsl:template>
</xsl:stylesheet>
I have tested it in oXygen with MSXML 3 and with the two .NET processors
offered there, all of which support the ms:script element. The input
<?xml version="1.0" encoding="UTF-8"?>
<root>
<title cid="pttFA"
id="GUID-04193AE6-05B6-44C1-AB2A-E8A900F93CE3">DESαCRIPαTION</title>
<title cid="1IMidE"
id="GUID-D279E612-1350-4709-B662-8FFEF76CD0C0">Infrαared Absoαrption, <i
cid="2KvZ79">Spectrophotometric Identification Tests, Appendix
IIIC</i></title>
</root>
is transformed into
<root>
<title cid="pttFA"
id="GUID-04193AE6-05B6-44C1-AB2A-E8A900F93CE3">DES<span
class="nosmallcaps">α</span>CRIP<span
class="nosmallcaps">α</span>TION</title>
<title cid="1IMidE"
id="GUID-D279E612-1350-4709-B662-8FFEF76CD0C0">Infr<span
class="nosmallcaps">α</span>ared Abso<span
class="nosmallcaps">α</span>rption, <i cid="2KvZ79">Spectrophotometric
Identification Tests, Appendix IIIC</i></title>
</root>
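One caveat with disable-output-escaping="yes": whatever the function returns is written out raw, so an & or < in a text node would no longer be escaped in the result. A minimal sketch of one way around that, assuming you want the output to stay well-formed; escapeMarkup and wrapGreekSafe are hypothetical names and not part of the stylesheet above.

```javascript
// Hypothetical helper: escape markup-significant characters first, because
// disable-output-escaping="yes" writes the returned string without escaping.
function escapeMarkup(str) {
  return str
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Escape the text, then wrap runs of Greek letters as before.
function wrapGreekSafe(str) {
  return escapeMarkup(str).replace(/[α-ωΑ-Ω]+/g, function (m) {
    return '<span class="nosmallcaps">' + m + '</span>';
  });
}

console.log(wrapGreekSafe("A & B < αβ"));
```

The escaping has to happen before the spans are added, otherwise the span tags themselves would be escaped as well.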
--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
--~--