Re: [xsl] Unexpected MSXML Javascript extension results

2017-01-20 04:11:49
On 19.01.2017 22:23, C. Edward Porter cep(_at_)usp(_dot_)org wrote:
Hello all,

We have an XSL transformation that runs on some version of MSXML. Since MSXML
supports only XSLT 1.0, which has no regular expression functions, I am trying
to work around that by writing a JavaScript extension function that matches
Greek characters and wraps them in a span, so that the small-caps treatment
applied to the surrounding text does not wrongly turn them into capital
letters. The function appears to work fine when I run it locally, but when it
runs as part of the XSL it does not recurse properly. I don't have much of a
way to debug this, so I'm hoping someone can either recognize why I'm getting
inconsistent recursion or suggest an alternative approach. Code and sample
text below:



JavaScript Function:

<msxsl:script language="javascript" implements-prefix="uspc"><![CDATA[
//Recursive function to wrap greek characters in nosmallcaps span
        function wrapGreek(str){
                var regex = /(.*)([α-ωΑ-Ω])(.*)/g;
                var m = regex.exec(str);
                var rStr = "";
                if(m != null){
                        rStr = wrapGreek(m[1]);
                } else {
                        rStr = str;
                }
                if(m != null) {
                        rStr += '<span class="nosmallcaps">' + m[2] + '</span>' + m[3];
                }
                return "" + rStr;
        }
]]></msxsl:script>



Text template:

<xsl:template match="text()" priority="20">
        <xsl:variable name="textout">
                <xsl:value-of select="."/>
        </xsl:variable>
        <xsl:value-of disable-output-escaping="yes" select="uspc:wrapGreek($textout)"/>
</xsl:template>



Sample Content:

This head composed correctly:

<title cid="pttFA" 
id="GUID-04193AE6-05B6-44C1-AB2A-E8A900F93CE3">DESαCRIPαTION</title>



For this one, only the second alpha symbol was wrapped, so it did not recurse. 
Copying this text to a JavaScript interpreter and running the function on it in 
isolation does appear to recurse correctly.

<title cid="1IMidE" id="GUID-D279E612-1350-4709-B662-8FFEF76CD0C0">Infrαared Absoαrption, <i 
cid="2KvZ79">Spectrophotometric Identification Tests, Appendix IIIC</i></title>

It seems you simply want to do a string replacement, so I wonder why you need the exec method rather than the replace method of strings, e.g.


function wrapGreek(str) {
    return str.replace(/[α-ωΑ-Ω]+/g, function(m) { return '<span class="nosmallcaps">' + m + '<\/span>'; });
}
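
As for why the recursive version might misbehave under MSXML, one possible explanation (an assumption, not something verified against the JScript engine) is the g flag on the regular expression. With that flag, exec starts searching at the regex object's lastIndex and updates it after every match. In ECMAScript 3 a regex literal denotes a single object created when the script is scanned, so every recursive call would share that state. Current engines create a fresh object each time the literal is evaluated, which could explain why the function looks correct when run locally. The sketch below reproduces the shared state by hoisting the regex into a variable; sharedRegex, wrapGreekShared and wrapGreekNoFlag are illustrative names, not part of the original post:

```javascript
// Hypothetical reproduction: one global-flag regex shared by all calls,
// as an ECMAScript 3 style engine would do for a regex literal.
var sharedRegex = /(.*)([α-ωΑ-Ω])(.*)/g;

function wrapGreekShared(str) {
  // exec starts at sharedRegex.lastIndex, which the previous call left
  // at the end of its (longer) input string.
  var m = sharedRegex.exec(str);
  if (m === null) {
    return str;
  }
  return wrapGreekShared(m[1]) +
    '<span class="nosmallcaps">' + m[2] + '</span>' + m[3];
}

// Same logic without the g flag: exec then always starts at index 0,
// so the recursion no longer depends on leftover state.
function wrapGreekNoFlag(str) {
  var m = /(.*)([α-ωΑ-Ω])(.*)/.exec(str);
  if (m === null) {
    return str;
  }
  return wrapGreekNoFlag(m[1]) +
    '<span class="nosmallcaps">' + m[2] + '</span>' + m[3];
}

var sharedResult = wrapGreekShared("DESαCRIPαTION");
var noFlagResult = wrapGreekNoFlag("DESαCRIPαTION");
```

With the shared regex only the last alpha ends up wrapped, which matches the symptom described in the question. Dropping the g flag wraps both, although the replace-based function above is still the simpler approach.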

A full stylesheet is

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:mf="http://example.com/mf"
        xmlns:ms="urn:schemas-microsoft-com:xslt"
        exclude-result-prefixes="mf ms"
        version="1.0">
        
        <ms:script language="JScript" implements-prefix="mf">
                <![CDATA[
                function wrapGreek(str) {
                        return str.replace(/[α-ωΑ-Ω]+/g, function(m) { return '<span class="nosmallcaps">' + m + '<\/span>'; });
                }
                ]]>
        </ms:script>
        
        <xsl:template match="@* | node()">
                <xsl:copy>
                        <xsl:apply-templates select="@* | node()"/>
                </xsl:copy>
        </xsl:template>

        <xsl:template match="text()" priority="20">
                <xsl:value-of disable-output-escaping="yes" select="mf:wrapGreek(string())"/>
        </xsl:template>
        
</xsl:stylesheet>

I have tested it in oXygen with MSXML 3 and with the two .NET implementations offered there that support the ms:script element. The input

<?xml version="1.0" encoding="UTF-8"?>
<root>
<title cid="pttFA" id="GUID-04193AE6-05B6-44C1-AB2A-E8A900F93CE3">DESαCRIPαTION</title> <title cid="1IMidE" id="GUID-D279E612-1350-4709-B662-8FFEF76CD0C0">Infrαared Absoαrption, <i cid="2KvZ79">Spectrophotometric Identification Tests, Appendix IIIC</i></title>
</root>

is transformed into

<root>
<title cid="pttFA" id="GUID-04193AE6-05B6-44C1-AB2A-E8A900F93CE3">DES<span class="nosmallcaps">α</span>CRIP<span class="nosmallcaps">α</span>TION</title> <title cid="1IMidE" id="GUID-D279E612-1350-4709-B662-8FFEF76CD0C0">Infr<span class="nosmallcaps">α</span>ared Abso<span class="nosmallcaps">α</span>rption, <i cid="2KvZ79">Spectrophotometric Identification Tests, Appendix IIIC</i></title>
</root>
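
One caveat with disable-output-escaping="yes": the whole string returned by the function is written without escaping, so a text node containing & or < would be emitted raw and could make the result ill-formed. Below is a minimal sketch of one way to guard against that, assuming it is acceptable to escape the text inside the extension function before the spans are added. escapeXml and wrapGreekEscaped are hypothetical helper names, not from the stylesheet above:

```javascript
// Hypothetical helper: escape the characters that disable-output-escaping
// would otherwise let through unescaped.
function escapeXml(str) {
  return str.replace(/&/g, "&amp;")
            .replace(/</g, "&lt;")
            .replace(/>/g, "&gt;");
}

// Escape first, then wrap runs of Greek letters; the span markup added
// afterwards is the only unescaped markup in the returned string.
function wrapGreekEscaped(str) {
  return escapeXml(str).replace(/[α-ωΑ-Ω]+/g, function (m) {
    return '<span class="nosmallcaps">' + m + '</span>';
  });
}

var escapedSample = wrapGreekEscaped("α < β & γδ");
```

Placed inside the CDATA section of the ms:script element, the entity strings in escapeXml are ordinary JavaScript string literals and need no further escaping.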
