xsl-list

[xsl] replace(), translate() and Unicode supplementary characters

2011-06-02 23:00:53

I've got an XSLT 2.0 stylesheet that defines the following my:print_ipa()
function, which uses replace() and translate() with Unicode supplementary
characters:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="2.0" 
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:my="myfunctions"
        exclude-result-prefixes="my">

        <xsl:output method="text"/>

        <xsl:function name="my:print_ipa">
                <xsl:param name="txt"/>
                <xsl:text>{\ipa </xsl:text>
                <xsl:value-of select="translate(
                        replace(replace(replace(replace($txt,'𐐴','aʲ'),'𐐵','aʷ'),'𐑎','ɔʲ'),'𐑏','ʲu'),
                        '𐐨𐐩𐐪𐐫',
                        'ieɑɔ')"/>
                <xsl:text>}</xsl:text>
        </xsl:function>

...
</xsl:stylesheet>

I can't seem to get the translate() and replace() calls to work (I'm using
saxonhe9-3).

The function is intended to convert UTF-8 strings in the Deseret Alphabet
to equivalent strings in the International Phonetic Alphabet (IPA).

I realize that the characters may not display correctly in your mail
application.

The first argument to translate() is

replace(replace(replace(replace($txt,'𐐴','aʲ'),'𐐵','aʷ'),'𐑎','ɔʲ'),'𐑏','ʲu')

where the second argument of each replace() call is a string containing one
Unicode supplementary character, and the replacement (the third argument) is
a string of Unicode IPA characters, all of which are in the BMP.  The
supplementary characters in those second arguments are

U+00010434
U+00010435
U+0001044E
U+0001044F
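
One way to check what the processor actually receives for these four
characters is the XPath 2.0 function string-to-codepoints().  The following is
only a minimal diagnostic sketch, not part of the stylesheet above: the
template name "main" is hypothetical, and it assumes the file is saved and
read as UTF-8.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

        <xsl:output method="text"/>

        <!-- Hypothetical named template, used only as an entry point. -->
        <xsl:template name="main">
                <!-- Prints one integer per character the processor sees.
                     If the four literals are decoded as intended, the
                     result is 66612 66613 66638 66639
                     (U+10434, U+10435, U+1044E, U+1044F). -->
                <xsl:value-of
                        select="string-to-codepoints('𐐴𐐵𐑎𐑏')"
                        separator=" "/>
                <xsl:text>&#10;</xsl:text>
        </xsl:template>

</xsl:stylesheet>
```

If more than four numbers come out, the literals were probably altered
somewhere between the editor and the processor.  With Saxon, the named
template can presumably be invoked from the command line with -it:main.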

Similarly in the translate() call, the second argument consists of the string

'𐐨𐐩𐐪𐐫'

i.e.  U+00010428  U+00010429  U+0001042A  U+0001042B

and the third argument is

'ieɑɔ'

i.e.  U+0069  U+0065  U+0251  U+0254

The intent is for U+00010428 to get replaced by U+0069, etc.
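
To take mail and editor encoding out of the picture, the same mapping can be
written with numeric character references instead of literal characters.  This
is a sketch under that assumption only.  The function name my:print_ipa_ncr is
hypothetical, and the IPA modifier letters are taken to be U+02B2 (ʲ) and
U+02B7 (ʷ).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:my="myfunctions"
        exclude-result-prefixes="my">

        <xsl:output method="text"/>

        <!-- Hypothetical variant of my:print_ipa: same logic, but every
             non-ASCII character is spelled as a character reference. -->
        <xsl:function name="my:print_ipa_ncr">
                <xsl:param name="txt"/>
                <xsl:text>{\ipa </xsl:text>
                <!-- Diphthongs first (one Deseret letter to two IPA
                     characters), then the one-to-one vowel mapping. -->
                <xsl:value-of select="translate(
                        replace(replace(replace(replace($txt,
                                '&#x10434;', 'a&#x2B2;'),
                                '&#x10435;', 'a&#x2B7;'),
                                '&#x1044E;', '&#x254;&#x2B2;'),
                                '&#x1044F;', '&#x2B2;u'),
                        '&#x10428;&#x10429;&#x1042A;&#x1042B;',
                        'ie&#x251;&#x254;')"/>
                <xsl:text>}</xsl:text>
        </xsl:function>

</xsl:stylesheet>
```

The XML parser expands the references before the XPath expression is
evaluated, so the function should see the same strings as the literal
version, provided the literal version is decoded correctly.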

Questions:  Are translate() and replace() supposed to work with Unicode
supplementary characters?  If so, what am I doing wrong?

Thanks,

Ken


******************************
Kenneth R. Beesley, D.Phil.
P.O. Box 540475
North Salt Lake, UT
84054  USA





