xsl-list

Re: [xsl] replacing diacritical marks with combining unicode characters

2008-03-04 14:43:12
I have decided to try a character map. With the stylesheet below, however, <br /> is converted to <br>:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">


    <xsl:output use-character-maps="cm1"/>

    <xsl:character-map name="cm1">
        <xsl:output-character character="&#728;" string="&amp;#774;"/> <!-- breve -->
        <xsl:output-character character="&#175;" string="&amp;#772;"/> <!-- macron -->
    </xsl:character-map>

    <xsl:template match="br"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

I don't actually need the <br /> elements, which is why there is a template rule that removes them. But is the switch from <br /> to <br> caused by the xsl:output line?

Terry Ofner
1541 Northbrook Drive
Indianapolis, IN 46260
Voice: 317-870-1992
Fax: 317-870-7101

tofner(_at_)comcast(_dot_)net




On Mar 4, 2008, at 2:09 PM, Michael Kay wrote:


The function fn:normalize-unicode() will do what you want,
with a second argument of "NFC".

I'm not sure it will, because the input is using non-combining diacritical
marks. I think the answer is translate():

translate($in, '&#728;...', '&#774;...')
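
A minimal sketch of the translate() approach, assuming it is applied to every text node through an identity transform. Only the two characters named in this thread (breve and macron) are mapped. The stylesheet layout and the priority attribute are illustrative choices, not something taken from the original messages.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

    <!-- Identity template: copy every node and attribute unchanged by default. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- In text nodes, swap each spacing diacritic for its combining counterpart:
         U+02D8 (728) becomes U+0306 (774), U+00AF (175) becomes U+0304 (772).
         priority="1" is an assumption of this sketch: without it, text() and
         node() share the same default priority and the match is ambiguous. -->
    <xsl:template match="text()" priority="1">
        <xsl:value-of select="translate(., '&#728;&#175;', '&#774;&#772;')"/>
    </xsl:template>

</xsl:stylesheet>
```

Note that translate() only swaps code points in place. A combining mark attaches to the character before it, so if the spacing mark precedes its base letter in the input, the text would also need to be reordered.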

Michael Kay
http://www.saxonica.com/


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


