Thanks Wolfgang.
I have raised an issue: https://saxonica.plan.io/issues/2622
Lancelot
From: Wolfgang Laun wolfgang(_dot_)laun(_at_)gmail(_dot_)com
[mailto:xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com]
Sent: Friday, 12 February 2016 16:42
To: xsl-list
Subject: Re: [xsl] Combining use-character-maps and normalization-form="NFC"
attributes produce unwanted output
Even the solitary identity transformation of the semicolon 0x3B
<xsl:output-character character=";" string=";"/>
results in all semicolons being translated to U+037E. This seems to be a bug.
Saxon-HE 9.6.0.1
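A minimal sketch of a complete stylesheet that would exercise this case. It assumes the same normalization-form="NFC" setting as in the original post; the map name semicolon_only and the wrapper element out are hypothetical names chosen for illustration, not taken from the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: NFC normalization is combined with the character map,
       as in the original post. -->
  <xsl:output method="xml" encoding="UTF-8"
      use-character-maps="semicolon_only"
      normalization-form="NFC"/>

  <!-- Hypothetical map name; maps the semicolon (U+003B) to itself. -->
  <xsl:character-map name="semicolon_only">
    <xsl:output-character character=";" string=";"/>
  </xsl:character-map>

  <!-- Copies the string value of the input document into a wrapper element. -->
  <xsl:template match="/">
    <out><xsl:value-of select="."/></out>
  </xsl:template>

</xsl:stylesheet>
```

Running this against any input that contains semicolons and inspecting the code points of the result should show whether U+003B survives serialization.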
On 12 February 2016 at 15:29,
lancelot(_dot_)meurillon(_at_)oecd(_dot_)org
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com>
wrote:
XSL processor : Saxon-EE 9.5.1.8J from Saxonica
XSL version : 2.0
Dear all,
For various reasons, I need to escape specific characters in the output, and I
also need to produce Unicode normalized to NFC.
Here is my input:
<inputText>”; ;</inputText> => which is \u201D + \u003B + \u0020 + \u003B
Here are the output properties of my stylesheet:
<xsl:output method="xml" version="1.0" encoding="UTF-8"
indent="yes" omit-xml-declaration="no"
use-character-maps="unsupported_characters"
normalization-form="NFC"
/>
The character-map definition:
<xsl:character-map name="unsupported_characters">
<xsl:output-character character="“" string="&quot;"/>
<xsl:output-character character="”" string="&quot;"/>
</xsl:character-map>
With this template:
<xsl:template match="/">
<shortDescription><xsl:value-of select="inputText"/></shortDescription>
</xsl:template>
Now the output:
<shortDescription>"; ;</shortDescription> => which is \u0022 + \u037E + \u0020
+ \u003B
Why is the semicolon (\u003B) just after the escaped quote translated into a
Greek question mark (\u037E), while the next semicolon is kept?
But the real question is: why is my semicolon converted into a Greek question
mark at all?
Just to go further :
1- If I do not use the character map, the result is:
<shortDescription>”; ;</shortDescription> => which is \u201D + \u003B + \u0020
+ \u003B
2- If I do not normalize the Unicode (that is, without the
normalization-form="NFC" attribute), the result is:
<shortDescription>"; ;</shortDescription> => which is \u0022 + \u003B + \u0020
+ \u003B
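Since case 2 keeps the semicolons intact, one possible workaround is to drop the normalization-form attribute and normalize inside the stylesheet with the standard XPath 2.0 function normalize-unicode(). The following is only a sketch under that assumption, not a confirmed fix. It normalizes just the one value it is applied to, and it does so before the character map runs rather than at serialization time. The curly quotes are written as character references purely for readability.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- normalization-form is deliberately omitted (assumed workaround). -->
  <xsl:output method="xml" version="1.0" encoding="UTF-8"
      indent="yes" omit-xml-declaration="no"
      use-character-maps="unsupported_characters"/>

  <!-- Same map as in the post: U+201C and U+201D become a straight quote. -->
  <xsl:character-map name="unsupported_characters">
    <xsl:output-character character="&#x201C;" string="&quot;"/>
    <xsl:output-character character="&#x201D;" string="&quot;"/>
  </xsl:character-map>

  <xsl:template match="/">
    <shortDescription>
      <!-- Normalize to NFC in the stylesheet instead of in the serializer. -->
      <xsl:value-of select="normalize-unicode(inputText, 'NFC')"/>
    </shortDescription>
  </xsl:template>

</xsl:stylesheet>
```

Text that reaches the output by other routes would need the same normalize-unicode() call, so this only fits stylesheets where the affected values are easy to enumerate.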
Thanks for the help
Lancelot
XSL-List info and archive<http://www.mulberrytech.com/xsl/xsl-list>