[xsl] Character encoding/representation from ISO-8859-1 to UTF-8

2016-10-11 13:59:30
Hi all,

I'm struggling with a character encoding issue (or perhaps a character
representation issue). I have input XML that looks like this:

input.xml
<?xml version="1.0" encoding="iso-8859-1"?>
<documents>
<document>The reality of the effect of natural ventilation in a residential
attic cavity has been the topic of many debates and scholarly reports since
the 1930’s.</document>
</documents>

and I would like to get it to a point where the characters are represented
properly, i.e.

output.xml
<?xml version="1.0" encoding="UTF-8"?>
<documents>
<document>The reality of the effect of natural ventilation in a residential
attic cavity has been the topic of many debates and scholarly reports since
the 1930’s.</document>
</documents>

Thanks to Liam's help on IRC and reading through the list archives, it
seems like an identity transform should be the right step towards getting
the representation corrected, but something isn't working (or I have a
misunderstanding somewhere).

If I apply the following identity transform with Saxon HE 9.6.0.7 in oXygen
18:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:output encoding="UTF-8" indent="yes"/>
<xsl:template match="/"><xsl:copy-of select="/"/></xsl:template>
</xsl:stylesheet>

I get the following result:
<?xml version="1.0" encoding="UTF-8"?>
<documents>
 <document>The reality of the effect of natural ventilation in a
residential attic cavity has been the topic of many debates and scholarly
reports since the 1930â&#x80;&#x99;s.</document>
</documents>
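
One possible reading of this result, sketched in Python rather than XSLT because it concerns raw bytes: it assumes (this is not confirmed anywhere above) that input.xml stores the apostrophe as the three UTF-8 bytes of U+2019 while its declaration says iso-8859-1. The helper name `show` is made up for illustration.

```python
# Assumption: the apostrophe in input.xml is stored as the UTF-8 bytes of U+2019.
raw = "\u2019".encode("utf-8")            # b'\xe2\x80\x99'

# A parser that trusts encoding="iso-8859-1" maps each byte to one character,
# giving U+00E2 (a with circumflex) followed by the C1 controls U+0080 and U+0099.
as_latin1 = raw.decode("iso-8859-1")

def show(text):
    """Render C1 control characters as hex character references, keep the rest."""
    return "".join(
        f"&#x{ord(ch):X};" if 0x80 <= ord(ch) <= 0x9F else ch
        for ch in text
    )

# Same shape as the apostrophe in the transform result above.
rendered = show(as_latin1)
assert rendered == "\u00e2&#x80;&#x99;"

# Decoding the same bytes as UTF-8 recovers the intended character.
assert raw.decode("utf-8") == "\u2019"

# An already mis-decoded string can be repaired by reversing the wrong step.
repaired = as_latin1.encode("iso-8859-1").decode("utf-8")
assert repaired == "\u2019"
```

If that assumption holds, the stylesheet is copying exactly what the parser handed it, and the mismatch would sit between the bytes in the file and the encoding named in its XML declaration.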

Could someone provide some insight into what I've done wrong here? Any help
would be greatly appreciated.

Best,
Bridger