xsl-list

RE: Generating numeric character references

2003-01-16 07:56:06
Use a text editor, or perhaps a SAX filter, to replace "&amp;#" by "&#".
Why use a power drill when you can do the job with a hammer?
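
As a rough illustration of the "hammer" approach, here is a minimal Python sketch that does the replacement on the serialized XML text before any parser sees it. The function name restore_char_refs and the regular expression are illustrative assumptions, not part of any tool mentioned in this thread. It only touches "&amp;" when a complete numeric reference follows, so an ordinary "&amp;" is left alone; it does not treat CDATA sections specially.

```python
import re

# Matches an over-escaped decimal or hexadecimal character reference,
# e.g. "&amp;#173;" or "&amp;#xAD;". The group captures "#173;" / "#xAD;".
_OVER_ESCAPED = re.compile(r"&amp;(#(?:[0-9]+|x[0-9A-Fa-f]+);)")


def restore_char_refs(serialized_xml: str) -> str:
    """Turn "&amp;#NNN;" back into "&#NNN;" in serialized XML text.

    Hypothetical helper: it works on the raw text of the file, not on a
    parsed document, which is what a text editor's search-and-replace does.
    """
    return _OVER_ESCAPED.sub(r"&\1", serialized_xml)


if __name__ == "__main__":
    damaged = "<p>co&amp;#173;operate, AT&amp;T</p>"
    print(restore_char_refs(damaged))
```

Run over the damaged file once, before the XSLT step, this leaves the stylesheet with nothing to repair.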

Michael Kay
Software AG
home: Michael(_dot_)H(_dot_)Kay(_at_)ntlworld(_dot_)com
work: Michael(_dot_)Kay(_at_)softwareag(_dot_)com 

-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com 
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com] On Behalf Of 
Stuart Celarier
Sent: 14 January 2003 20:06
To: 'XSL-List'
Subject: [xsl] Generating numeric character references


I'd like to transform specific text substrings into numeric 
character references during an XSLT transformation. For 
example, I want to transform all occurrences that look like 
"&amp;#173;" within a string into "&#173;".

Here's the back story. I have source XML that is generated 
automatically from HTML by a third party. The third party 
incorrectly handles entity references, so that "&#173;" in 
the original HTML becomes "&amp;#173;" in the XML. I want 
to repair the damage done. To simplify things, I am only 
interested in documents with ISO 8859-1 encoding.

Below is a solution [1] that I am not pleased with. It is a 
named template that recursively parses a string, trying to 
replace references. This requires an <xsl:when> element for 
each value of numeric character reference that might be 
encountered (see the "additional cases here" comment). 
Problems with this include linear search of values, omitted 
values, and opportunity for error in mismatched values.

Can anyone suggest a better approach to generating numeric 
character references? I would be fine restricting the 
solution to MSXML or .NET's System.Xml.Xsl XSLT processors, 
if that is an issue.

Many thanks!

Cheers,
Stuart



[1] A less than happy solution:

  <xsl:template name="restoreNumCharRefs">
    <xsl:param name="string"/>

    <xsl:choose>
      <xsl:when test="contains($string, '&amp;')">
        <!-- Split the string around the first "&" and the next ";" -->
        <xsl:variable name="head"
                      select="substring-before($string, '&amp;')"/>
        <xsl:variable name="remainder"
                      select="substring-after($string, '&amp;')"/>
        <xsl:variable name="reference"
                      select="substring-before($remainder, ';')"/>

        <!-- Map the reference text to the character it denotes -->
        <xsl:variable name="entity">
          <xsl:choose>
            <xsl:when test="$reference='#167'">&#167;</xsl:when>
            <xsl:when test="$reference='#173'">&#173;</xsl:when>

            <!-- additional cases here -->

            <xsl:otherwise>&amp;<xsl:value-of
                select="$reference"/>;</xsl:otherwise>
          </xsl:choose>
        </xsl:variable>

        <!-- Recurse on whatever follows the reference -->
        <xsl:variable name="tail">
          <xsl:call-template name="restoreNumCharRefs">
            <xsl:with-param name="string"
                            select="substring-after($remainder, ';')"/>
          </xsl:call-template>
        </xsl:variable>

        <xsl:value-of select="concat($head, $entity, $tail)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$string"/>
      </xsl:otherwise>
    </xsl:choose>

  </xsl:template>
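
For comparison, the one-case-per-value lookup in the template above goes away once the character can be computed from its code point, which XSLT 1.0 has no standard function for. The following is a minimal Python sketch of that idea, working on the string value as the template sees it (literal "&#173;" characters in the text). The function name decode_numeric_refs is hypothetical, and only decimal references are handled, as in the template.

```python
import re

# A decimal numeric character reference appearing as literal text, e.g. "&#173;".
_NUM_REF = re.compile(r"&#([0-9]+);")


def decode_numeric_refs(text: str) -> str:
    """Replace each literal "&#NNN;" with the character whose code point is NNN."""

    def _replace(match: "re.Match[str]") -> str:
        code_point = int(match.group(1))
        # Leave values that are not valid Unicode code points untouched.
        if code_point == 0 or code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return _NUM_REF.sub(_replace, text)


if __name__ == "__main__":
    # The section sign (167) and soft hyphen (173) from the template's two cases.
    print(repr(decode_numeric_refs("see &#167;4, co&#173;operate")))
```

Because the code point is converted arithmetically, there is no list of values to search, omit, or mistype.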


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list



