xsl-list

Generating numeric character references

2003-01-14 13:05:44
I'd like to transform specific text substrings into numeric character
references during an XSLT transformation. For example, I want to
transform all occurrences that look like "&amp;#173;" within a string
into "&#173;".

Here's the back story. I have source XML that is generated automatically
from HTML by a third party. The third party incorrectly handles entity
references, so that "&#173;" in the original HTML becomes
"&amp;#173;" in the XML. I want to undo the damage done. To simplify
things, I am only interested in documents with ISO 8859-1 encoding.
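
To make the damage concrete, here is a made-up fragment. The element
names and the sample word are invented for illustration only and are not
taken from the real data.

```xml
<!-- Original HTML: a soft hyphen written as a numeric character reference -->
<p>hyphen&#173;ation</p>

<!-- XML as delivered by the third party: the ampersand has been escaped,
     so the reference is now just literal text -->
<para>hyphen&amp;#173;ation</para>

<!-- Desired result of the transformation: a real character reference again -->
<para>hyphen&#173;ation</para>
```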

Below is a solution [1] that I am not pleased with. It is a named
template that recursively parses a string, trying to replace references.
This requires an <xsl:when> element for each numeric character
reference that might be encountered (see the "additional cases here"
comment). Problems with this include a linear search through the cases,
values that are accidentally omitted, and the chance of a mismatch
between the value tested and the reference emitted.

Can anyone suggest a better approach to generating numeric character
references? I would be fine restricting the solution to MSXML or
.NET's System.Xml.Xsl XSLT processors, if that is an issue.

Many thanks!

Cheers,
Stuart



[1] A less than happy solution:

  <xsl:template name="restoreNumCharRefs">
    <xsl:param name="string"/>

    <xsl:choose>
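      <!-- Only treat '&' as the start of a reference when a ';' follows it
           somewhere; otherwise text such as "AT&T" would lose its tail. -->
      <xsl:when test="contains(substring-after($string, '&amp;'), ';')">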
        <xsl:variable name="head"
                      select="substring-before($string, '&amp;')"/>
        <xsl:variable name="remainder"
                      select="substring-after($string, '&amp;')"/>
        <xsl:variable name="reference"
                      select="substring-before($remainder, ';')"/>

        <xsl:variable name="entity">
          <xsl:choose>
            <xsl:when test="$reference='#167'">&#167;</xsl:when>
            <xsl:when test="$reference='#173'">&#173;</xsl:when>

            <!-- additional cases here -->

            <xsl:otherwise>&amp;<xsl:value-of select="$reference"/>;</xsl:otherwise>
          </xsl:choose>
        </xsl:variable>

        <xsl:variable name="tail">
          <xsl:call-template name="restoreNumCharRefs">
            <xsl:with-param name="string"
                            select="substring-after($remainder, ';')"/>
          </xsl:call-template>
        </xsl:variable>

        <xsl:value-of select="concat($head, $entity, $tail)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$string"/>
      </xsl:otherwise>
    </xsl:choose>

  </xsl:template> 
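
The template could be invoked along these lines. This is only a sketch:
the identity template and the text() rule are assumptions for
illustration, and both are meant to sit in the same stylesheet as the
restoreNumCharRefs template.

```xml
  <!-- Identity rule: copy elements, attributes, comments and PIs unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Route every text node through the named template. The explicit
       priority avoids a conflict with the identity rule, since node()
       and text() share the same default priority. -->
  <xsl:template match="text()" priority="1">
    <xsl:call-template name="restoreNumCharRefs">
      <xsl:with-param name="string" select="."/>
    </xsl:call-template>
  </xsl:template>
```

Attribute values are copied as-is by this sketch. If the damaged
references can also occur in attributes, a similar rule matching @* would
be needed.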


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list