
RE: [xsl] &gt; replaced by ">", &lt; is not replaced...

2007-07-13 03:44:11
You're sending the transformation output to a DOMResult, so the
serialization is presumably being done by the DOM implementation, not by an
XSLT processor. So it's being serialized as XML, in which ">" is perfectly
valid in a text node. If you want HTML output, use a StreamResult so that
the XSLT serializer is invoked.
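
As a rough sketch of that suggestion (not the poster's actual code): the class name, the inline stylesheet and the sample input below are made up for illustration, and the stylesheet is declared as version 1.0 on the assumption that only the JDK's built-in processor is on the classpath. The point is just that the result goes to a StreamResult, so the processor's own serializer runs and honours xsl:output method="html". Whether ">" then comes out as "&gt;" still depends on the serializer in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StreamResultSketch {
    // Hypothetical stylesheet for illustration; it requests the HTML output method.
    private static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/'>"
      + "<html><body><xsl:copy-of select='//span'/></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms the given XML and lets the XSLT processor serialize the result.
    static String toHtml(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        // StreamResult (rather than DOMResult) means the XSLT serializer writes the markup.
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String input = "<html><body><span class='target'>&lt;WTB.L&gt;</span></body></html>";
        System.out.println(toHtml(input));
    }
}
```

In the method from the original post, this would amount to replacing the DOMResult with a StreamResult wrapping a StringWriter and returning the writer's contents instead of re-parsing the result with XmlObject.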

You haven't actually answered the question about which XSLT processor you
are using. IIRC XmlObject is part of XmlBeans and XmlBeans uses Saxon, at
least in some configurations... It does help to know what processor you are
using. You can do system-property('xsl:vendor') to find out.
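
One way to run that check from Java, as a minimal sketch: the class name and the throwaway stylesheet are invented here, and the string you get back depends entirely on which TransformerFactory is picked up from the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class VendorSketch {
    // Hypothetical one-template stylesheet that prints the xsl:vendor system property.
    private static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select=\"system-property('xsl:vendor')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns whatever vendor string the active XSLT processor reports.
    static String vendor() throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        // The input document is irrelevant; any well-formed XML will do.
        transformer.transform(new StreamSource(new StringReader("<dummy/>")), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println("XSLT vendor: " + vendor());
    }
}
```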

Michael Kay
http://www.saxonica.com/ 

-----Original Message-----
From: Jethro Borsje [mailto:jethro(_at_)jesdesign(_dot_)nl] 
Sent: 13 July 2007 11:11
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] &gt; replaced by ">", &lt; is not replaced...

Hi there,

This is the Java code that is used for the transformation:
[code]
    private String convertSelectionToHTML(String p_selection)
    {
       // setup error logging
       XmlOptions validateOptions = new XmlOptions();
       ArrayList<XmlError> errorList = new ArrayList<XmlError>();
       validateOptions.setErrorListener(errorList);

       try
       {
          Transformer transformer = getTransformer();

          logger.debug("------------------------------------");
          logger.debug("Parsing body[" + p_selection + "]");
          XmlObject bodyObject = XmlObject.Factory.parse(p_selection, validateOptions);

          // transform body
          DOMResult bodyTransformResult = new DOMResult();
          DOMSource bodyTransformSource = new DOMSource(bodyObject.getDomNode());
          transformer.transform(bodyTransformSource, bodyTransformResult);
          bodyObject = XmlObject.Factory.parse(bodyTransformResult.getNode());

          logger.debug("after transformation: " + bodyObject.toString());
          logger.debug("------------------------------------");

          return bodyObject.xmlText();
       }
       catch (XmlException e)
       {
          logger.error("Unable to parse body: " + p_selection, e);
          if (!errorList.isEmpty())
          {
             for (XmlError error : errorList)
             {
                logger.error("\t-" + error.getMessage() + "\n\t\tLocation of invalid XML: "
                      + error.getCursorLocation().xmlText() + "\n");
             }
          }
       }
       catch (TransformerException e)
       {
          logger.error("Unable to transform body: " + p_selection, e);
          if (!errorList.isEmpty())
          {
             for (XmlError error : errorList)
             {
                logger.error("\t-" + error.getMessage() + "\n\t\tLocation of invalid XML: "
                      + error.getCursorLocation().xmlText() + "\n");
             }
          }
       }
       return null;
    }

    private Transformer getTransformer()
    {
       Transformer result = null;
       TransformerFactory transformerFactory = TransformerFactory.newInstance();
       try
       {
          result = transformerFactory.newTransformer(new StreamSource(
                this.getClass().getClassLoader().getResourceAsStream("selection-view.xsl")));
       }
       catch (TransformerConfigurationException e)
       {
          logger.error("Error creating transformer", e);
       }
       return result;
    }
[/code]

Michael Kay wrote:
Actually, &lt; and &gt; were replaced by "<" and ">" respectively
while parsing; the difference is that during serialization, "<" has
been converted back to "&lt;", but ">" has not been converted back to
"&gt;". This caused me a little confusion in reading your message!

What XSLT processor did you use and how did you run it? Are you sure
the serialization was done by an XSLT processor? I'm puzzled because
there's no evidence that it used the HTML output method, which it
should have done.
When serializing as XML, there is no need to write ">" as "&gt;", but
in HTML, the HTML spec advises that this "should" be done. The XSLT
2.0 serialization specification, surprisingly, seems to have nothing
to say on the subject.
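
The parsing half of this can be seen with a few lines of Java. This is only an illustrative sketch using the JDK's DOM parser, with a made-up class name and a cut-down version of the input from the original post: once the document is parsed, the text node holds the literal characters, and it is purely up to the serializer how they are written out again.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityParsingSketch {
    // Parses the XML and returns the text content of the document element.
    static String textOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The entity references are resolved by the parser,
        // so the in-memory text is the literal string <WTB.L>
        System.out.println(textOf("<span>&lt;WTB.L&gt;</span>"));
    }
}
```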

Michael Kay
http://www.saxonica.com/

-----Original Message-----
From: Jethro Borsje [mailto:jethro(_at_)jesdesign(_dot_)nl]
Sent: 13 July 2007 10:07
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] &gt; replaced by ">", &lt; is not replaced...

Hi everybody,

I am trying to transform an HTML page using XSL. The problem is that
somehow the "&gt;" signs in my input text are changed to ">", while
"&lt;" is not changed. This is the XSL I am using:
[stylesheet]
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:template match="/">
           <html>
                   <head>
                           <style>
                                   body
                                   {
                                           font-family:'Courier New', Courier, monospace;
                                           font-size:11px; color:#333333;
                                           font-weight:normal;
                                           line-height: 140%;
                                           text-align:justify;

                                   }
                                   span.rule
                                   {
                                           font-weight:bold;
                                   }
                                   span.issuer, span.target
                                   {
                                           font-weight:bold;
                                           display:inline;
                                   }
                           </style>
                   </head>
                   <body>
                           <xsl:apply-templates />
                   </body>
           </html>
   </xsl:template>

   <xsl:template match="br">
           <xsl:element name="br"></xsl:element>
   </xsl:template>

   <!-- Copy all <span> tags together with the attributes. -->
   <xsl:template match="span">
           <xsl:element name="span">
                   <xsl:attribute name="id"><xsl:value-of select="@id" /></xsl:attribute>

                   <xsl:if test="@style">
                           <xsl:attribute name="style"><xsl:value-of select="@style" /></xsl:attribute>
                   </xsl:if>

                   <xsl:if test="@class">
                           <xsl:attribute name="class"><xsl:value-of select="@class" /></xsl:attribute>
                   </xsl:if>
           
                   <xsl:value-of select="." />
           </xsl:element>
   </xsl:template>

</xsl:stylesheet>
[/stylesheet]

This is the text that is being parsed:
[parsed text]
<html>
   <body>
           <span class="target" id="http://www.owl-ontologies.com/Ontology1182253177.owl#WHITBREAD">&lt;WTB.L&gt;</span>
said on Monday it was considering the sal
   </body>
</html>
[/parsed text]

This is the text after transformation:
[transformed text]
<html>
   <head>
   <style>
           body
           {
                   font-family:'Courier New', Courier, monospace;
                   font-size:11px; color:#333333;
                   font-weight:normal;
                   line-height: 140%;
                   text-align:justify;
           }
           span.rule
           {
                   font-weight:bold;
           }
           span.issuer, span.target
           {
                   font-weight:bold;
                   display:inline;
           }
   </style>
   </head>
<body>
   <span class="target" id="http://www.owl-ontologies.com/Ontology1182253177.owl#WHITBREAD">&lt;WTB.L></span>
said on Monday it was considering the sal </body> </html> 
[/transformed text]

As you can see, the "&gt;" is replaced by ">", but the "&lt;" is
NOT replaced by "<". I do not understand how this is possible. The
desired result is that neither gets replaced, so both "&gt;" and
"&lt;" should appear in the transformed text.

--
Best regards,
Jethro Borsje

http://www.jborsje.nl


--~------------------------------------------------------------------
XSL-List info and archive:  
http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



