Re: [xsl] How can the mere switch from DTD to XSD in the source document affect how a stylesheet handles white space?

2021-02-22 13:41:25
On 22.02.2021 20:24, Wolfhart Totschnig wolfhart(_dot_)totschnig(_at_)mail(_dot_)udp(_dot_)cl wrote:

After switching from DTD to XSD in my project, I encountered -- apart
from the odd behavior of Chrome described in my post from two days ago
-- another puzzling problem, namely that, after this switch, one of the
stylesheets of my project produced unexpected output, specifically in
the handling of white space. I was able to find a solution to the
problem. Still, I do not understand how the problem could arise. That
is, I do not understand how the mere switch from DTD to XSD in the
source file can affect how a stylesheet handles white space. I would
like to ask whether one of you can explain it to me.

In order to show the phenomenon, I have produced the following minimal
example.

Before the switch from DTD to XSD, I had the following XML document:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="../zettel.xsl"?>
<!DOCTYPE zettel SYSTEM "../zettel.dtd">
<zettel>
    <head>
       <keywords>
          <author>
             <first>Sally</first>
             <last>Adee</last>
          </author>
       </keywords>
    </head>
</zettel>

I transformed this document with the following stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="text" omit-xml-declaration="yes"/>
     <xsl:template match="author">
         <xsl:text>Some text: </xsl:text>
         <xsl:value-of select="."/>
         <xsl:text>&#10;</xsl:text>
     </xsl:template>
</xsl:stylesheet>

This produced the following output (with Saxon 9 HE):

Some text: SallyAdee

Then I switched from DTD to XSD. That is, now the source document looks
like this:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="../zettel.xsl"?>
<zettel xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="../zettel.xsd">
     <head>
         <keywords>
             <author>
                 <first>Sally</first>
                 <last>Adee</last>
             </author>
         </keywords>
     </head>
</zettel>

And now the output of the same stylesheet is different:

             Some text:
                 Sally
                 Adee

That is, there is a lot of additional white space. (I am here omitting
from the output several empty lines.) I figured out that I need to add
"<xsl:strip-space elements="*"/>" to the stylesheet to receive the old
output with the new source document. But I do not understand why this is
necessary, or why it was not necessary before. How can the mere switch
from DTD to XSD in the source document change how the stylesheet handles
white space? The DTD and the XSD are, as far as I can tell, equivalent
(i.e., the XSD is a translation of the DTD), and neither, as far as I
can see, says anything about the handling of white space. So I am at a
loss.

How exactly do you run Saxon? I think the difference might depend on the
behaviour of the underlying XML parser. Saxon HE does not itself support
schema-aware XSLT, and I am not sure it uses any xsi:schemaLocation hint
or passes it on to the XML parser, so the schema is basically ignored.
The default setup for XML parsing, on the other hand, might use a parser
that reads the DTD and takes it into account. I would expect Saxon EE
with schema validation turned on for parsing to give a different output
for the second sample.
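
For reference, the workaround mentioned in the question can be sketched as the original stylesheet plus one xsl:strip-space declaration. This is only an illustration of that fix, not a statement about what Saxon does internally. According to the question, it restores the old output ("Some text: SallyAdee") for the XSD-based source document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <!-- Remove whitespace-only text nodes from all source elements,
         so the result no longer depends on whether the parser had a
         DTD available to identify ignorable whitespace. -->
    <xsl:strip-space elements="*"/>
    <xsl:template match="author">
        <xsl:text>Some text: </xsl:text>
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

Putting the declaration in the stylesheet keeps the whitespace handling independent of how the source document is parsed or validated.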