On 8/21/07, David Carlisle <davidc(_at_)nag(_dot_)co(_dot_)uk> wrote:
>> You are using XSLT 2.0
>> You are using a schema-aware processor
>> You have a schema
>> The schema is parsed when the document tree is built
> No, it also applies to basic (not schema-aware) processors and DTD
> specified element content.
> saxon * B for example defaults to stripping white space in (dtd specified)
> element content but has command line options to not do this or to strip
> all white space (whether in element or mixed content)
This looked interesting so I did a quick example:
<root>
<node/>
</root>
and:
<xsl:value-of select="count(/root/node())"/>
returns 3 as expected (two whitespace-only text nodes plus the node element).
Then if you add a DTD:
<!DOCTYPE root [
<!ELEMENT root (node)>
<!ELEMENT node (#PCDATA)>
]>
<root>
<node/>
</root>
and count the nodes again:
<xsl:value-of select="count(/root/node())"/>
the result is 1, which demonstrates David's point: the whitespace-only
text nodes in root's element content have been stripped.
If your DTD specifies mixed content then the whitespace is kept:
<!DOCTYPE root [
<!ELEMENT root (#PCDATA|node)*>
<!ELEMENT node (#PCDATA)>
]>
<root>
<node/>
</root>
counting the nodes here:
<xsl:value-of select="count(/root/node())"/>
returns 3, because whitespace in mixed content is not stripped by default.
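A rough Python sketch of the three cases above, for anyone who wants to poke at the rule outside an XSLT processor. It is a model of the behaviour, not what Saxon actually does internally: expat's ElementDeclHandler is used to read the content models from the internal DTD subset, and the stripping is done by hand because minidom keeps all whitespace by default. The function names and the three test documents are my own, made up for illustration.

```python
from xml.dom import minidom
from xml.parsers import expat

NO_DTD = "<root>\n<node/>\n</root>"
ELEMENT_DTD = ("<!DOCTYPE root [\n<!ELEMENT root (node)>\n"
               "<!ELEMENT node (#PCDATA)>\n]>\n" + NO_DTD)
MIXED_DTD = ("<!DOCTYPE root [\n<!ELEMENT root (#PCDATA|node)*>\n"
             "<!ELEMENT node (#PCDATA)>\n]>\n" + NO_DTD)


def element_content_names(xml_text):
    """Return the names of elements declared with element-only content."""
    names = set()

    def on_element_decl(name, model):
        # model[0] is the content type; MIXED covers both (#PCDATA)
        # and (#PCDATA|x)*, so those are never treated as element content.
        if model[0] not in (expat.model.XML_CTYPE_MIXED,
                            expat.model.XML_CTYPE_ANY,
                            expat.model.XML_CTYPE_EMPTY):
            names.add(name)

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(xml_text, True)
    return names


def count_root_children(xml_text, strip=True):
    """Rough equivalent of count(/*/node()), optionally DTD-stripped."""
    root = minidom.parseString(xml_text).documentElement
    if strip and root.tagName in element_content_names(xml_text):
        # Hand-rolled model of the stripping: drop whitespace-only text
        # nodes, but only where the DTD says text cannot appear.
        for child in list(root.childNodes):
            if (child.nodeType == child.TEXT_NODE
                    and not child.data.strip(" \t\r\n")):
                root.removeChild(child)
    return len(root.childNodes)


for label, doc in (("no DTD", NO_DTD),
                   ("element content", ELEMENT_DTD),
                   ("mixed content", MIXED_DTD)):
    print(label, count_root_children(doc))
```

Only the element-content document loses its two whitespace nodes; the other two keep all three children, which matches the counts reported above.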
...so it seems that when the processor knows from the DTD that an element
has element-only content, it can confidently drop the whitespace-only text
nodes there, since they can only be presentational whitespace.
It all makes sense, and I suppose it must be worthwhile, otherwise why bother?
The gotcha case would be something like:
<!DOCTYPE p [
<!ELEMENT p (b|i)*>
<!ELEMENT b (#PCDATA)>
<!ELEMENT i (#PCDATA)>
]>
<p>
<b>hello</b> <i>world</i>
</p>
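With the DTD the output is "helloworld" (the space between </b> and <i> is
a whitespace-only text node in element content, so it gets stripped).
Without the DTD the output is "hello world".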
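The gotcha can be reproduced with a small Python sketch. Again this only models the stripping by hand rather than running a real XSLT processor: the set {"p"} is hard-coded on the assumption that p is the only element the DTD above declares with element-only content, and the helper names are hypothetical.

```python
from xml.dom import minidom

DOC = "<p>\n<b>hello</b> <i>world</i>\n</p>"


def string_value(node):
    """Concatenate all descendant text, roughly like XPath string()."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


def strip_element_content(elem, element_only):
    """Remove whitespace-only text children of element-only elements."""
    for child in list(elem.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if (elem.tagName in element_only
                    and not child.data.strip(" \t\r\n")):
                elem.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_element_content(child, element_only)


def p_text(strip):
    """String value of <p>, with or without the modelled DTD stripping."""
    p = minidom.parseString(DOC).documentElement
    if strip:
        # Assumption: per the DTD above, only p has element-only content.
        strip_element_content(p, {"p"})
    return string_value(p)


print(repr(p_text(strip=True)))
print(repr(p_text(strip=False)))
```

The single space between the two inline elements is indistinguishable from indentation once the DTD says p cannot contain text, so it goes along with the newlines. In the unstripped case the leading and trailing newlines are still part of the string value.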
DTDs canst both addeth, and taketh awayeth :)
cheers
andrew
--
http://andrewjwelch.com