
RE: copy-of "canonicalization" behavior in Xalan (Java)

2004-07-23 00:45:06
The copy-of element when processed by Xalan (Java) appears to 
canonicalize the output, rather than output the source tree exactly.

For specific nodes in the source tree I would like to create 
an identical copy in the result tree, including redundant namespace
declarations.

A tree in the XPath data model does not contain namespace declarations; it
contains namespace nodes. When you parse XML, every element node in the
resulting tree will have one namespace node for each namespace that is in
scope. When a tree is serialized, namespace declarations are added to the
output where needed: serializers will generally avoid outputting redundant
declarations.
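
A minimal sketch in Python (standard library only) of the point above. It
is not an XPath implementation; it only reconstructs, for each element, the
set of prefix bindings that are in scope, and shows that the sample document
from the question yields the same result with or without the redundant
declarations. The helper name in_scope_namespaces and the two sample strings
are illustrative assumptions, and the implicit "xml" prefix is left out.

```python
import io
import xml.etree.ElementTree as ET

# The question's sample document, with redundant declarations ...
VERBOSE = """<foo:root xmlns:foo="http://abc.org/foo#" xmlns:xyz="http://xyzinc.com/xyz#">
  <foo:parent xmlns:foo="http://abc.org/foo#">
    <foo:child xmlns:foo="http://abc.org/foo#">more text</foo:child>
    <xyz:child xmlns:xyz="http://xyzinc.com/xyz#">yet more text</xyz:child>
  </foo:parent>
</foo:root>"""

# ... and the same document with the redundant declarations removed.
COMPACT = """<foo:root xmlns:foo="http://abc.org/foo#" xmlns:xyz="http://xyzinc.com/xyz#">
  <foo:parent>
    <foo:child>more text</foo:child>
    <xyz:child>yet more text</xyz:child>
  </foo:parent>
</foo:root>"""


def in_scope_namespaces(xml_text):
    """Return [(expanded element name, {prefix: uri}), ...] in document order."""
    result = []
    stack = [{}]   # one prefix -> URI mapping per currently open element
    pending = {}   # declarations reported since the previous start tag
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end")):
        if event == "start-ns":
            prefix, uri = payload
            pending[prefix] = uri
        elif event == "start":
            # Inherit the parent's bindings, then apply this element's own.
            scope = {**stack[-1], **pending}
            pending = {}
            stack.append(scope)
            result.append((payload.tag, scope))
        else:  # "end"
            stack.pop()
    return result


print(in_scope_namespaces(VERBOSE) == in_scope_namespaces(COMPACT))
```

Every element ends up with the same two bindings (foo and xyz) in both
versions, so nothing in this view of the tree records where a declaration
was written.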

The system is indeed creating an identical copy of the tree. The information
that's being lost is being lost at the time the original tree is
constructed.

The XML Spy behavior is conformant too, because there's nothing in the spec
that prevents a processor retaining this extra information (it could also
remember whether the namespace URI was written in single or double quotes if
it chose to).

If you want to extract fragments of the result tree for subsequent
processing, this will work fine if you do the extraction using tools that
respect the XPath data model. If you try and do it using textual
cut-and-paste, it will fail. One of the drawbacks of XML Namespaces has
always been that textual cut-and-paste is no longer a viable approach.
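
To make the contrast concrete, here is a hedged sketch in Python (standard
library only, standing in for whatever tree-based tool is actually used
downstream). It takes a result document with no redundant declarations,
then extracts xyz:child twice: once by slicing the text, once through the
parsed tree. The variable names and the choice of ElementTree are
assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

FOO = "http://abc.org/foo#"
XYZ = "http://xyzinc.com/xyz#"
NS = {"foo": FOO, "xyz": XYZ}

# A result document in which declarations appear only on the root element.
RESULT = (
    '<foo:root xmlns:foo="http://abc.org/foo#" xmlns:xyz="http://xyzinc.com/xyz#">'
    '<foo:parent><foo:child>more text</foo:child>'
    '<xyz:child>yet more text</xyz:child></foo:parent></foo:root>'
)

# 1. Textual cut-and-paste: the slice carries no binding for "xyz".
start = RESULT.index("<xyz:child>")
end = RESULT.index("</xyz:child>") + len("</xyz:child>")
pasted = RESULT[start:end]
try:
    ET.fromstring(pasted)
    pasted_ok = True
except ET.ParseError:
    pasted_ok = False  # the parser rejects the unbound prefix

# 2. Tree-based extraction: the serializer declares what the fragment needs.
ET.register_namespace("foo", FOO)  # keep the original prefixes on output
ET.register_namespace("xyz", XYZ)
root = ET.fromstring(RESULT)
child = root.find("foo:parent/xyz:child", NS)
fragment = ET.tostring(child, encoding="unicode")

print(pasted_ok)
print(fragment)
```

The sliced text is not namespace-well-formed on its own, whereas the
fragment serialized from the tree carries its own xmlns:xyz declaration and
can be parsed again as a standalone document.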

Michael Kay



Assume a source document like:

<foo:root xmlns:foo="http://abc.org/foo#"
          xmlns:xyz="http://xyzinc.com/xyz#">
      <foo:parent xmlns:foo="http://abc.org/foo#">
              <foo:child xmlns:foo="http://abc.org/foo#">more text</foo:child>
              <xyz:child xmlns:xyz="http://xyzinc.com/xyz#">yet more text</xyz:child>
      </foo:parent>
</foo:root>

The namespace declarations on the parent and child nodes are 
redundant (their namespace prefixes have been bound to a namespace on
the root node).

When I use copy-of, as in the simple template below, XML Spy's built-in
XSLT processor produces a result tree that is an exact and complete copy
of the source tree, redundant namespace declarations and all, as I would
expect.

<xsl:template match="/">
  <xsl:copy-of select="(.)"/>
</xsl:template>

(I have simplified the template in the extreme to make it clear.)

When I run the same template with the Xalan (Java) XSLT processor, which
uses a SAX parser, I get a "cleaned", canonical form of the source tree as
my result, with all redundant namespace declarations removed.

This may appear to be a benefit, but I later manipulate parts 
of the result tree (which is much more complex than the simple
example) as separate XML fragments and at that point the 
namespace declarations are in fact no longer redundant but critical.

I have not been able to find anything in the Xalan documentation that
suggests a way to avoid this canonicalization. Perhaps it's a SAX issue?
Is there a way to force Xalan to make an exact copy of the source tree,
warts and all?


Thanks


