Perplexing indeed.
I'd be less surprised if the output came out as "&amp;lt;" rather than
"&lt;&lt;". That's much more common, and could be caused by processing text
twice when it should only be processed once.
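To make the more common failure concrete, here is a minimal sketch of what double processing does. The escape() method is a hypothetical stand-in for a serializer's text escaping, not code from Saxon or the DITA Open Toolkit.

```java
public class DoubleEscapeDemo {
    // Hypothetical stand-in for the escaping a serializer applies to text nodes.
    // The ampersand must be replaced first so later replacements are not re-escaped.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String once = escape("<");    // escaped once, as intended
        String twice = escape(once);  // escaped again by mistake
        System.out.println(once);     // prints &lt;
        System.out.println(twice);    // prints &amp;lt;
    }
}
```

Note that escaping twice changes the escaped form itself; it does not repeat the character, which is why a duplicated "&lt;&lt;" is the stranger symptom.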
The conversion from "<" to "&lt;" is done by the XML serializer. The fact
that you're using the Saxon XSLT processor doesn't necessarily mean that
you're using the Saxon serializer (the Saxon output could be sent to a DOM
which is then serialized using the DOM serializer); it would be a good idea
to find out what serializer is actually being used. The easiest way to find
out is to see whether the serialization is affected by xsl:output
declarations in the stylesheet.
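As an illustration that the escaping happens at serialization time and not in the tree itself, the following sketch builds a DOM containing a literal "<" and writes it out through a JAXP identity transformer. It uses only standard JAXP classes; the class and method names are made up for the example, and whichever TransformerFactory is found first on your classpath will do the serializing.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class SerializerDemo {
    // Build a tiny DOM whose text node holds a raw "<", then serialize it.
    static String serialize() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element p = doc.createElement("p");
        p.appendChild(doc.createTextNode("a < b")); // unescaped in the tree
        doc.appendChild(p);

        // An identity transformer: no stylesheet, it only serializes the source.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize());
    }
}
```

If the serialized text contains the escaped form while the DOM held the raw character, the serializer is what did the conversion.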
How did you satisfy yourself that both the successful and the unsuccessful
runs are using Saxon 6.5.5? JAXP is a wonderful beast, and ensures that many
people are running a different XSLT processor from the one they thought they
were using.
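One way to check, sketched below under the assumption that the toolkit obtains its processor through the standard JAXP factory lookup: run this in the same JVM and classloader as the failing transformation (for example from a JSP on the Tomcat instance) and again from the command line, then compare. The class and method names are illustrative only. With Saxon 6.5.5 I would expect the factory class to be com.icl.saxon.TransformerFactoryImpl.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

public class WhichProcessor {
    // Name of the TransformerFactory implementation that JAXP actually selected.
    static String factoryClassName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // Where that class was loaded from; may be unknown for JDK built-in classes.
    static String factoryLocation() {
        CodeSource source = TransformerFactory.newInstance().getClass()
                .getProtectionDomain().getCodeSource();
        return (source == null || source.getLocation() == null)
                ? "(unknown, probably bundled with the JDK)"
                : source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println("Factory class:   " + factoryClassName());
        System.out.println("Loaded from:     " + factoryLocation());
        // This system property, if set, overrides the classpath lookup.
        System.out.println("System property: "
                + System.getProperty("javax.xml.transform.TransformerFactory"));
    }
}
```

If the two environments report different factory classes or different jar locations, that difference is the first thing to chase.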
Michael Kay
http://www.saxonica.com/
-----Original Message-----
From: Anderson, Paul [mailto:Paul(_dot_)Anderson(_at_)compuware(_dot_)com]
Sent: 11 December 2007 23:07
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Escaped characters being duplicated
Greetings All,
We have a bunch of DITA XML content and we're using the
open-source DITA Open Toolkit to transform it into a variety
of outputs. The DITA Open Toolkit is a collection of Java
classes, XSL stylesheets, and ANT scripts that transform the
content and create the output.
To shield our users from the command-line invocation of the
publishing scripts, we deployed a simple web application
running on Tomcat 5.5 that takes input from a JSP page and
invokes the necessary ANT script to generate the desired
output for the user. This methodology has been working quite
nicely for nearly a year.
Over that time, a few of our users have had a problem where
characters escaped in the XML content (for example, angle
brackets and ampersands) are duplicated in the output. For
example, in the place of one angle bracket (&lt;), we end up
with two or sometimes four escaped angle brackets (&lt;&lt;&lt;&lt;).
I've been troubleshooting the problem and the duplication
always appears in the output files generated by one of the
XSL stylesheets in the DITA Open Toolkit. If the input file
contained an escaped character, the output file contains two
of those escaped characters. The most interesting discovery
so far is this: For each user that has the problem, the
problem goes away if they invoke the ANT script via the
command line; the duplication only occurs when the ANT script
is invoked from the JSP page running on Tomcat 5.5. Having
said that, the problem only exists for a few users; most
users never see this problem when they use the JSP page to
invoke the ANT script and publish the exact same XML content.
Perplexing.
Given all this background, my plea to this list is simple:
What sort of conditions cause an XSL transformation to
duplicate an escaped character?
Would the system locale have an impact?
Would the Java version (1.5 versus 1.6) have an impact?
All source files use UTF-8 encoding.
All users are using the same XSL processor: Saxon 6.5.5.
I don't think the problem is in the XSL stylesheet or any
other part of the DITA Open Toolkit because all users are
using the same code and it works for most users.
Any ideas about this issue are appreciated.
Best regards,
Paul Anderson
Information Developer - Codex Administrator
Compuware Corporation
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--