At 2012-12-14 16:52 -0800, Dan Vint wrote:
I've come across an interesting problem that I'd like some ideas to solve.
So I'm writing a series of XSLT stylesheets to convert s1000d issue
2.2 to issue 4. This content also has change markup that we want to
strip/resolve as we move forward.
...
So I then tried this that worked:
<xsl:text disable-output-escaping="yes">
&lt;!--
</xsl:text>
<xsl:copy-of select="."/>
<xsl:text disable-output-escaping="yes">
--&gt;
</xsl:text>
I worry that as soon as you start digging the
"disable-output-escaping" hole, you will only dig yourself deeper and
deeper, creating problem after problem and missing nuance after
nuance. You will end up spending a lot of time creating a complex
encoding scheme that may or may not work with future data sets (you
did say that your authors were very creative).
For example, you cannot embed one CDATA section into another CDATA
section; rather, you have to embed one CDATA section into two CDATA
sections (or one plus some trailing markup). But then what you've
done is you've taken all the old markup and made it text! It will
end up showing up as output text after processing with
stylesheets. And how will you find what you added when it comes time
to remove it all?
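
To make the CDATA point concrete, here is a minimal sketch (the
element names "wrapper" and "old" are made up for illustration). The
inner section's closing "]]>" has to be split so that its "]]" ends
the first outer section and its ">" begins the second:

```xml
<!-- The parser reports one text node for the content of wrapper:
     <old><![CDATA[some text]]></old>
     All of it is character data; there is no "old" element node. -->
<wrapper><![CDATA[<old><![CDATA[some text]]]]><![CDATA[></old>]]></wrapper>
```

Every nested section needs the same treatment, which is the kind of
home-grown encoding scheme I am warning about.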
And if you did use a comment, how would you know which comments to
remove from the end result while preserving the original comments?
And, anyway, the use-case for disable-output-escaping= is to mark the
text in your output tree not to be escaped during serialization ...
it isn't meant to be used as a way to synthesize serialization markup
in the body of your end result. I have used it to synthesize
prologue information in the output, but that is self-contained and
doesn't get impacted by imaginative authored input. And any time in
the future when you optimize your pipeline by working on intermediate
trees instead of serialized Unicode files of markup and content, the
intermediate trees will not reflect the node structure of what you
need. The disable-output-escaping= is used when serializing to an
output entity, not when building a tree for subsequent processing.
I always try to remind my students that XSLT is a node processing
language, not an angle-bracket processing language. As soon as you
think you need to work with angle brackets, think again because you
probably don't (or shouldn't!).
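
A small sketch of the difference, using a made-up "result" wrapper
and a made-up "b" element. Both instructions may look the same once
serialized by a processor that honours disable-output-escaping=, but
only the second one adds an element to the result tree:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <result>
      <!-- Adds a text node holding the four characters of "<b/>".
           In the result tree this is text, not an element. -->
      <xsl:text disable-output-escaping="yes">&lt;b/&gt;</xsl:text>

      <!-- Adds an element node named b. A later stage working on
           the tree can match it as an element. -->
      <b/>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

A later stage that receives this result as a tree finds one child
element of "result", not two.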
Anyone have ideas for an alternate solution?
Run your pipeline putting the old content you want to preserve into a
custom element in a custom namespace. Your output then holds both
the old content and the new content, so you can compare them visually
at the end of your pipeline to see what has changed.
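
As a rough sketch of one such pipeline stage, assuming XSLT 1.0: the
namespace URI "urn:x-example:migration", the element "old:preserved"
and the names "someOldElement" and "someNewElement" are all
placeholders of mine, not S1000D names.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical match pattern: stands in for whatever this
       stage of the conversion rewrites. -->
  <xsl:template match="someOldElement">
    <!-- Keep an untouched copy of the original as real nodes inside
         a custom element. The namespace is declared here only, so
         that it is not added to the converted content. -->
    <old:preserved xmlns:old="urn:x-example:migration">
      <xsl:copy-of select="."/>
    </old:preserved>
    <!-- The converted form (placeholder element name). -->
    <someNewElement>
      <xsl:apply-templates select="@*|node()"/>
    </someNewElement>
  </xsl:template>

  <!-- Identity template: everything else is copied through. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Later stages would probably want to leave anything inside the custom
element alone, so that preserved content is not converted a second
time.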
At those points in your pipeline where you need to use an S1000D
document model to validate an intermediate file, preprocess that file
to strip out your custom element so that it doesn't trigger any problems.
And that stripping stylesheet will be handy when you are all done in
order to remove the old content from the new file so that only the
new content remains.
And the stripping stylesheet is small: only two template rules. One
template rule catches all elements in your custom namespace and does
nothing with them. The other template rule is the idiomatic identity
template. Easy to write. Easy to use. Any time you need an
intermediate file to produce a final output of some kind, just
pre-process it and use your existing processes.
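
A minimal sketch of those two template rules, assuming XSLT 1.0 and a
placeholder namespace URI of "urn:x-example:migration" (use whatever
URI you chose for your custom element):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:old="urn:x-example:migration">

  <!-- Rule 1: any element in the custom namespace produces nothing,
       so it and everything inside it is dropped. -->
  <xsl:template match="old:*"/>

  <!-- Rule 2: the identity template copies everything else. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The first rule wins for elements in the custom namespace because
"old:*" has a higher default priority than "node()".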
This is a scheme that doesn't use disable-output-escaping= and will
work whether you serialize your intermediate files to output entities
or pass intermediate trees from process to process. You don't have
to worry about writing your own XML serialization logic (in XSLT of
all languages!) and it will work regardless of what imaginative
markup comes from your authors.
I hope this helps.
. . . . . . . . . . . Ken
--
Contact us for world-wide XML consulting and instructor-led training
Free 5-hour lecture: http://www.CraneSoftwrights.com/links/udemy.htm
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/
G. Ken Holman mailto:gkholman(_at_)CraneSoftwrights(_dot_)com
Google+ profile: https://plus.google.com/116832879756988317389/about
Legal business disclaimers: http://www.CraneSoftwrights.com/legal