Re: [xsl] Preserving inline DTD

2014-01-28 08:41:16
Hi,

I'm afraid the OP is asking not about a system or public identifier on
a DOCTYPE declaration, but about preserving a DTD internal subset.

As David has remarked, this is not possible in unextended XSLT 1.0,
which was designed specifically for a defined use case: "XSLT is not
intended as a completely general-purpose XML transformation language.
Rather it is designed primarily for the kinds of transformations that
are needed when XSLT is used as part of XSL"
(http://www.w3.org/TR/xslt). Whenever "XSL" is used in a context like
this, we have to add (as the sentences in the Rec do also) that what
we mean by "XSL" is "XSLT + XSL-FO".

An XSL-FO processor has no need for a DTD internal subset; indeed, in
that architecture an internal subset would ordinarily be considered
irregular and superfluous, if not worse.

So the XSLT 1.0 answer is "extend your processor". Implement a custom
serialization method for your processor that does whatever you want.

The XSLT 2.0 answers might include "sniff the internal subset from the
input and fake it for the output". As Graydon suggests, you could use
the unparsed-text() function for the sniffing part. For the rest, the
reason I say "fake it" is that I know of no off-the-shelf serializers
that will write a DTD internal subset, so in XSLT you'd have to use
disable-output-escaping, which we generally -- um -- frown on.

You could combine these answers: embed your XSLT 1.0 transformation in
a pipeline that would provide the serialization you want as a
post-process. You might choose not to use XSLT at all for the rest of
the pipeline.
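
To make the post-process idea concrete, here is a minimal sketch in
Python (my choice for illustration; nothing in this thread prescribes
it). It reads the DOCTYPE declaration, internal subset included, from
the raw source text and splices it into the already-serialized result.
The names sniff_doctype and restore_doctype are hypothetical, and the
regular expression is a simplification: it assumes no "[" or ">"
inside the quoted identifiers, no "]" followed by ">" inside the
subset, and no DOCTYPE-like text inside comments or CDATA sections.

```python
import re

# Matches a DOCTYPE declaration, with or without an internal subset.
# Simplification (an assumption of this sketch): no "[" or ">" inside
# the quoted identifiers, and no "]" followed by ">" inside the subset.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+[^\[>]*(?:\[.*?\]\s*)?>", re.DOTALL)

# Matches an XML declaration at the very start of a document.
XML_DECL_RE = re.compile(r"<\?xml\s[^>]*\?>\s*")


def sniff_doctype(source_text):
    """Return the DOCTYPE declaration found in raw XML text, or None."""
    match = DOCTYPE_RE.search(source_text)
    return match.group(0) if match else None


def restore_doctype(source_text, result_text):
    """Copy the DOCTYPE from source_text into the serialized result_text."""
    doctype = sniff_doctype(source_text)
    if doctype is None:
        return result_text
    # Drop any DOCTYPE the transformation emitted itself.
    result_text = DOCTYPE_RE.sub("", result_text, count=1)
    # Insert after the XML declaration if there is one, else at the start.
    decl = XML_DECL_RE.match(result_text)
    pos = decl.end() if decl else 0
    return result_text[:pos] + doctype + "\n" + result_text[pos:]
```

You would run the XSLT 1.0 transformation as usual, then call
restore_doctype() with the original file's text and the transformed
output's text. The result is only valid against the DTD if the root
element name in the copied DOCTYPE still matches the root of the
output.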

What David C doesn't tell us is that he could implement such a
pipeline using Unix tools in just a few minutes. Of course, this gets
us into questions of platform dependencies, etc. There's also Ant and
such like.

One might mention XProc, but that would open a can of worms, since
XProc has its own set of issues (and then we're off topic).

Cheers, Wendell


On Mon, Jan 27, 2014 at 6:56 PM, Graydon <graydon(_at_)marost(_dot_)ca> wrote:
On Mon, Jan 27, 2014 at 03:35:33PM -0800, Martin Holmes scripsit:
On 14-01-27 03:33 PM, David Carlisle wrote:
On 27/01/2014 23:26, Piotr Fusik wrote:
How do I make xsltproc preserve the DTD that is in the input XML ?

Unless it has a non-standard extension (which I don't recall being the
case) then this is not possible. Standard XSLT cannot do this, as the
DTD is expanded out by the XML parser and not reported to XSLT, which
just sees a tree of element, text and attribute nodes.

Couldn't the XSLT re-read the source document as text, using the
document() function, and recover the DTD section with string
manipulation?

document() will insist on parsing the document, so I don't think so, no.

If you have unparsed-text() (which xsltproc won't because it's XSLT 1.0)
you can do that to get the contents of the DOCTYPE declaration.

If it will always be the same DTD, or you know what DTD it will be at
run time, you can get the xsl:output to create a DOCTYPE declaration in
the result document by setting the doctype-public and possibly
doctype-system attributes on xsl:output to the values you want, which
might have been what the original question was about.

-- Graydon

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




-- 
Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^
