xsl-list

Re: DTD Subset Reduction Transformation

2002-08-30 12:28:07
Mike,

To do this you want a variant of the step-by-step identity transform:

<xsl:template match="node() | @*">
  <xsl:copy>
    <xsl:apply-templates
      select="node() | @*"/>
  </xsl:copy>
</xsl:template>

This template matches any node, copies it to the result tree, and then applies templates to its attributes and children. On its own it will just copy your larger.xml, and the result will still conform to larger.dtd.

To remove the elements not in smaller.dtd, you have to know what they are. Let's say they are <bottle>, <glass> and <brush>. To the stylesheet containing the template above, you add another template like this:

<xsl:template match="bottle | glass | brush">
  <xsl:apply-templates/>
</xsl:template>

Since this template has a higher default priority than the one matching node() (0 for an element name, against -0.5 for node()), it fires in preference for those elements. Instead of copying the element node, however, it simply descends to the element's children, which you need so that any contents of <bottle>, for example, that do conform to smaller.dtd still get included.
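
Put together, the whole stylesheet might look something like the sketch below. The xsl:output line is an assumption added here, not part of the recipe proper: XSLT does not carry the DOCTYPE declaration across, so if the result has to name the smaller DTD you would ask for it there. The system identifier "smaller.dtd" is a guess at what yours is called.

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: emit a DOCTYPE pointing at the smaller DTD -->
  <xsl:output method="xml" doctype-system="smaller.dtd"/>

  <!-- Identity: copy every node and attribute as-is -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Unwrap disallowed elements: drop the tags, keep the contents -->
  <xsl:template match="bottle | glass | brush">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>

Given a made-up input such as <shelf><bottle><label>Ink</label></bottle><pen/></shelf>, the body of the result should come out along the lines of <shelf><label>Ink</label><pen/></shelf>. Whether <label> is actually allowed where it lands is up to smaller.dtd; the stylesheet does not check.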

Similar means can be used to strip out unwanted attributes; also the whole thing can be parameterized if you want to maintain your list of disallowed elements outside of the template match. Note however that the unmodified technique assumes you want *all* your content, just without the unwanted element markup.
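
As a rough sketch of those variations (the names colour and $disallowed are made up for illustration, and the fragments go inside xsl:stylesheet): an empty template drops whatever it matches, contents and all, and that works for attributes as well as elements.

<!-- Drop an unwanted attribute wherever it occurs -->
<xsl:template match="@colour"/>

<!-- Drop an element together with everything inside it
     (use this instead of, not alongside, the unwrapping template
     for the same element) -->
<xsl:template match="brush"/>

For the parameterized version, note that XSLT 1.0 does not allow a variable reference in a match pattern, so the list cannot simply be dropped into match="...". One option, assuming no namespaces are involved, is to test the element name inside a template that matches every element. These templates would take the place of the two shown earlier:

<!-- Space-delimited list, with a leading and trailing space;
     may be overridden when the processor is invoked -->
<xsl:param name="disallowed" select="' bottle glass brush '"/>

<!-- Non-element nodes are always copied -->
<xsl:template match="@* | text() | comment() | processing-instruction()">
  <xsl:copy/>
</xsl:template>

<!-- Elements: unwrap if listed, otherwise copy and descend -->
<xsl:template match="*">
  <xsl:choose>
    <xsl:when test="contains($disallowed, concat(' ', name(), ' '))">
      <xsl:apply-templates/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

Padding both the list and the name with spaces keeps a name like "glass" from matching inside a longer one such as "eyeglass".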

The identity transform and various applications of it are described in DaveP's XSL FAQ (linked to from the page whose URL appears at the bottom of every XSL-List message ;-).

This will work with any conformant XSLT 1.0 processor.

Good luck!

Cheers,
Wendell

At 01:41 PM 8/30/2002, you wrote:

I have an xml file that validates to a DTD. Let's call the DTD "larger.DTD".

I have another DTD which is a subset of larger.DTD, which we'll call
smaller.DTD.

I want to strip out a bunch of elements in that xml file, so that it
validates to smaller.DTD.

In other words, whatever elements/structures in the xml file that the
smaller.DTD doesn't recognize, remove them. (there will probably be some
information lost of course.)

For example, transform an xhtml file into a chtml file. (or xhtml Strict to
xhtml Simple).

Is there an automated easy way to do this type of reduction transformation?

(I am using msxml parser, but if you don't use it then please just give me
the theory.)

thanks if you respond,
Mike


======================================================================
Wendell Piez                 mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================


XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


