xsl-list
Re: [xsl] comparing XML document structure

2011-08-17 17:48:56
Graydon,

It sounds like you want to infer content models on the fly and then validate against them. I can imagine approaches to this, but I doubt that I'd trust many algorithms that actually attempted it -- not because of XSLT, but because of the difficulty of specifying the problem.

Short of the general case, there are lots of things you could do. For example:

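<xsl:template match="section">
  <!-- names of the children the specification shows inside a section -->
  <xsl:variable name="expected-children" select="distinct-values(/path/to/specification/section/*/name())"/>
  <!-- report each child of this section whose name is not among them -->
  <xsl:for-each select="*[not(name()=$expected-children)]">
    element <xsl:value-of select="name()"/> not expected here
  </xsl:for-each>
</xsl:template>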

It's also possible to index elements by their names plus the names of parents (and additional criteria if necessary), if that's a help for retrieving things for comparison.
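
As a rough sketch of that indexing idea, something like the following might do, assuming an XSLT 2.0 processor (the three-argument key() function needs one). The file name exemplar.xml and the key name by-parent-child are made up for illustration; substitute whatever your semantics-defining file is actually called. It only looks at parent-child name pairs, not at order or cardinality.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: the exemplar lives in a file called exemplar.xml -->
  <xsl:variable name="spec" select="document('exemplar.xml')"/>

  <!-- Index every element by "parent-name/own-name" -->
  <xsl:key name="by-parent-child" match="*"
           use="concat(name(..), '/', name())"/>

  <xsl:template match="/">
    <!-- Each distinct parent/child pair found in the output file -->
    <xsl:for-each select="distinct-values(
        //*[parent::*]/concat(name(..), '/', name()))">
      <!-- Look the pair up in the exemplar, not in the input -->
      <xsl:if test="empty(key('by-parent-child', ., $spec))">
        <xsl:text>unexpected pair: </xsl:text>
        <xsl:value-of select="."/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run against each output file in turn, this should list every parent/child pair (for instance section/table) that never occurs in the exemplar. Additional criteria, such as the grandparent's name, could be folded into the key's use expression in the same way.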

The bottom line is that while what you envision may not be practicable, that doesn't mean there aren't useful things that can be done.

But why not use a schema? There are processors such as Trang that can infer schemas from documents.

Cheers,
Wendell


On 8/17/2011 5:57 PM, Graydon wrote:
So I have an XML document which defines the expected semantics of the
XML output of an SGML-to-XML conversion project as an exemplar; there
are structures like this, and like these, and like that.

I also have a whole bunch of XML output which ought to conform to that
semantics.  (This output is the product of a complex, multi-pass, highly
conditional set of XSLT transforms.)

The desired goal is to be able to programmatically pull the structure,
at least to the extent of parent-child element pairs, from the
semantics-defining file, and compare that to each output file in turn.

So if the semantics-defining file gives an example section element,
which has num, para, and subsection element children, what I want to be
able to do is create a sequence of axis relationships and test the
section elements of the output for axis relationships that are not
members of that sequence.

I'm nearly certain I can't do that, but thought it was much wiser to ask
and allow for the possibility of a pleasant surprise.

-- Graydon

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



--
======================================================================
Wendell Piez                            
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
