xsl-list

Re: [xsl] comparing XML document structure

2011-08-17 20:35:19
On Thu, Aug 18, 2011 at 12:44:02AM +0100, Tony Graham scripsit:
On Wed, August 17, 2011 11:48 pm, Wendell Piez wrote:
It sounds like you want to infer content models on the fly and then
validate against them. I can imagine approaches to this, but I doubt
that I'd trust many algorithms that actually attempted it -- not because
of XSLT, but because of the problem of specifying the problem.
...
But why not use a schema? There are processors such as Trang that can
infer schemas from documents.

What Wendell said.

Using trang to generate a schema from the DTD in question has
historically tended to fail.  (Not a whole lot, but some; generally
usable for creating a schema to get saxon to validate the output, but
not usable on the fly for structure.)

So I've got a relatively fixed content model, in the form of a
comprehensive DTD and a much less comprehensive example of how to use
that DTD for a particular content type.

Initially, what I want to do is eat the exemplar, use it to generate a
parent-child list -- so I'd have section/num, section/para, and
section/subsection -- and then take an output file and get the same list
from it, then compare the lists and produce a message for mismatches.
So if a particular output file had section/num, section/subsection, and
section/list in it, for example, there should be an exception noted for
the presence of the list. (Valid, but not expected.)
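
For illustration only, here is a minimal sketch of that comparison in Python rather than XSLT, using the standard library's xml.etree.ElementTree. The function names and the two sample documents are hypothetical, made up to match the section example above; they are not taken from the real exemplar or DTD. It assumes both files parse as plain XML without namespaces and that a flat set of parent/child name pairs is enough, so it ignores order and cardinality.

```python
import xml.etree.ElementTree as ET


def parent_child_pairs(xml_text):
    """Return the set of 'parent/child' element-name pairs found in a document."""
    root = ET.fromstring(xml_text)
    # root.iter() visits every element, root included; iterating an
    # element yields its direct element children.
    return {f"{parent.tag}/{child.tag}" for parent in root.iter() for child in parent}


def unexpected_pairs(exemplar_xml, output_xml):
    """Pairs present in the output file but absent from the exemplar."""
    return sorted(parent_child_pairs(output_xml) - parent_child_pairs(exemplar_xml))


# Hypothetical stand-ins for the exemplar and one output file.
EXEMPLAR = "<section><num/><para/><subsection/></section>"
OUTPUT = "<section><num/><subsection/><list/></section>"

for pair in unexpected_pairs(EXEMPLAR, OUTPUT):
    print(f"unexpected: {pair}")  # prints "unexpected: section/list"
```

The set difference only reports pairs the exemplar never shows, which is the "valid, but not expected" case; a missing section/para is not flagged, since the exemplar marks some content as optional.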
...
On 8/17/2011 5:57 PM, Graydon wrote:
...
The desired goal is to be able to programmatically pull the structure,
at least to the extent of parent-child element pairs, from the
semantics-defining file, and compare that to each output file in turn.

So if the semantics-defining file gives an example section element,
which has num, para, and subsection element children, what I want to be
able to do is create a sequence of axis relationships and test the
section elements of the output for axis relationships that are not
members of that sequence.

It would help the rest of us wrap our heads around the problem if you
could provide a sample fragment of the "semantics-defining file" so we can
see what you are dealing with.

It would, but the whole NDA thing rears its ugly head.

It's just a document, to the same DTD as the output.  Instead of having
actual content in it, it has things like <para>This para is optional; if
present, it should contain introductory text</para> in it.

You may be able to create the tests you want in Schematron, but it's a bit
hard to tell without having an example to look at.  (If you can generate
Schematron from your definitions, you could directly create XSLT for the
axis tests about as easily, but the advantage could be that there are
tools such as XML IDEs that already understand the Schematron report
format.)

Schematron is certainly something to look at, yes.

Thanks!
Graydon

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--