Using the well-known XPath expression for set difference:
$ns1 - $ns2 =
$ns1[not(count(. | $ns2) = count($ns2))]
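As a minimal sketch of how this idiom behaves, the stylesheet below applies it to a made-up input document. The element names "items" and "item" and the attributes "id" and "skip" are assumptions for illustration only; they are not part of the data discussed in this thread.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input document:
     <items><item id="a"/><item id="b" skip="yes"/><item id="c"/></items> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:variable name="ns1" select="/items/item"/>
    <xsl:variable name="ns2" select="/items/item[@skip]"/>
    <!-- A node of $ns1 is already in $ns2 exactly when adding it to $ns2
         leaves the size of $ns2 unchanged; keep only the other nodes. -->
    <xsl:for-each select="$ns1[not(count(. | $ns2) = count($ns2))]">
      <xsl:value-of select="@id"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the input shown in the comment this writes "a" and "c", each on its own line, because only item "b" is a member of $ns2.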
One would test whether there is a non-empty set difference between the set
of all text-node descendants of the current node and the set of those that
are descendants of its "llcd:vernac" or "llcd:gloss" descendants.
The expression to test is:
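descendant::text()[not(count(. | current()//*[self::llcd:vernac or
self::llcd:gloss]//text())
=
count(current()//*[self::llcd:vernac or
self::llcd:gloss]//text()))
]
(Inside the predicate "." is the text node being tested, so the current node
has to be referenced with the XSLT function current().)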
Or, much more simply:
count(.//text()) != count(.//*[self::llcd:vernac or
self::llcd:gloss]//text())
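A sketch of how the count comparison could be used as a validity check follows. The namespace URI "urn:example:llcd" and the element name "entry" are placeholders I have assumed, since neither appears in this thread. Note that indentation between elements produces whitespace-only text nodes, which would count as "illegal" text unless they are stripped.

```xml
<?xml version="1.0"?>
<!-- Assumptions: "urn:example:llcd" and the element name "entry" are
     placeholders; substitute the real namespace URI and context element. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:llcd="urn:example:llcd">
  <xsl:output method="text"/>
  <!-- Drop whitespace-only text nodes so indentation is not reported -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="entry">
    <!-- The counts differ exactly when some text node below the current
         node lies outside every llcd:vernac / llcd:gloss descendant -->
    <xsl:if test="count(.//text()) !=
                  count(.//*[self::llcd:vernac or self::llcd:gloss]//text())">
      <xsl:text>invalid: text outside llcd:vernac/llcd:gloss&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

  <!-- Suppress the built-in template that copies text nodes to the output -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

A text node nested inside both an llcd:gloss and an llcd:vernac is still counted only once on the right-hand side, because the path expression yields a node-set.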
Regarding your current test:
test=".//text()[not(ancestor::llcd:vernac | ancestor::llcd:gloss)]"
In the general case this is not correct, because it accepts "illegal"
text nodes: those whose llcd:vernac or llcd:gloss ancestor is not a
descendant of the current node but one of its ancestors.
Apart from this observation, a naive XSLT processor will build the
union in the predicate, which is quite an expensive operation. I think it
would be more efficient to rewrite the expression as:
.//text()[not(ancestor::llcd:vernac or ancestor::llcd:gloss)]
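For the general case, where an llcd:vernac or llcd:gloss might also occur above the current node, one possible variant is sketched below. It only accepts an ancestor that lies below the current node, using the same count() idiom to test node-set membership. The namespace URI "urn:example:llcd" and the element name "entry" are assumed placeholders, and this is a sketch rather than a tuned solution.

```xml
<?xml version="1.0"?>
<!-- Assumptions: "urn:example:llcd" and the element name "entry" are
     placeholders; substitute the real namespace URI and context element. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:llcd="urn:example:llcd">
  <xsl:output method="text"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="entry">
    <!-- A text node is accepted only if one of its llcd:vernac or
         llcd:gloss ancestors has the current node among its own
         ancestors, i.e. that ancestor lies below the current node. -->
    <xsl:if test=".//text()[not(ancestor::*[self::llcd:vernac or self::llcd:gloss]
                  [count(ancestor::node() | current()) = count(ancestor::node())])]">
      <xsl:text>invalid: text outside llcd:vernac/llcd:gloss&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

  <!-- Suppress the built-in template that copies text nodes to the output -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

If the data guarantees that llcd:vernac and llcd:gloss never occur above the current node, the shorter expression with "or" is sufficient and the extra predicate can be dropped.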
=====
Cheers,
Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL
"Lars Huttar" <lars_huttar(_at_)sil(_dot_)org> wrote in message
news:002c01c31a33$1a688ea0$250414ac(_at_)LarsandKate(_dot_)(_dot_)(_dot_)
Hi all,
My requirement is to check for validity of certain XML data as follows:
all text() nodes descended from . must be descendants of either
llcd:vernac or llcd:gloss.
(By the way, if it helps, the llcd:vernac or llcd:gloss will be
descendants of . too, not ancestors.)
My current test is
test=".//text()[not(ancestor::llcd:vernac | ancestor::llcd:gloss)]"
If this test is true, the data is invalid.
But is there a more efficient way to do this?
Something that checks for llcd:vernac|llcd:gloss along the way,
instead of going down the descendant axis and then back up the
ancestor axis (twice)? Something along the lines of
test="./(not(llcd:vernac|llcd:gloss)/)*/text()"
where * means "0 or more times".
I guess I could do
test="count(.//text()) >
count(.//llcd:vernac//text() | .//llcd:gloss//text())"
but I'm not sure that's any more efficient.
This is not a big deal, just wanting to be as efficient as reasonably
possible.
Lars
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list