xsl-list
[Top] [All Lists]

Re: [xsl] finding a minimum set of references

2013-01-09 02:03:07
If I've understood the problem correctly...

You can use a key to map a location (@area,@cite) to a sequence of
element(tree). Iterating over the tree elements, do not copy a loc
unless it is first in the sequence returned by that key: This
guarantees that all locs are covered. It may happen that this
eliminates all locs from a tree: if this must not happen, copy an
arbitrary loc.

(If the pass over all tree elements is written as a recursive
function, you may pass down a set of loc's that has been used, and
select one from this set if it can be used to avoid the empty set
within a tree.)

This may not produce the smalles set, but it should at least deal with
all necessities.

-W

On 09/01/2013, Graydon <graydon(_at_)marost(_dot_)ca> wrote:
So I have a lot (~600) tree elements that look like:

    <tree parent="attribution">
        <count count="1"/>
        <locations>
            <loc area="loiprop" cite="2010-07-prop10"/>
        </locations>
    </tree>
    <tree parent="block">
        <count count="10"/>
        <locations>
            <loc area="loiprop" cite="2004-09-bud2004"/>
            <loc area="loiprop" cite="2010-07-prop10"/>
            <loc area="loiprop" cite="2010-08-allprop"/>
        </locations>
    </tree>
    <tree parent="block">
        <key>section</key>
        <count count="6689"/>
        <locations>
            <loc area="CRCc945-en" cite="2606"/>
            <loc area="CRCc945-en" cite="402"/>
            <loc area="CRCc945-en" cite="6804"/>
            <loc area="CRCc945-en" cite="8301"/>
            <loc area="CRCc945-en" cite="8308.1"/>
        ....
        </locations>
        <child>section</child>
    </tree>

The @area,@cite pairs represent documents with the locations of particular
patterns (via the @parent and the child elements of the tree elements) of
documentation content, one tree element per pattern.  (So there are lots of
tree/@parent="block" elements, etc.)

What's wanted is the minimum number of documents that contain _all_ the
patterns, for output testing purposes.

So I need to produce the, or at least _a_, shortest list of <loc/> elements
so that every (all ~600) tree element contains at least one loc element on
the list.

This has turned out to be much less straightforward than I thought.  I can't
shake the feeling that there's a grouping solution, but I'm not seeing it if
there is.  So I'd appreciate any algorithm hints anybody has got;
unfortunately, efficiency is a concern.

Demonstration that this is really an NP-complete problem also grateful
accepted!

Thanks!
Graydon

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--