xsl-list

Re: [xsl] randomly selecting a node from a node set

2013-01-05 20:29:12
You can use the random functions/templates of FXSL:


http://fxsl.sourceforge.net/articles/Random/Casting%20the%20Dice%20with%20FXSL-htm.htm
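
(What follows is not FXSL's own API, just a minimal sketch of the underlying
idea for XSLT 2.0, which has no built-in random function: pass a seed in as a
stylesheet parameter, step a small linear congruential generator, and use the
result as an index into the node sequence. The names seed, my:next and my:pick
and the urn:example:random namespace are made up for illustration, and the
input is assumed to be the flat tree elements wrapped in a single root
element.)

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:random"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical stylesheet parameter: the caller supplies a new seed per run. -->
  <xsl:param name="seed" as="xs:integer" select="20130105"/>

  <!-- One step of a linear congruential generator (the constants are the
       widely quoted ANSI C rand() example). Result is in 0 .. 2^31 - 1. -->
  <xsl:function name="my:next" as="xs:integer">
    <xsl:param name="state" as="xs:integer"/>
    <xsl:sequence select="(abs($state) * 1103515245 + 12345) mod 2147483648"/>
  </xsl:function>

  <!-- Pick one node from $nodes, indexed by the high bits of $state.
       Returns the empty sequence when there is nothing to pick from. -->
  <xsl:function name="my:pick" as="node()?">
    <xsl:param name="nodes" as="node()*"/>
    <xsl:param name="state" as="xs:integer"/>
    <xsl:sequence select="
      if (empty($nodes)) then ()
      else $nodes[1 + ((abs($state) idiv 65536) mod count($nodes))]"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:variable name="state" as="xs:integer" select="my:next($seed)"/>
    <!-- Copy one of the available patterns for the text element. -->
    <xsl:copy-of select="my:pick(//tree[@parent eq 'text'], $state)"/>
  </xsl:template>

</xsl:stylesheet>
```

Run it with a different seed value to get a different pick. For several picks
in one transformation, pass my:next($state) along as the state for the next
choice. A generator like this is adequate for producing test data, not for
anything that needs good randomness.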


Cheers,

Dimitre


On Sat, Jan 5, 2013 at 4:18 PM, Graydon <graydon(_at_)marost(_dot_)ca> wrote:

So I have a bunch of boiled-down document structure that looks like:

<tree parent="para">
    <count count="2331"/>
    <child>text</child>
    <child>item</child>
    <child>item</child>
</tree>

<tree parent="para">
    <count count="548"/>
    <child>text</child>
    <child>subpara</child>
    <child>subpara</child>
    <child>subpara</child>
</tree>

And so on; all the parent-child patterns in a (largish, ~3.5 GB) content
set, with the count element giving the frequency with which that pattern
occurs.

The overall structure of this document structure description is flat; no
tree element contains another tree element.

This document is being used for human review, to know what needs to be
defined for the output process, which is not my problem, and to hopefully
generate in an automated way a test document for that same output
processing, which is my problem.

To produce that generated test document, I need to do two things. First, I
need to produce a nested version of the document structure, inserting in
place of each <child> element some existing pattern for that element name,
taken from the available pile of <tree/> elements whose parent attribute has
that name as its value. Second, I need to populate it with actual content.
(I do, however, get to lose the counts.)

So I need to produce, for the first para, something like:

<tree parent="para">
    <tree parent="text">
        <tree parent="italic"/>
    </tree>
    <tree parent="item">
        <tree parent="text"/>
    </tree>
    <tree parent="item">
        <tree parent="para">
            <tree parent="text"/>
        </tree>
    </tree>
</tree>

And then populate it from the actual content, so the tree elements are
replaced with the elements whose names are the values of the parent
attributes.
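
The renaming half of that step can be sketched as below. This is only an
illustration: it assumes every parent value is a legal element name, and it
does not attempt to pull in real content from the content set.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <!-- Replace each tree element with an element named by its parent attribute. -->
  <xsl:template match="tree[@parent]">
    <xsl:element name="{@parent}">
      <xsl:apply-templates select="node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Identity template: copy everything else through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the nested example above, the outer tree becomes a para element
containing a text element (with an italic child) and two item elements.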

I think I can produce the tree; it's not, and isn't supposed to be, acyclic,
but setting an arbitrary limit on the number of times it goes
section/block/section/block (or para/item/para, and so on) is acceptable.
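
A rough sketch of that depth-limited expansion, under some assumptions: the
flat tree elements sit under a single root element, the input is well-formed,
and max-depth and by-parent are names invented here. The pattern choice is a
placeholder that always takes the first candidate.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output indent="yes"/>

  <!-- Hypothetical cap on nesting depth; this is what stops the cycles. -->
  <xsl:param name="max-depth" as="xs:integer" select="4"/>

  <!-- Index the flat patterns by the element name they describe. -->
  <xsl:key name="by-parent" match="tree" use="@parent"/>

  <xsl:template match="/">
    <!-- Start from the first para pattern, purely for illustration. -->
    <xsl:apply-templates select="key('by-parent', 'para')[1]" mode="expand">
      <xsl:with-param name="depth" select="1"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Emit a tree element; expand its children only while under the cap. -->
  <xsl:template match="tree" mode="expand">
    <xsl:param name="depth" as="xs:integer" required="yes"/>
    <tree parent="{@parent}">
      <xsl:if test="$depth lt $max-depth">
        <xsl:apply-templates select="child" mode="expand">
          <xsl:with-param name="depth" select="$depth"/>
        </xsl:apply-templates>
      </xsl:if>
    </tree>
  </xsl:template>

  <!-- Replace a child with a pattern for that element name, one level deeper. -->
  <xsl:template match="child" mode="expand">
    <xsl:param name="depth" as="xs:integer" required="yes"/>
    <xsl:variable name="patterns" select="key('by-parent', string(.))"/>
    <xsl:choose>
      <xsl:when test="exists($patterns)">
        <!-- Placeholder choice: always the first pattern. -->
        <xsl:apply-templates select="$patterns[1]" mode="expand">
          <xsl:with-param name="depth" select="$depth + 1"/>
        </xsl:apply-templates>
      </xsl:when>
      <xsl:otherwise>
        <!-- No pattern recorded for this name: emit an empty leaf. -->
        <tree parent="{.}"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Once the depth reaches max-depth a tree element is written out empty, so the
recursion always terminates even when the patterns refer to each other.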

What I don't know how to do is randomly select elements from the elements
available in the appropriate node set.

So I have to replace <child>text</child> with a tree element that has
one of the possible patterns for a text element; there are, say, eight
such patterns defined, so count(//tree[@parent eq 'text']) is equal to
eight.

I don't want _all_ the possible text element patterns; I just want one.
But I certainly don't always want to get the first one, either.

So far as I know, unordered() may be implementation-dependent, but it's
allowed to be and probably will be consistent, so there really isn't any
difference for this purpose between //tree[@parent eq 'text'][1] and
unordered(//tree[@parent eq 'text'])[1]; I'll always get the same one.

Anyone got suggestions as to how one picks a random node out of a node
sequence?

Thanks!
Graydon

--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--




--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs.Will they
write all patents, too? :)
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.




