xsl-list

RE: [xsl] Matching a recursive local element structure

2011-02-05 05:56:46
Brandon: Can you summarize the algorithm you've got so far?
Two phase process.

Phase 1 (in Java)
Walk the schema using the Apache Schema API and construct an XML document
that describes, in simplified form, all types, elements, and attributes.



Elements
   Add it
   For each Attribute
      Add as child
   For each Particle
      Add as child
   [ TBD: For each Wildcard ... ???]

Particle
   If element, add as child
   If model group, for each element add as child
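
The Phase 1 walk above could be sketched roughly as follows. This is a minimal illustration over toy in-memory types, not the actual code: ElementDecl, ModelGroup, Particle and describe are hypothetical stand-ins rather than Apache Schema API classes, and the output format is simplified compared with the jxon document. Like the walk described above, it has no recursion guard, so it only terminates for non-recursive structures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy stand-ins for schema components; these are NOT the Apache Schema API types. */
final class SchemaWalkSketch {

    /** A particle is either an element declaration or a model group. */
    interface Particle {}

    static final class ElementDecl implements Particle {
        final String name;
        final List<String> attributes = new ArrayList<>();
        final List<Particle> particles = new ArrayList<>();
        ElementDecl(String name) { this.name = name; }
        ElementDecl attr(String a) { attributes.add(a); return this; }
        ElementDecl child(Particle p) { particles.add(p); return this; }
    }

    static final class ModelGroup implements Particle {
        final List<ElementDecl> elements;
        ModelGroup(ElementDecl... elements) { this.elements = Arrays.asList(elements); }
    }

    /** Element: add it, then add each attribute and each particle as a child. */
    static void walkElement(ElementDecl e, int depth, StringBuilder out) {
        indent(depth, out).append("<element name=\"").append(e.name).append("\">\n");
        for (String a : e.attributes) {
            indent(depth + 1, out).append("<attribute name=\"").append(a).append("\"/>\n");
        }
        for (Particle p : e.particles) {
            walkParticle(p, depth + 1, out);
        }
        indent(depth, out).append("</element>\n");
    }

    /** Particle: an element is added as a child; a model group adds each of its elements. */
    static void walkParticle(Particle p, int depth, StringBuilder out) {
        if (p instanceof ElementDecl) {
            walkElement((ElementDecl) p, depth, out);
        } else if (p instanceof ModelGroup) {
            for (ElementDecl e : ((ModelGroup) p).elements) {
                walkElement(e, depth, out);
            }
        }
    }

    private static StringBuilder indent(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) out.append("  ");
        return out;
    }

    /** Returns the simplified description of the tree rooted at the given element. */
    static String describe(ElementDecl root) {
        StringBuilder out = new StringBuilder();
        walkElement(root, 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        ElementDecl link = new ElementDecl("link").attr("type").attr("app").attr("target");
        ElementDecl reviewer = new ElementDecl("peer_reviewer")
            .child(new ModelGroup(new ElementDecl("name"), new ElementDecl("degree")))
            .child(link);
        System.out.print(describe(new ElementDecl("peer_reviewers").child(reviewer)));
    }
}
```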

For non-recursive schemas this works and creates an XML document like this:

.... Snippet
      <jxon:element typeCategory="complex" contentType="element">
         <jxon:name uri="" localname="peer_reviewers"/>
         <jxon:element typeCategory="complex" contentType="element">
            <jxon:name uri="" localname="peer_reviewer"/>
            <jxon:element typeCategory="simple" variety="atomic">
               <jxon:name uri="" localname="name"/>
               <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
            </jxon:element>
            <jxon:element typeCategory="simple" variety="atomic">
               <jxon:name uri="" localname="degree"/>
               <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
            </jxon:element>
            <jxon:element typeCategory="complex" contentType="element">
               <jxon:name uri="" localname="title_affil"/>
               <jxon:element typeCategory="complex" contentType="mixed">
                  <jxon:name uri="" localname="para"/>
                  <jxon:element typeCategory="complex" contentType="simple">
                     <jxon:name uri="" localname="link"/>
                     <jxon:attribute typeCategory="simple" variety="atomic">
                        <jxon:name uri="" localname="type"/>
                        <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
                     </jxon:attribute>
                     <jxon:attribute typeCategory="simple" variety="atomic">
                        <jxon:name uri="" localname="app"/>
                        <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
                     </jxon:attribute>
                     <jxon:attribute typeCategory="simple" variety="atomic">
                        <jxon:name uri="" localname="target"/>
                        <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
                     </jxon:attribute>
--------------------

Phase 2:
An XQuery program that reads this simplified schema dump and, among many
other things, attempts to create an <xsl:template> to match each unique entry.

The relevant part for elements:


(: Generate a match string for an element :)
declare function common:match_elem( $name as element(), $e as element() )
as xs:string
{
    (: ancestor names come back in document order, outermost first :)
    fn:string-join( ($e/ancestor::*/jxon:name/@localname, $name/@localname), "/" )
};
     

And no, it hasn't been enhanced for namespaces yet.  TBD.  (Not hard, I
think; I have the relevant info.)

This produces simple match strings like
        peer_reviewers/peer_reviewer/name

and a template like
        <template match="peer_reviewers/peer_reviewer/name" priority="{depth of match string}">

I use the priority to disambiguate cases where the bare element name is also
matched, like

        <template match="name" priority="{depth of match string}">

I want the more explicit match to take precedence.
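
As an illustration of the depth-as-priority idea, here is a small Java sketch. It is only an approximation of what the XQuery does; the class and method names (MatchTemplateSketch, matchString, depth, template) are made up for this example.

```java
import java.util.Arrays;
import java.util.List;

final class MatchTemplateSketch {

    /** Joins the ancestor names and the element's own name with "/". */
    static String matchString(List<String> ancestors, String name) {
        StringBuilder sb = new StringBuilder();
        for (String a : ancestors) {
            sb.append(a).append('/');
        }
        return sb.append(name).toString();
    }

    /** Depth of a match string, taken here as its number of "/"-separated steps. */
    static int depth(String match) {
        return match.split("/").length;
    }

    /** Opening tag of a template whose priority is the depth of its match string. */
    static String template(String match) {
        return "<xsl:template match=\"" + match + "\" priority=\"" + depth(match) + "\">";
    }

    public static void main(String[] args) {
        String deep = matchString(Arrays.asList("peer_reviewers", "peer_reviewer"), "name");
        System.out.println(template(deep));    // priority 3
        System.out.println(template("name"));  // priority 1
    }
}
```

XSLT gives a bare name pattern a default priority of 0 and any multi-step pattern 0.5, so without an explicit priority two multi-step patterns such as a/b/name and b/name would tie. Using the depth as the priority breaks that tie in favour of the longer path.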



This works perfectly as long as the schema is not recursive.
When it is recursive, the Java part runs forever (until it runs out of
stack).
If I set an arbitrary recursion depth limit, then the data is incomplete, and
it works only as long as instance documents never have elements nested as
deeply as the recursion limit.

So my thinking is that I need to catch the recursion, mark it somehow, and
generate match strings that indicate it.
To make things more exciting, I also need to generate match strings for a
reverse transformation, but that's beyond the scope of this discussion.

My current thinking is to detect the first point of recursive entry into the
"loop" and mark that, then go only 1 level deep.
Then at that position use a "//" instead of "/".
I don't think this will be perfect, but perhaps in combination with the
priorities it may be close enough.
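
One possible reading of that idea, sketched in Java. Everything here is an assumption made for illustration: the schema is reduced to a map from declaration name to child declaration names (taken from the section/text/list/item example in the original question), and "//" is placed in front of every declaration that is re-entered while it is already on the current path. The names walk, render, loopHeads and matchStrings are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RecursiveMatchSketch {

    /** Collects each acyclic path from the root and records re-entered declarations. */
    static void walk(Map<String, List<String>> schema, String decl, Deque<String> path,
                     List<List<String>> paths, Set<String> loopHeads) {
        path.addLast(decl);
        paths.add(new ArrayList<>(path));
        for (String child : schema.getOrDefault(decl, Collections.<String>emptyList())) {
            if (path.contains(child)) {
                loopHeads.add(child);   // recursive entry: mark it, do not descend again
            } else {
                walk(schema, child, path, paths, loopHeads);
            }
        }
        path.removeLast();
    }

    /** Joins the steps with "/", using "//" in front of a marked loop head. */
    static String render(List<String> path, Set<String> loopHeads) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < path.size(); i++) {
            String step = path.get(i);
            if (i > 0) {
                sb.append(loopHeads.contains(step) ? "//" : "/");
            }
            sb.append(step);
        }
        return sb.toString();
    }

    /** Match strings for every declaration reachable from root, in walk order. */
    static List<String> matchStrings(Map<String, List<String>> schema, String root) {
        List<List<String>> paths = new ArrayList<>();
        Set<String> loopHeads = new LinkedHashSet<>();
        walk(schema, root, new ArrayDeque<String>(), paths, loopHeads);
        List<String> out = new ArrayList<>();
        for (List<String> p : paths) {
            out.add(render(p, loopHeads));
        }
        return out;
    }

    /** The pseudo-DTD from the original question, as name -> child names. */
    static Map<String, List<String>> sampleSchema() {
        Map<String, List<String>> s = new LinkedHashMap<>();
        s.put("section", Arrays.asList("text"));
        s.put("text", Arrays.asList("list", "para", "bold"));
        s.put("list", Arrays.asList("item"));
        s.put("item", Arrays.asList("text", "subheading"));
        s.put("subheading", Arrays.asList("text"));
        return s;
    }

    public static void main(String[] args) {
        for (String m : matchStrings(sampleSchema(), "section")) {
            System.out.println(m);
        }
    }
}
```

The walk terminates because a declaration already on the current path is never entered a second time. The resulting patterns are still loose: section//text/list/item would also match an item nested under a subheading, so this would need to be combined with priorities or an ancestor predicate to tell the two apart.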







----------------------------------------
David A. Lee
dlee(_at_)calldei(_dot_)com
http://www.xmlsh.org

-----Original Message-----
From: Brandon Ibach [mailto:brandon(_dot_)ibach(_at_)single-sourcing(_dot_)com] 
Sent: Friday, February 04, 2011 10:48 PM
To: xsl-list
Subject: Re: [xsl] Matching a recursive local element structure


Yeah, still no joy from Sourceforge, yet.  Can you summarize the
algorithm you've got so far?

-Brandon :)


On Fri, Feb 4, 2011 at 8:27 PM, David Lee <dlee(_at_)calldei(_dot_)com> wrote:
Everything is checked into SourceForge, but it's giving me network errors now
(under https://xmlsh.svn.sourceforge.net/svnroot/xmlsh/extensions/json).
This URL is giving me network errors:
http://xmlsh.svn.sourceforge.net/viewvc/xmlsh/
But I don't recommend it ... it's quite complex and large.  I'm not really
asking for people to read or write this for me ... or analyze the code.
(If you did it would be awesome! But it's a challenging task.)
The basic concept is that I'm extracting a 'minimal' description of the
element structure from the XSD using the Apache Schema API, then using XQuery
to attempt to produce match expressions for each element (and attribute)
declaration while trying to avoid infinite recursion.  It's amazingly
non-trivial.

Which is why what I'm asking for is abstract ideas ... and of course I'm
willing to accept abstract answers ... or none, of course ...
I'm not asking for a solution, just hoping for a suggestion on paths to
explore.
It just seems like such an obvious problem that people would have run into
it before and would just know it off the top of their heads ...
I'm hoping there is a 'simple pattern' by which match expressions might
'match up' with XSD structures in an 'obvious' way ...
But alas, I suspect that may be asking too much.

My next thought is that this might be best solved with a schema-aware XSLT
expression, but in the general case these may not be types, just recursive
references.

Recursion is fun!



----------------------------------------
David A. Lee
dlee(_at_)calldei(_dot_)com
http://www.xmlsh.org


-----Original Message-----
From: Brandon Ibach 
[mailto:brandon(_dot_)ibach(_at_)single-sourcing(_dot_)com]
Sent: Friday, February 04, 2011 8:14 PM
To: xsl-list
Subject: Re: [xsl] Matching a recursive local element structure


Can we see the code you have so far?  It'd be a lot easier to address
specific issues in existing code than to philosophize about an
abstract approach.

-Brandon :)


On Fri, Feb 4, 2011 at 8:06 PM, David Lee <dlee(_at_)calldei(_dot_)com> wrote:
Thanks for the ideas (all!)
Let me restate my question; maybe it might lead to another idea (I'm still
floundering!)

For every element declaration in an XSD I would like to generate a unique
XSLT match expression that matches that element declaration (but no others).
I've got it working quite well for both global and local elements until I
hit a recursive structure, then well ... it recurses :)

Thanks for any suggestions!

I *feel* this should be solvable because, while the structures are infinitely
recursive, each level of the recursion matches the same element declaration
and so shouldn't have to be unrolled ... I just can't yet get my head around
a match expression to catch it right.

But maybe it's not finitely solvable?



----------------------------------------
David A. Lee
dlee(_at_)calldei(_dot_)com
http://www.xmlsh.org


-----Original Message-----
From: Brandon Ibach 
[mailto:brandon(_dot_)ibach(_at_)single-sourcing(_dot_)com]
Sent: Friday, February 04, 2011 7:44 PM
To: xsl-list
Subject: Re: [xsl] Matching a recursive local element structure


Perhaps this approach is not as generic as you may have had in mind,
but for this case, I think it would work.

<template match="section/text//list/item[not(ancestor::subheading)]"> ...

-Brandon :)
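
To see what the ancestor predicate changes, the pattern can be tried as a plain XPath with the JDK's built-in javax.xml.xpath API. This is only a sketch: the sample document and the id attributes are invented for the test, and a leading "//" is added because this is an XPath selection rather than an XSLT match pattern.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AncestorPredicateSketch {

    /** Invented sample: items a, b, c are not under a subheading; item d is. */
    static final String SAMPLE =
        "<section><text><list>"
      + "<item id='a'><text><list><item id='b'/></list></text></item>"
      + "<item id='c'><subheading><text><list><item id='d'/></list></text></subheading></item>"
      + "</list></text></section>";

    /** Number of nodes the XPath expression selects in the given XML text. */
    static int count(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Without the predicate, every nested item is selected, including d.
        System.out.println(count(SAMPLE, "//section/text//list/item"));
        // With the predicate, the item below the subheading is left out.
        System.out.println(count(SAMPLE, "//section/text//list/item[not(ancestor::subheading)]"));
    }
}
```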


On Fri, Feb 4, 2011 at 7:01 PM, David Lee <dlee(_at_)calldei(_dot_)com> wrote:
Suppose I have a schema which describes a recursive structure as local
elements.
Example (pseudo DTD and pseudo XML; I can provide more formal defs if
needed):

Element section ( text )*
Element text ( list | para | bold | #PCDATA )*
Element list ( item* )
Element item ( text | subheading )*
Element subheading ( text )*

So for example a doc may look like

<section>
  <text>Text
    <list>
      <item><para>Item Text</para></item>
      <item><para>Item Text2</para></item>
      <item><para>Item Text</para>
        <list><item><para>More text</para></item></list>
      </item>
    </list>
  </text>
</section>


The key point is that the schema is recursive, so an XPath (or XSLT match)
might be

                section/text
                section/text/list/item/para
                section/text/list/item/list/item/list/item/list/item/list/item ...

which can get really long!



Now suppose I want to avoid an infinite number of XSLT match strings, but I
want to match, say, "list/item" but ONLY within section/text
(presume there may be a different list/item locally defined within, say,
subheading).


Suggestions on a good way to do that?

<template match="section/text//list/item"> ...

But this might match
                section/text/subheading/list/item
or
                section/text/list/item/subheading/list/item


which I don't want.

I only want to match the "list/item" which is a local element definition
below "section" (recursively),
so the match should select
                section/text/list/item/list/item/list/item
but not
                section/text/list/item/subheading/list/item

(which I would match with, say,
                subheading/list/item
                subheading/list/item/list/item
)


Is there an obvious way to do this?
It's entirely possible that I'm asking an impossible question (that is, the
schemas may simply not allow this restriction in the first place),
but I'm trying to solve a general problem, so I'm asking a general question.

This is based on generating match strings from XSD element declarations, so
it's really an XSD question as well ...
Maybe it's impossible to describe a schema such that a descendant "list/item"
is distinguishable by whether it's under "section" or "subheading"?

Thanks for any suggestion!


-David

----------------------------------------
David A. Lee
dlee(_at_)calldei(_dot_)com
http://www.xmlsh.org






--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


