
[xsl] XQuery Full Text Search

2007-11-23 13:43:11
Hey, I am looking for anyone with some experience or insight into
so-called "full-text" searches on XML document text nodes. I have been
working through full-text examples using NUX (http://dsd.lbl.gov/nux/),
which uses Saxon for its XQuery implementation and Lucene for full-text
indexing and searching via an extension, and I seem to be heading in the
right direction but am not quite there yet. So I am first interested in
knowing whether what I want to do is possible using XQuery:

...given a document with text content that is spread out between
elements as in the following XML fragment, this can be easily rendered
to HTML (and other formats too), where a "block" translates to a "p"
and an "inline" translates to a "span". Paragraphs can then have an
HTML @class (via an optional block-properties/block-style element) and
the spans can have a constructed @style (via a set of
inline-properties/inline-style elements), so the text of the following
doc fragment would "read" in HTML as "This is a sample of a basic
paragraph with inline styles in WordML." with various translated HTML
styling:

      ...
       </block>
       <block>
         <block-properties>
           <block-style name="Heading1"/>
         </block-properties>
         <inline>
           <text>This is a sample of a </text>
         </inline>
         <inline>
           <inline-properties>
             <inline-style name="i"/>
           </inline-properties>
           <text>basic </text>
         </inline>
         <inline>
           <inline-properties>
             <inline-style name="i"/>
             <inline-style name="color" value="FF0000"/>
           </inline-properties>
           <text>paragraph</text>
         </inline>
         <inline>
           <inline-properties>
             <inline-style name="i"/>
           </inline-properties>
           <text> </text>
         </inline>
         <inline>
           <inline-properties>
             <inline-style name="b"/>
             <inline-style name="i"/>
           </inline-properties>
           <text>with</text>
         </inline>
         <inline>
           <inline-properties>
             <inline-style name="i"/>
           </inline-properties>
           <text> inline </text>
         </inline>
         <inline>
           <inline-properties>
             <inline-style name="i"/>
             <inline-style name="u" value="single"/>
           </inline-properties>
           <text>styles</text>
         </inline>
         <inline>
           <text> in WordML.</text>
         </inline>
       </block>
       <block>
      ...
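
A minimal sketch of the block-to-p and inline-to-span rendering described
above, in Python rather than XQuery. The CSS chosen for each inline-style
name (the CSS table below) and the function name render_block are
assumptions for illustration only; the real style mapping is not given
here, and the sample input is a shortened version of the fragment.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed mapping from inline-style names to CSS declarations (hypothetical).
CSS = {
    "i": lambda value: "font-style:italic",
    "b": lambda value: "font-weight:bold",
    "u": lambda value: "text-decoration:underline",
    "color": lambda value: "color:#" + (value or ""),
}

SAMPLE = """<block>
  <block-properties><block-style name="Heading1"/></block-properties>
  <inline><text>This is a sample of a </text></inline>
  <inline>
    <inline-properties>
      <inline-style name="i"/>
      <inline-style name="color" value="FF0000"/>
    </inline-properties>
    <text>paragraph</text>
  </inline>
</block>"""


def render_block(block):
    """Render one block element as a p containing one span per inline."""
    block_style = block.find("block-properties/block-style")
    class_attr = ""
    if block_style is not None:
        class_attr = ' class="' + escape(block_style.get("name", "")) + '"'

    spans = []
    for inline in block.findall("inline"):
        # Build the @style value from the inline-style elements we know about.
        decls = [
            CSS[s.get("name")](s.get("value"))
            for s in inline.findall("inline-properties/inline-style")
            if s.get("name") in CSS
        ]
        style_attr = ' style="' + ";".join(decls) + '"' if decls else ""
        text = escape(inline.findtext("text") or "")
        spans.append("<span" + style_attr + ">" + text + "</span>")

    return "<p" + class_attr + ">" + "".join(spans) + "</p>"


html_out = render_block(ET.fromstring(SAMPLE))
print(html_out)
```

Unknown inline-style names are simply skipped in this sketch, so an
inline without recognised styles comes out as a bare span.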

So the question is: can I (from a block node context) return the set of
inline nodes whose inline/text contributes to a match of a search
phrase? For instance, in the preceding example, can I get the inline
elements (the first three in this case) that together contain the phrase
"This is a sample of a basic paragraph"? Eventually I also want to add
the associated search-result "highlighting" to those inline elements
where the match was found.
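
To make the desired result concrete, here is a rough sketch of the
behaviour outside XQuery, in plain Python with only the standard library.
It is not NUX, Saxon or Lucene code and says nothing about what XQuery
full-text can do; it just concatenates the inline/text values of one
block, finds the first occurrence of the phrase, and maps the match
offsets back to the inline elements. The function name inlines_matching
and the shortened sample block are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

BLOCK = """<block>
  <inline><text>This is a sample of a </text></inline>
  <inline>
    <inline-properties><inline-style name="i"/></inline-properties>
    <text>basic </text>
  </inline>
  <inline><text>paragraph</text></inline>
  <inline><text> </text></inline>
  <inline><text>with</text></inline>
</block>"""


def inlines_matching(block, phrase):
    """Return the inline elements overlapping the first match of phrase."""
    spans = []   # (start, end, inline) offsets into the concatenated text
    parts = []
    pos = 0
    for inline in block.findall("inline"):
        text = "".join(t.text or "" for t in inline.findall("text"))
        spans.append((pos, pos + len(text), inline))
        parts.append(text)
        pos += len(text)

    full_text = "".join(parts)
    hit = full_text.find(phrase)
    if hit < 0:
        return []
    end = hit + len(phrase)
    # Keep non-empty inlines whose character range overlaps [hit, end).
    return [el for (s, e, el) in spans if s < e and s < end and e > hit]


block = ET.fromstring(BLOCK)
hits = inlines_matching(block, "This is a sample of a basic paragraph")
print([el.findtext("text") for el in hits])
```

For the sample block this selects the first three inline elements, which
is the set that would then receive the highlighting.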

So, is this within the scope of XQuery full-text search capability, and I
just need to craft the XQuery properly (likely with some FLWOR
expression), or is full-text search limited to text within a single node?

thanks, Peter





  • [xsl] XQuery Full Text Search, Peter Doyle <=