xsl-list

Re: [xsl] How to show context of term in text document?

2007-05-02 11:12:06
The following is an XSLT 2.0 solution:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                      xmlns:xs="http://www.w3.org/2001/XMLSchema"
                      xmlns:fn="http://dummy-ns"
                      version="2.0">

 <xsl:output method="text" />

 <xsl:template match="s">
   <!-- .//text() limits the search to the text of the current <s> element -->
   <xsl:variable name="result"
                 select="fn:getstring(normalize-space(string-join(.//text(), ' ')), 'nutrient')" />
   <xsl:value-of select="$result" />
 </xsl:template>

 <xsl:function name="fn:getstring" as="xs:string">
   <xsl:param name="text" as="xs:string" />
   <xsl:param name="word" as="xs:string" />

   <xsl:variable name="x">
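     <!-- Keep the last 6 tokens before the word: up to 5 words plus the
          empty token usually left by the space that precedes the word -->
     <xsl:for-each select="tokenize(substring-before($text, $word), ' ')[position() &gt; last() - 6]">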
       <y><xsl:value-of select="." /></y>
     </xsl:for-each>
     <y><xsl:value-of select="concat('*', $word, '*')" /></y>
     <xsl:for-each select="tokenize(substring-after($text, $word), ' ')[position() &lt;= 6]">
       <y><xsl:value-of select="." /></y>
     </xsl:for-each>
   </xsl:variable>
   <xsl:sequence select="concat('... ', string-join($x//y, ' '), ' ...')" />
 </xsl:function>

</xsl:stylesheet>

When this is applied to the XML:

<s id="39" xmlns:z="http://z-ns">
 The aforementioned oxygenated
 <z:e sem="chebi" ids="33284">nutrient</z:e> emulsion (using the
 fluorocarbon FC-80) was placed in a Harvey pediatric oxygenator
 (volume 1230cc) and maintained at about 40 DEG C
</s>

The output produced is:

... The aforementioned oxygenated  *nutrient*  emulsion (using the
fluorocarbon FC-80) ...

This might not be perfect, but it could help you get started.
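
One possible refinement, offered only as a sketch: the doubled spaces around *nutrient* in the output come from the empty tokens that tokenize() returns for the spaces next to the word, and substring-before()/substring-after() will also match the term inside a longer word. Tokenizing the whole text once and locating the term with index-of() avoids both. The function name my:context, the namespace URI http://example.org/my-functions and the window-size parameter $n are made up for this illustration; they are not part of the stylesheet above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:my="http://example.org/my-functions"
                exclude-result-prefixes="xs my"
                version="2.0">

  <xsl:output method="text" />

  <xsl:template match="s">
    <!-- Same input preparation as above: all descendant text, whitespace collapsed -->
    <xsl:value-of
      select="my:context(normalize-space(string-join(.//text(), ' ')), 'nutrient', 5)" />
  </xsl:template>

  <!-- Hypothetical helper: up to $n words either side of the first token equal to $word -->
  <xsl:function name="my:context" as="xs:string">
    <xsl:param name="text" as="xs:string" />
    <xsl:param name="word" as="xs:string" />
    <xsl:param name="n" as="xs:integer" />

    <xsl:variable name="tokens" as="xs:string*" select="tokenize($text, ' ')" />
    <!-- index-of() returns every matching position; keep only the first -->
    <xsl:variable name="pos" as="xs:integer?" select="index-of($tokens, $word)[1]" />

    <xsl:sequence select="
      if (empty($pos)) then ''
      else concat('... ',
             string-join((
               subsequence($tokens, max((1, $pos - $n)), min(($n, $pos - 1))),
               concat('*', $word, '*'),
               subsequence($tokens, $pos + 1, $n)), ' '),
             ' ...')" />
  </xsl:function>

</xsl:stylesheet>
```

For the sample input this should give the single-spaced line from the original question. Note the trade-off: index-of() compares whole tokens, so a term with punctuation attached (for example "nutrient,") would not be found, and only the first occurrence is reported.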

On 5/2/07, Antony Quinn <aquinn(_at_)ebi(_dot_)ac(_dot_)uk> wrote:
Hello,

I would like to display the context in which a term appears in some
text, in the same way that Google shows the 5 or so words before and
after your search term in the document.

Example
-------

For this input:

<s id="39">
  The aforementioned oxygenated
  <z:e sem="chebi" ids="33284">nutrient</z:e> emulsion (using the
  fluorocarbon FC-80) was placed in a Harvey pediatric oxygenator
  (volume 1230cc) and maintained at about 40 DEG C
</s>

I want to show the 5 words before and after "nutrient":

... The aforementioned oxygenated *nutrient* emulsion (using the
fluorocarbon FC-80) ...

Any ideas?

Thanks,

Antony


--
Regards,
Mukul Gandhi

