Thanks, Robert and Jarno, for your help.
As we say in French, "I'm reinventing the wheel!"
I don't know Jakarta Lucene but, as Robert said, I would need Java to
run it. Unfortunately my project doesn't use Java: it is only XML/XSL
files that display HTML, with ASP used to:
- generate an XML listing of all the XML files
- pass parameters to the XSL stylesheets
- read the values from HTML forms
And indeed, I have never learnt Java...
The search engine I want to build is almost working; just a "few" things don't
work yet.
I can use it as it is, but if I find a solution to those "little" problems,
I will be able to search in any XML file, and even across many files (using
document(...)),
and then build an engine for my whole project.
I'd really like to try going further in that direction...
So here are the problems I have not managed to solve, in case
you have an idea...
Recap of the problem
I want to search for a string in an XML file and display the matching nodes.
The XML document looks like this:
<LIST>
<THEME label="Droit du travail" id="12"/>
<THEME label="droit social" id="2"/>
<THEME label="travail à la chaîne" id="34"/>
<THEME label="rien du tout" id="17"/>
</LIST>
The search engine only searches the label attribute of the THEME elements of
this XML document, and it displays the @label.
The XSL file does two things: it outputs an HTML form in which the user types
the (next) search words, and it displays the (previous) result.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
<xsl:output method="html" indent="yes"/>
<xsl:param name="String"/> <!-- string sent by the xslt processor -->
<xsl:template name="tokenizer">
<xsl:param name="text"
select="concat(normalize-space($String),' ')"/>
<xsl:if test="$text">
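<!-- translate() lowercases both sides, because contains() is case-sensitive -->
<xsl:for-each
select="THEME[contains(translate(@label, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'),
translate(substring-before($text, ' '), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'))]">
<THEME id="{@id}" label="{@label}"/>
</xsl:for-each>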
<xsl:call-template name="tokenizer">
<xsl:with-param name="text"
select="substring-after($text, ' ')"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
<xsl:template match="LIST">
<form action="display.asp" method="POST">
<input type="text" name="UserString"/>
<input type="submit" value="Go!"/>
</form>
<xsl:variable name="NodeSetMErecherchees">
<xsl:call-template name="tokenizer"/>
</xsl:variable> <!-- the matched THEME elements, held as a result tree
fragment; msxsl:node-set() below converts it to a node-set -->
<xsl:for-each select="msxsl:node-set($NodeSetMErecherchees)/THEME">
<xsl:value-of select="@label"/><br/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
The first problem:
If the user searches for "droit travail", the search engine displays
THEME id=12 twice, because its label contains both query words.
==> As you can see in the XSL, I put the whole result in a variable,
which in this case looks like this:
<THEME id="12" label="Droit du travail"/>     --> matching "droit"
<THEME id="2" label="droit social"/>          --> matching "droit"
<THEME id="12" label="Droit du travail"/>     --> matching "travail"
<THEME id="34" label="travail à la chaîne"/>  --> matching "travail"
THEME id=12 appears twice...
So now I would like to remove the duplicate THEME elements from this variable...
But I don't see any way to write something like <xsl:for-each select="distinct(THEME)">.
I could write a loop that checks, for each node, whether another identical node exists,
but that is a lot of work for the processor and would make the engine
slower. Is there a simpler solution?
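One possible direction for the duplicates, as a minimal sketch rather than a confirmed solution: index the THEME elements with xsl:key and keep only the first element for each @id (the technique usually called Muenchian grouping). The key name "theme-by-id" and the variable "hits" are hypothetical, and "hits" is hard-coded here as a stand-in for the tokenizer output. The sketch assumes MSXML, since it relies on msxsl:node-set().

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="msxsl">
  <xsl:output method="html" indent="yes"/>

  <!-- Hypothetical key: indexes every THEME element by its id attribute. -->
  <xsl:key name="theme-by-id" match="THEME" use="@id"/>

  <xsl:template match="LIST">
    <!-- Stand-in for the tokenizer result: THEME id=12 is present twice. -->
    <xsl:variable name="hits">
      <THEME id="12" label="Droit du travail"/>
      <THEME id="2" label="droit social"/>
      <THEME id="12" label="Droit du travail"/>
      <THEME id="34" label="travail à la chaîne"/>
    </xsl:variable>

    <!-- key() looks in the document of the context node, which here is the
         converted fragment. A THEME is kept only if it is the first
         element the key returns for its own id. -->
    <xsl:for-each select="msxsl:node-set($hits)/THEME
        [generate-id() = generate-id(key('theme-by-id', @id)[1])]">
      <xsl:value-of select="@label"/><br/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the four elements above, this should list three labels, with "Droit du travail" once. A shorter alternative is the predicate THEME[not(@id = preceding-sibling::THEME/@id)], which needs no key but compares each element with all the earlier ones, so it is likely to be slower on long result lists.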
The second problem (less necessary, but it would be great):
I'd like to highlight the search words in the displayed result... how can I do that?
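For the highlighting, a recursive named template might be one way to go. The sketch below is only an illustration under assumptions: the template name "highlight", the parameter "query-word" and the variables "upper" and "lower" are all hypothetical, it handles a single search word, and it wraps each occurrence in a <b> element. The translate() calls make the comparison case-insensitive for unaccented letters; since they replace characters one for one, the offsets found in the lowercased text are also valid in the original label.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Hypothetical parameter: the single word to highlight. -->
  <xsl:param name="query-word" select="'droit'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>

  <xsl:template match="LIST">
    <xsl:for-each select="THEME">
      <xsl:call-template name="highlight">
        <xsl:with-param name="text" select="@label"/>
        <xsl:with-param name="word" select="$query-word"/>
      </xsl:call-template>
      <br/>
    </xsl:for-each>
  </xsl:template>

  <!-- Copies $text, wrapping every occurrence of $word in <b>. -->
  <xsl:template name="highlight">
    <xsl:param name="text"/>
    <xsl:param name="word"/>
    <xsl:variable name="lc-text" select="translate($text, $upper, $lower)"/>
    <xsl:variable name="lc-word" select="translate($word, $upper, $lower)"/>
    <xsl:choose>
      <xsl:when test="$lc-word != '' and contains($lc-text, $lc-word)">
        <!-- Number of characters in front of the first occurrence. -->
        <xsl:variable name="before"
            select="string-length(substring-before($lc-text, $lc-word))"/>
        <xsl:value-of select="substring($text, 1, $before)"/>
        <b>
          <xsl:value-of
              select="substring($text, $before + 1, string-length($word))"/>
        </b>
        <!-- Continue with whatever follows the occurrence. -->
        <xsl:call-template name="highlight">
          <xsl:with-param name="text"
              select="substring($text, $before + string-length($word) + 1)"/>
          <xsl:with-param name="word" select="$word"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Highlighting several query words at once would need more work, for example running the template again on the text pieces that fall between matches; that part is not covered here.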
Thanks for any advice.
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list