You're thinking of this as an XPath problem when it's actually a requirements
problem. Define the rules you want to implement, and we can help you express
them in XPath (or advise on a more suitable language).
Your choice of rules will depend very much on the quality and variability of
the data, and on whether you want to bias towards matching or not-matching in
cases of doubt. Those are your decisions to make.
Michael Kay
Saxonica
mike(_at_)saxonica(_dot_)com
+44 (0) 118 946 5893
On 22 Aug 2014, at 08:31, Ihe Onwuka ihe(_dot_)onwuka(_at_)gmail(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
Ok I admit I haven't thought about this in anger but a problem
shared....etc...
I am doing a matching algorithm to match movie data from different
repositories so that I know when the repositories are referencing the same
movie even though they may hold different metadata.
It's not enough to match solely on title - one reason for that is movies have
subtitles and may go by the subtitle in a different venue.
So let's say thanks to xsl:key I have in a variable $titles all the movies
that have that title and in a variable $actors I have all the movies that
that actor featured in.
One (of many) criteria I could use is that if the data from the respective
venues has its title and an actor in common then they are the same movie.
That's a plain intersect between $titles and $actors, but I want something
stronger than that.
I want the data from the venues to match only if they have at least 2 actors
in common.
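
As a rough sketch only, once the rule is fixed as "same title and at least
2 actors in common", it might be expressed in XSLT 2.0 along these lines.
The input layout (repos/repo/movie with title and actor children), the key
name by-title and the function f:shared-actors are all assumptions made for
illustration and are not taken from the actual data. Names are compared as
exact strings, which real data would probably need to relax.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:movie-match"
    exclude-result-prefixes="xs f">

  <!-- Assumed (hypothetical) input:
       <repos>
         <repo name="A"><movie><title>..</title><actor>..</actor>..</movie></repo>
         <repo name="B">..</repo>
       </repos> -->

  <!-- Index every movie by its whitespace-normalized title. -->
  <xsl:key name="by-title" match="movie" use="normalize-space(title)"/>

  <!-- Number of distinct actor names that occur in both movies. -->
  <xsl:function name="f:shared-actors" as="xs:integer">
    <xsl:param name="a" as="element(movie)"/>
    <xsl:param name="b" as="element(movie)"/>
    <xsl:sequence select="count(distinct-values($a/actor[. = $b/actor]))"/>
  </xsl:function>

  <xsl:template match="/">
    <matches>
      <xsl:for-each select="//movie">
        <xsl:variable name="this" select="."/>
        <!-- All movies in the document with the same title. -->
        <xsl:variable name="titles"
                      select="key('by-title', normalize-space(title))"/>
        <!-- Keep only later movies (so each pair is reported once)
             that share at least 2 distinct actors with this one. -->
        <xsl:for-each
            select="$titles[. &gt;&gt; $this][f:shared-actors($this, .) ge 2]">
          <match title="{$this/title}" a="{$this/../@name}" b="{../@name}"/>
        </xsl:for-each>
      </xsl:for-each>
    </matches>
  </xsl:template>

</xsl:stylesheet>
```

Changing the threshold, or biasing towards matching in cases of doubt, then
comes down to changing the "ge 2" test or the comparison inside
f:shared-actors.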
--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--