xsl-list

Re: [xsl] Xpath conundrum for the peeps

2014-08-22 04:09:53

You're thinking of this as an XPath problem when it's actually a requirements 
problem. Define the rules you want to implement, and we can help you express 
them in XPath (or advise on a more suitable language).

Your choice of rules will depend very much on the quality and variability of 
the data, and on whether you want to bias towards matching or not-matching in 
cases of doubt. Those are your decisions to make.

Michael Kay
Saxonica
mike(_at_)saxonica(_dot_)com
+44 (0) 118 946 5893




On 22 Aug 2014, at 08:31, Ihe Onwuka ihe(_dot_)onwuka(_at_)gmail(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Ok I admit I haven't thought about this in anger but a problem 
shared....etc...

I am doing a matching algorithm to match movie data from different 
repositories so that I know when the repositories are referencing the same 
movie even though they may hold different metadata.

It's not enough to match solely on title - one reason for that is movies have 
subtitles and may go by the subtitle in a different venue.

So let's say that, thanks to xsl:key, I have in a variable $titles all the 
movies that have a given title, and in a variable $actors all the movies that 
a given actor featured in.

One criterion (out of many) I could use is that if the data from the 
respective venues has its title and an actor in common then they are the same 
movie. That's a plain intersect between $titles and $actors, but I want 
something stronger than that. 

I want the data from the venues to match only if they have at least 2 actors 
in common. 
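
To pin the rule down before choosing how to express it, here is a minimal 
sketch in Python of "same title and at least 2 actors in common". Everything 
in it is an assumption made for illustration: the record layout (a dict with 
"title" and "actors"), the names normalize and same_movie, the placeholder 
data, and the decision to compare names after folding case and whitespace. 
The threshold is a parameter so the bias towards matching or not matching 
stays an explicit choice.

```python
def normalize(name: str) -> str:
    """Fold case and collapse whitespace so trivial spelling differences compare equal."""
    return " ".join(name.split()).casefold()


def same_movie(a: dict, b: dict, min_shared_actors: int = 2) -> bool:
    """Return True if two records share a title and at least min_shared_actors actors.

    Assumed (hypothetical) record layout: {"title": str, "actors": list of str}.
    """
    if normalize(a["title"]) != normalize(b["title"]):
        return False
    # Sets remove duplicate credits, so each actor is counted at most once.
    shared = {normalize(n) for n in a["actors"]} & {normalize(n) for n in b["actors"]}
    return len(shared) >= min_shared_actors


# Placeholder records from three hypothetical venues.
venue_a = {"title": "Example Movie", "actors": ["Actor One", "Actor Two", "Actor Three"]}
venue_b = {"title": "example  movie", "actors": ["actor two", "Actor Three", "Actor Four"]}
venue_c = {"title": "Example Movie", "actors": ["Actor One", "Actor Five"]}

print(same_movie(venue_a, venue_b))  # two shared actors
print(same_movie(venue_a, venue_c))  # only one shared actor
```

With these placeholder records, venue_a and venue_b match because they share 
two actors, while venue_a and venue_c share only one and do not. Calling 
same_movie(venue_a, venue_c, min_shared_actors=1) gives the weaker 
"title and one actor" rule.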



--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--