Re: [xsl] Spell Check Type Matching in XPath?

2022-04-21 15:18:37
Hi Michael,

Neat trick!

I wonder if this could somehow be combined with a multiple-pattern regular 
expression on the subsequent addition side:

^(\wabc|a\wbc|ab\wc|abc\w)$
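
A pattern of this shape can be generated mechanically rather than typed by hand. The following is a minimal sketch in Python, used here only as a neutral illustration (the thread itself is about XPath); the function name insertion_pattern is hypothetical. It builds one alternative per insertion position, so it only covers the "one extra character" case, not deletions or substitutions.

```python
import re


def insertion_pattern(target: str) -> str:
    """Build a regex matching `target` with exactly one extra \\w character.

    One alternative is produced for each position where a character
    could have been inserted (before, between, and after the letters).
    """
    alternatives = [
        re.escape(target[:i]) + r"\w" + re.escape(target[i:])
        for i in range(len(target) + 1)
    ]
    return "^(" + "|".join(alternatives) + ")$"


pattern = insertion_pattern("abc")
print(pattern)
print(bool(re.match(pattern, "abxc")))  # one inserted character
print(bool(re.match(pattern, "abc")))   # exact string, no insertion
```

For "abc" this reproduces the four-alternative pattern above. The same string could presumably be passed to a regex-matching function in XPath, though that is untested here.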

 - Chris


-----Original Message-----
From: C. M. Sperberg-McQueen cmsmcq(_at_)blackmesatech(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> 
Sent: Thursday, April 21, 2022 4:12 PM
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Cc: C. M. Sperberg-McQueen <cmsmcq(_at_)blackmesatech(_dot_)com>
Subject: Re: [xsl] Spell Check Type Matching in XPath?

On Thu, 2022-04-21 at 19:01 +0000, Eliot Kimber 
eliot(_dot_)kimber(_at_)servicenow(_dot_)com wrote:
I’m looking at Jeni’s code now. I’ll see what I can do with it.
 
The fact that this is the best there is (a MarkMail search basically 
brought me to Mike’s response below) suggests that there’s not 
something more obvious that I simply failed to see.

Non-obvious (at least to me), but possibly faster, given that you already know 
one of the strings to be matched, may be the symmetric-deletion approach to 
edit distance described by Wolf Garbe [1]. It allows a fairly quick detection 
of whether the candidate string is within edit distance 1 of the string you're 
looking to match -- if you adjust the way you do it, you can detect strings 
within distance 2.
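
To make the idea concrete, here is a minimal sketch of a distance-1 check along these lines, written in Python purely for illustration. It is one reading of the symmetric-deletion idea, not the code from [1]: the names deletes, within_one_edit, and is_near_target are hypothetical, and "servicenow" is taken from the original question. Sharing a one-character deletion is used only as a cheap prefilter, because two strings can share a deletion and still be two edits apart, so a direct check confirms the result.

```python
def deletes(s: str) -> set:
    """All strings obtained by deleting exactly one character from s."""
    return {s[:i] + s[i + 1:] for i in range(len(s))}


def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one insert, delete, or substitution."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]  # one extra character in b at position i


TARGET = "servicenow"
# Computed once, since the string to be matched is known in advance.
TARGET_KEYS = deletes(TARGET) | {TARGET}


def is_near_target(candidate: str) -> bool:
    candidate_keys = deletes(candidate) | {candidate}
    if TARGET_KEYS.isdisjoint(candidate_keys):
        return False  # no shared key, so not within one edit
    return within_one_edit(TARGET, candidate)
```

Because the target's keys are fixed, most unrelated candidates are rejected by the set comparison alone. Extending this to distance 2 would presumably mean generating deletions of up to two characters and verifying with a full edit-distance computation.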

[1]
https://wolfgarbe.medium.com/1000x-faster-spelling-correction-algorithm-2012-8701fcd87a5f
 

Michael Sperberg-McQueen

 
However, Jeni’s comments in her post about recursion suggest there’s 
a way to improve the code in XSLT 3/XPath 3, maybe something using 
iterate….
 
Cheers,
 
E.
 
_____________________________________________
Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368
servicenow.com
LinkedIn | Twitter | YouTube | Facebook
 
From: Michael Kay mike(_at_)saxonica(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com>
Date: Thursday, April 21, 2022 at 1:35 PM
To: xsl-list <xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com>
Subject: Re: [xsl] Spell Check Type Matching in XPath?
 
Jeni Tennison's work on computing Levenshtein distance in XSLT may be
relevant:
 
http://www.jenitennison.com/2007/05/03/levenshtein-distance-in-xslt-2-0.html
 
(It would be interesting to see it reworked for XSLT 3.0...)
 
Search also for "Levenshtein distance XSLT" on Markmail.
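
For readers unfamiliar with the measure, the following is a minimal sketch of the standard two-row dynamic-programming computation of Levenshtein distance. It is written in Python for illustration only and is not Jeni Tennison's XSLT code; the function name levenshtein is an assumption of this sketch.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character inserts, deletes, or substitutions
    needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("servicenow", "seivcenow"))
print(levenshtein("servicenow", "servcinow"))
```

Only the previous row is carried from one step to the next, which is the kind of state that an accumulating construct in XSLT 3.0 might hold in place of recursion, though that rework is not attempted here.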
 
Michael Kay
Saxonica


On 21 Apr 2022, at 18:57, Eliot Kimber 
eliot(_dot_)kimber(_at_)servicenow(_dot_)com 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
 
I’m writing a Schematron rule that tries to identify URLs where the 
server component is close to, but not quite, “docs.servicenow.com”, 
i.e., “seivcenow” or “servcinow” or whatever. I also need to 
eliminate servers that are not like servicenow, such as 
“docs.amazon.com”.
 
Basically I want the kind of fuzzy match on “servicenow” that 
you’d get with normal spell checking.
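
As an illustration of the requirement (not of a Schematron solution), here is a minimal sketch in Python using the standard library's difflib similarity ratio. The function name classify_url, the three result labels, and the 0.8 cutoff are all assumptions made for this sketch, and the cutoff would need tuning against real data.

```python
from difflib import SequenceMatcher
from urllib.parse import urlsplit

EXPECTED_HOST = "docs.servicenow.com"
EXPECTED_LABEL = "servicenow"


def classify_url(url: str, cutoff: float = 0.8) -> str:
    """Return "ok", "suspect", or "unrelated" for the URL's server component.

    "suspect" means some label of the host name is similar to, but the
    host is not exactly, the expected one. The cutoff is an assumed value.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host == EXPECTED_HOST:
        return "ok"
    # Compare each dot-separated label of the host with "servicenow".
    best = max(
        SequenceMatcher(None, label, EXPECTED_LABEL).ratio()
        for label in host.split(".")
    )
    return "suspect" if best >= cutoff else "unrelated"


for url in (
    "https://docs.servicenow.com/bundle/x",
    "https://docs.seivcenow.com/page",
    "https://docs.amazon.com/",
):
    print(url, classify_url(url))
```

The ratio is a similarity score rather than an edit distance, so it behaves somewhat differently from a spell checker on very short or very long labels.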
 
I’m not seeing an easy way to do this in XSLT/XPath (in the context 
of the XSLT Schematron engine in Oxygen XML).
 
But I feel like I’m missing some more-or-less obvious way to do this 
with a regular expression or maybe a fold or something (I can use 
XPath 3).
 
What am I missing?
 
Thanks,
 
E.


--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--

