On Tue, Jul 12, 2005 at 03:52:01AM -0700, Philip Guenther wrote:
Given that you can obviously locate a match for a string of 'n' non-star
characters in O(n*M), isn't the algorithm that Tim described a
worst-case of O(N*M) where N is the number of characters _other_ than
star in the pattern? (Stars are the cheap part, as they can never force
a backtrack!)
I thought about things by looking at a*b and the string abb. A non-greedy
match could match a with a, * with nothing, b with b, and then find out it
has to backtrack. A greedy match with a*b and abc would match a with a,
* with bc, backtrack, match * with b, fail on b and c. Things got worse
with more stars. As I said, a naive approach, but I was too focused on
regular expressions and failed to see how I can make use of wildcards
being a subset.
Now that Tim answered it all, it is obvious where exactly the difference
to regular expressions is, why backtracking can be replaced by string
search, that it is not related to greedyness and that in wildcard
expressions, stars are cheap if done right.
Michael